Commit

feat: add face task
tychenjiajun committed Sep 17, 2024
1 parent e02b156 commit f2ee9cc
Showing 9 changed files with 602 additions and 54 deletions.
74 changes: 57 additions & 17 deletions README.md
@@ -2,6 +2,9 @@

[![NPM Downloads](https://img.shields.io/npm/dw/exif-ai)](https://www.npmjs.com/package/exif-ai)

_Read this in other languages:_
[_简体中文_](README.zh-CN.md),

## About

_Exif AI_ is a powerful CLI tool designed to write AI-generated image descriptions and/or tags directly into the metadata of image files. This tool leverages advanced AI models to analyze image content and generate descriptive metadata, enhancing the accessibility and searchability of your images.
@@ -30,11 +33,11 @@ exif-ai -i example.jpeg -a ollama

Required options:

- - `-a, --api-provider <value>` Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).
+ - `-a, --api-provider <value>`: Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).

Optional options:

- - `-T, --tasks <tasks...>`: List of tasks to perform ('description' and/or 'tag').
+ - `-T, --tasks <tasks...>`: List of tasks to perform ('description', 'tag', 'face').
- `-i, --input <file>`: Path to the input image file.
- `-p, --description-prompt <text>`: Custom prompt for the AI provider to generate description. Defaults to a generic image description prompt.
- `--tag-prompt <text>`: Custom prompt for the AI provider to generate tags. Defaults to a generic image tagging prompt.
@@ -49,6 +52,7 @@ Optional options:
- `--avoid-overwrite`: Avoid overwriting if EXIF tags already exist in the file.
- `--ext <extensions...>`: File extensions to watch. Only files with these extensions will be processed.
- `--concurrency <number>`: The number of files to process concurrently in watch mode.
- `--face-group-ids <group...>`: List of face group IDs to use for face recognition.
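
The `--ext` filter described above can be pictured with a small sketch. The helper names below are hypothetical and are not part of exif-ai; the sketch also assumes that extensions may be given with or without a leading dot and that matching is case-insensitive, neither of which the README states.

```typescript
// Hypothetical helpers, not exif-ai's implementation.
// Assumption: "jpg" and ".jpg" are treated the same, ignoring case.
function normalizeExtension(ext: string): string {
  const lower = ext.trim().toLowerCase();
  return lower.startsWith(".") ? lower : `.${lower}`;
}

function hasWatchedExtension(filePath: string, extensions: string[]): boolean {
  // Look only at the last path segment so dots in directory names are ignored.
  const lastSeparator = Math.max(
    filePath.lastIndexOf("/"),
    filePath.lastIndexOf("\\"),
  );
  const baseName = filePath.slice(lastSeparator + 1);
  const dot = baseName.lastIndexOf(".");
  // No dot, or only a leading dot (e.g. ".gitignore"), means no extension.
  if (dot <= 0) return false;
  const fileExt = baseName.slice(dot).toLowerCase();
  return extensions.some((ext) => normalizeExtension(ext) === fileExt);
}
```

A watcher could call `hasWatchedExtension(file, ["jpg", "jpeg"])` for each new file and skip the ones that return `false`.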

Example usage:

@@ -64,24 +68,25 @@ To use Exif AI as a library in your project, import it and use the provided func
import { execute } from "exif-ai";

const options = {
- path: "example.jpeg", // Path to the input image file
- provider: "ollama", // AI provider to use (e.g., 'ollama', 'zhipu', 'google')
- model: "moondream", // Optional: Specific AI model to use (if supported by the provider)
+ tasks: ["description"], // List of tasks to perform
+ path: "example.jpg", // Path to the input image file
+ provider: "ollama", // Name of the AI provider to use
descriptionTags: [
"XPComment",
"Description",
"ImageDescription",
"Caption-Abstract",
- ], // Optional: EXIF tags to write the description to
- tagTags: ["Subject", "TagsList", "Keywords"], // Optional: EXIF tags to write the tags to
- descriptionPrompt: "Please describe this image in Chinese.", // Optional: Custom prompt for the AI provider to generate description
- tagPrompt:
- "Tag this image based on subject, object, event, place. Output format: <tag1>, <tag2>, <tag3>, <tag4>, <tag5>, ..., <tagN>", // Optional: Custom prompt for the AI provider to generate tags
- verbose: false, // Optional: Enable verbose logging for debugging
- dry: false, // Optional: Perform a dry run without writing to the file
- writeArgs: [], // Optional: Additional arguments for EXIF write task
- providerArgs: [], // Optional: Additional arguments for the AI provider
- avoidOverwrite: true, // Optional: Avoid overwriting existing tags
+ ], // List of EXIF tags to write the description to
+ tagTags: ["Subject", "TagsList", "Keywords"], // List of EXIF tags to write the tags to
+ descriptionPrompt: "Describe this landscape photo.", // Custom prompt for the AI provider to generate description
+ tagPrompt: "Tag this image based on subject, object, event, place.", // Custom prompt for the AI provider to generate tags
+ verbose: false, // Enable verbose output for debugging
+ dry: false, // Preview AI-generated content without writing to the image file
+ writeArgs: [], // Additional ExifTool arguments for writing metadata
+ providerArgs: [], // Additional arguments for the AI provider
+ avoidOverwrite: false, // Avoid overwriting if EXIF tags already exist in the file
+ doNotEndExifTool: false, // Do not end ExifTool process after writing metadata
+ faceGroupIds: [], // List of face group IDs to use for face recognition
};

execute(options)
@@ -101,9 +106,34 @@ To install Exif AI globally, use the following command:
npm install -g exif-ai
```

## Tasks

### Description

The `description` task generates a description of the image using the AI provider. The description is written to the specified EXIF tags defined in `descriptionTags`.

### Tag

The `tag` task generates tags for the image using the AI provider. The tags are written to the specified EXIF tags defined in `tagTags`.
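
As a rough illustration of how one tag list might fan out to every field named in `tagTags`, here is a minimal sketch. Both functions are hypothetical and are not exif-ai's internals; the sketch assumes the model replies with a comma-separated list and that duplicate or empty entries should be dropped.

```typescript
// Hypothetical sketch, not exif-ai's implementation.
// Assumption: the AI provider replies with "tag1, tag2, tag3".
function parseTagReply(reply: string): string[] {
  const trimmed = reply
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
  // De-duplicate while keeping first-seen order.
  return Array.from(new Set(trimmed));
}

function buildTagPayload(
  tags: string[],
  tagTags: string[],
): Record<string, string[]> {
  const payload: Record<string, string[]> = {};
  for (const field of tagTags) {
    // Each EXIF field receives its own copy of the same tag list.
    payload[field] = [...tags];
  }
  return payload;
}

const examplePayload = buildTagPayload(
  parseTagReply("cat, beach , cat,"),
  ["Subject", "TagsList", "Keywords"],
);
```

Here `examplePayload` holds the same two tags under each of the three field names, which mirrors the idea that every tag in `tagTags` is written with the full list.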

### Face Recognition

The `face` task performs face recognition on the image using the [Tencent Cloud API](https://cloud.tencent.com/document/api/867/44994). The face recognition results are written to the specified EXIF tags defined in `tagTags`.

Currently, the `face` task requires you to enable the face recognition service on Tencent Cloud and to set a Tencent Cloud API Secret ID and Secret Key as environment variables.

```bash
export TENCENTCLOUD_SECRET_ID=your_tencentcloud_secret_id
export TENCENTCLOUD_SECRET_KEY=your_tencentcloud_secret_key
```
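
Before requesting the `face` task from your own code, it can help to check that both variables are present. The sketch below is only an illustration: `missingTencentCredentials` and `runnableTasks` are hypothetical helpers, not functions exported by exif-ai, and dropping `face` when credentials are incomplete is an assumed policy rather than the tool's behavior.

```typescript
// Hypothetical helpers, not part of exif-ai.
type Env = Record<string, string | undefined>;

const REQUIRED_TENCENT_VARS = [
  "TENCENTCLOUD_SECRET_ID",
  "TENCENTCLOUD_SECRET_KEY",
] as const;

function missingTencentCredentials(env: Env): string[] {
  // A variable counts as missing when it is unset or only whitespace.
  return REQUIRED_TENCENT_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

function runnableTasks(tasks: string[], env: Env): string[] {
  // Assumed policy: drop "face" when credentials are incomplete,
  // keep every other task unchanged.
  const canRunFace = missingTencentCredentials(env).length === 0;
  return tasks.filter((task) => task !== "face" || canRunFace);
}
```

In a Node.js script you could pass `process.env` as the `env` argument and use the result as the `tasks` option.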

### Note

Please ensure that you securely manage your API keys. Do not expose them in public repositories or other public forums.

## API Providers

- Exif AI relies on API providers to generate image descriptions. Currently, we support three providers: ZhipuAI, Ollama and Google Gemini.
+ Exif AI relies on API providers to generate image descriptions and tags. Currently, we support three providers: ZhipuAI, Ollama and Google Gemini.

### Supported Providers

@@ -137,6 +167,16 @@ export API_KEY=your_google_api_key

Ollama runs locally and does not require an API key. Ensure that Ollama is installed and properly configured on your machine. Refer to the [Ollama GitHub repository](https://github.com/ollama/ollama) for installation and setup instructions.

To use a remote Ollama service, define its URL in `providerArgs`:

```bash
exif-ai --provider-args "http://ollama.example.com:8080" -a ollama -i image.jpg
```

```js
providerArgs: ["http://ollama.example.com:8080"],
```
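
When the URL comes from configuration or user input, a small check before building the options object can catch typos early. The sketch below is an assumption-laden illustration: `ollamaProviderArgs` is a hypothetical helper, not part of exif-ai, and restricting the URL to `http` or `https` is a choice made here, not a documented requirement.

```typescript
// Hypothetical helper, not part of exif-ai.
function ollamaProviderArgs(url: string): string[] {
  // The URL constructor throws a TypeError on malformed input.
  const parsed = new URL(url);
  // Assumption: only http and https endpoints make sense here.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return [url];
}

// Option names follow the library example in this README.
const remoteOllamaOptions = {
  tasks: ["description"],
  path: "image.jpg",
  provider: "ollama",
  providerArgs: ollamaProviderArgs("http://ollama.example.com:8080"),
};
```

The resulting `remoteOllamaOptions` object has the same shape as the options shown in the library example and could be passed to `execute`.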

## Develop

### Prerequisites
@@ -151,7 +191,7 @@ First, clone the Exif AI repository to your local machine:
```bash
git clone https://github.com/tychenjiajun/exif-ai.git
cd exif-ai
- ````
+ ```

### Install Dependencies

205 changes: 205 additions & 0 deletions README.zh-CN.md
@@ -0,0 +1,205 @@
# Exif AI

[![NPM Downloads](https://img.shields.io/npm/dw/exif-ai)](https://www.npmjs.com/package/exif-ai)

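## About

_Exif AI_ is a powerful command-line tool that writes AI-generated image descriptions and/or tags directly into the metadata of image files. It uses advanced AI models to analyze image content and generate descriptive metadata, improving the usability and searchability of your images.

## Usage Example

### CLI

#### Without Installation

If you do not want to install Exif AI globally, you can run it directly with npx.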

```bash
npx exif-ai -i example.jpeg -a ollama
```

#### With Installation

If you have already installed Exif AI globally, you can run it directly from the command line.

```bash
exif-ai -i example.jpeg -a ollama
```

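#### Options

Required options:

- `-a, --api-provider <value>`: Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).

Optional options:

- `-T, --tasks <tasks...>`: List of tasks to perform (`description` to generate a description, `tags` to generate tags, `face` for face recognition).
- `-i, --input <file>`: Path to the image file to process.
- `-p, --description-prompt <text>`: Custom prompt for the AI provider to generate the description. Defaults to a generic image description prompt.
- `--tag-prompt <text>`: Custom prompt for the AI provider to generate tags. Defaults to a generic image tagging prompt.
- `-m, --model <name>`: The AI model to use, if supported by the AI provider.
- `-t, --description-tags <tags...>`: List of EXIF tags to write the description to. Defaults to common description tags.
- `--tag-tags <tags...>`: List of EXIF tags to write the tags to. Defaults to common tags.
- `-v, --verbose`: Enable debug output.
- `-d, --dry-run`: Preview AI-generated content without writing to the image file.
- `--exif-tool-write-args <args...>`: Additional ExifTool arguments for writing metadata.
- `--provider-args <args...>`: Additional arguments for the AI provider.
- `-w, --watch <path>`: Watch a directory for new files to process.
- `--avoid-overwrite`: Avoid overwriting if EXIF tags already exist in the file.
- `--ext <extensions...>`: File extensions to watch. Only files with these extensions will be processed.
- `--concurrency <number>`: The number of files to process concurrently in watch mode.
- `--face-group-ids <group...>`: List of face group IDs to use for face recognition.

Example usage: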

```bash
exif-ai -i example.jpg -a ollama -p "Describe this image"
```

### Using as a Library

To use Exif AI as a library in your project, import it and use the provided functions:

```typescript
import { execute } from "exif-ai";

const options = {
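tasks: ["description"], // List of tasks to perform
input: "example.jpg", // Image file to process
provider: "ollama", // Name of the AI provider to use
descriptionTags: ["Description"], // List of EXIF tags to write the description to
tagTags: ["TagsList"], // List of EXIF tags to write the tags to
descriptionPrompt: "Describe this image", // Custom prompt for the AI provider to generate the description
tagPrompt: "Tag this image based on subject, object, event, place", // Custom prompt for the AI provider to generate tags
verbose: true, // Enable debug output
dry: false, // Preview AI-generated content without writing to the image file
writeArgs: [], // Additional ExifTool arguments for writing metadata
providerArgs: [], // Additional arguments for the AI provider
avoidOverwrite: false, // Avoid overwriting if EXIF tags already exist in the file
doNotEndExifTool: false, // Do not end the ExifTool process after writing metadata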
};

execute(options)
.then((result) => {
console.log(result); // Handle the result
})
.catch((error) => {
console.error(error); // Handle the error
});
```

## Installation

To install Exif AI globally, use the following command:

```bash
npm install -g exif-ai
```

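## Tasks

### Description

The `description` task generates a description of the image using the AI provider. The description is written to the EXIF tags defined in `descriptionTags`.

### Tag

The `tags` task generates tags for the image using the AI provider. The tags are written to the EXIF tags defined in `tagTags`.

### Face Recognition

The `face` task performs face recognition on the image using the Tencent Cloud API. The face recognition results are written to the EXIF tags defined in `tagTags`.

Currently, the `face` task requires Tencent Cloud API keys and the Tencent Cloud face recognition service. If you do not have a Tencent Cloud account, register one first and enable the face recognition service.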

```bash
export TENCENTCLOUD_SECRET_ID=your_tencentcloud_secret_id
export TENCENTCLOUD_SECRET_KEY=your_tencentcloud_secret_key
```

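### Note

Please ensure that you manage your API keys securely. Do not expose them in public repositories or other public forums.

## API Providers

Exif AI relies on API providers to generate image descriptions and tags. Currently, three providers are supported: ZhipuAI, Ollama and Google Gemini.

### Supported Providers

- ZhipuAI: A leading AI service provider. Requires an API key.
- Ollama: A local AI service that runs on your machine and requires no API key.
- Google Gemini: A powerful AI service provided by Google.

### Custom Providers

You can also develop your own custom provider by implementing the provider interface. This lets you integrate with other AI services or customize the description generation process.

## Configuration

### Setting API Keys (for ZhipuAI)

To use [ZhipuAI](https://open.bigmodel.cn/usercenter/apikeys), you need to set an API key. You can do this by setting an environment variable: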

```bash
export ZHIPUAI_API_KEY=your_zhipuai_api_key
```

### Google Gemini

To use [Google Gemini](https://ai.google.dev/), you need to set an API key. You can do this by setting an environment variable:

```bash
export API_KEY=your_google_api_key
```

### Ollama Configuration

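Ollama runs locally and does not require an API key. Ensure that Ollama is installed and properly configured on your machine. For installation and setup instructions, refer to [Ollama](https://github.com/ollama/ollama).

To use a remote Ollama service, define its URL in `providerArgs`: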

```bash
exif-ai --provider-args "http://ollama.example.com:8080" -a ollama -i image.jpg
```

```js
providerArgs: ["http://ollama.example.com:8080"],
```

## Develop

### Prerequisites

- Node.js >=16
- pnpm

### Clone the Repository

First, clone the Exif AI repository to your local machine:

```bash
git clone https://github.com/tychenjiajun/exif-ai.git
cd exif-ai
```

### Install Dependencies

Next, install the required dependencies using pnpm.

```bash
pnpm install
```

### Build

```bash
pnpm run build
```

### Watch

```bash
pnpm run watch
```
3 changes: 2 additions & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "exif-ai",
"version": "3.0.9",
"version": "3.1.0",
"description": "A Node.js CLI and library that uses Ollama, ZhipuAI or Google Gemini to intelligently write image description and/or tags to exif metadata by it's content.",
"homepage": "https://github.com/tychenjiajun/exif-ai",
"repository": {
@@ -60,6 +60,7 @@
"ollama": "^0.5.8",
"p-limit": "^6.1.0",
"sharp": "0.32.6",
"tencentcloud-sdk-nodejs-iai": "^4.0.918",
"ts-extras": "^0.13.0",
"xhr2": "^0.2.1"
},