feat: add face task

tychenjiajun · Sep 17, 2024 · f2ee9cc · f2ee9cc
1 parent e02b156
commit f2ee9cc
Showing 9 changed files with 602 additions and 54 deletions.
diff --git a/README.md b/README.md
@@ -2,6 +2,9 @@
 
 [![NPM Downloads](https://img.shields.io/npm/dw/exif-ai)](https://www.npmjs.com/package/exif-ai)
 
+_Read this in other languages:_
+[_简体中文_](README.zh-CN.md),
+
 ## About
 
 _Exif AI_ is a powerful CLI tool designed to write AI-generated image descriptions and/or tags directly into the metadata of image files. This tool leverages advanced AI models to analyze image content and generate descriptive metadata, enhancing the accessibility and searchability of your images.
@@ -30,11 +33,11 @@ exif-ai -i example.jpeg -a ollama
 
 Required options:
 
-- `-a, --api-provider <value>` Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).
+- `-a, --api-provider <value>`: Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).
 
 Optional options:
 
-- `-T, --tasks <tasks...>`: List of tasks to perform ('description' and/or 'tag').
+- `-T, --tasks <tasks...>`: List of tasks to perform ('description', 'tag', 'face').
 - `-i, --input <file>`: Path to the input image file.
 - `-p, --description-prompt <text>`: Custom prompt for the AI provider to generate description. Defaults to a generic image description prompt.
 - `--tag-prompt <text>`: Custom prompt for the AI provider to generate tags. Defaults to a generic image tagging prompt.
@@ -49,6 +52,7 @@ Optional options:
 - `--avoid-overwrite`: Avoid overwriting if EXIF tags already exist in the file.
 - `--ext <extensions...>`: File extensions to watch. Only files with these extensions will be processed.
 - `--concurrency <number>`: The number of files to process concurrently in watch mode.
+- `--face-group-ids <group...>`: List of face group IDs to use for face recognition.
 
 Example usage:
 
@@ -64,24 +68,25 @@ To use Exif AI as a library in your project, import it and use the provided func
 import { execute } from "exif-ai";
 
 const options = {
-  path: "example.jpeg", // Path to the input image file
-  provider: "ollama", // AI provider to use (e.g., 'ollama', 'zhipu', 'google')
-  model: "moondream", // Optional: Specific AI model to use (if supported by the provider)
+  tasks: ["description"], // List of tasks to perform
+  path: "example.jpg", // Path to the input image file
+  provider: "ollama", // Name of the AI provider to use
   descriptionTags: [
     "XPComment",
     "Description",
     "ImageDescription",
     "Caption-Abstract",
-  ], // Optional: EXIF tags to write the description to
-  tagTags: ["Subject", "TagsList", "Keywords"], // Optional: EXIF tags to write the tags to
-  descriptionPrompt: "请使用中文描述这个图片。", // Optional: Custom prompt for the AI provider to generate description
-  tagPrompt:
-    "Tag this image based on subject, object, event, place. Output format: <tag1>, <tag2>, <tag3>, <tag4>,  <tag5>,  ..., <tagN>", // Optional: Custom prompt for the AI provider to generate tags
-  verbose: false, // Optional: Enable verbose logging for debugging
-  dry: false, // Optional: Perform a dry run without writing to the file
-  writeArgs: [], // Optional: Additional arguments for EXIF write task
-  providerArgs: [], // Optional: Additional arguments for the AI provider
-  avoidOverwrite: true, // Optional: Avoid overwriting existing tags
+  ], // List of EXIF tags to write the description to
+  tagTags: ["Subject", "TagsList", "Keywords"], // List of EXIF tags to write the tags to
+  descriptionPrompt: "Describe this landscape photo.", // Custom prompt for the AI provider to generate description
+  tagPrompt: "Tag this image based on subject, object, event, place.", // Custom prompt for the AI provider to generate tags
+  verbose: false, // Enable verbose output for debugging
+  dry: false, // Preview AI-generated content without writing to the image file
+  writeArgs: [], // Additional ExifTool arguments for writing metadata
+  providerArgs: [], // Additional arguments for the AI provider
+  avoidOverwrite: false, // Avoid overwriting if EXIF tags already exist in the file
+  doNotEndExifTool: false, // Do not end ExifTool process after writing metadata
+  faceGroupIds: [], // List of face group IDs to use for face recognition
 };
 
 execute(options)
@@ -101,9 +106,34 @@ To install Exif AI globally, use the following command:
 npm install -g exif-ai
 ```
 
+## Tasks
+
+### Description
+
+The `description` task generates a description of the image using the AI provider. The description is written to the specified EXIF tags defined in `descriptionTags`.
+
+### Tag
+
+The `tag` task generates tags for the image using the AI provider. The tags are written to the specified EXIF tags defined in `tagTags`.
+
+### Face Recognition
+
+The `face` task performs face recognition on the image using the [Tencent Cloud API](https://cloud.tencent.com/document/api/867/44994). The face recognition results are written to the specified EXIF tags defined in `tagTags`.
+
+Currently, the `face` task requires you to enable the face recognition service on Tencent Cloud and to set a Tencent Cloud API Secret ID and Secret Key as environment variables.
+
+```bash
+export TENCENTCLOUD_SECRET_ID=your_tencentcloud_secret_id
+export TENCENTCLOUD_SECRET_KEY=your_tencentcloud_secret_key
+```
+
+### Note
+
+Please ensure that you securely manage your API keys. Do not expose them in public repositories or other public forums.
+
 ## API Providers
 
-Exif AI relies on API providers to generate image descriptions. Currently, we support three providers: ZhipuAI, Ollama and Google Gemini.
+Exif AI relies on API providers to generate image descriptions and tags. Currently, we support three providers: ZhipuAI, Ollama and Google Gemini.
 
 ### Supported Providers
 
@@ -137,6 +167,16 @@ export API_KEY=your_google_api_key
 
 Ollama runs locally and does not require an API key. Ensure that Ollama is installed and properly configured on your machine. Refer to the [Ollama GitHub repository](https://github.com/ollama/ollama) for installation and setup instructions.
 
+To use a remote Ollama service, define its URL in `providerArgs`:
+
+```bash
+exif-ai --provider-args "http://ollama.example.com:8080" -a ollama -i image.jpg
+```
+
+```js
+providerArgs: ["http://ollama.example.com:8080"],
+```
+
 ## Develop
 
 ### Prerequisites
@@ -151,7 +191,7 @@ First, clone the Exif AI repository to your local machine:
 ```bash
 git clone https://github.com/tychenjiajun/exif-ai.git
 cd exif-ai
-````
+```
 
 ### Install Dependencies
 

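The face task described in the README changes above depends on two environment variables and on the `faceGroupIds` option. The following is a minimal sketch, not code from this commit, of how a caller might check those prerequisites before building an options object. The helper names (`missingTencentVars`, `buildFaceOptions`) and the `FaceTaskOptions` type are hypothetical; the field names are taken from the README example, and the `provider` value is only a placeholder because the README lists the provider as required.

```typescript
// Hypothetical helper: not part of exif-ai. Field names follow the README example.
type Env = Record<string, string | undefined>;

interface FaceTaskOptions {
  tasks: string[];
  path: string;
  provider: string;
  tagTags: string[];
  faceGroupIds: string[];
}

// Variable names as documented in the README's "Face Recognition" section.
const REQUIRED_VARS = ["TENCENTCLOUD_SECRET_ID", "TENCENTCLOUD_SECRET_KEY"] as const;

// Return the names of required variables that are unset or empty.
function missingTencentVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Build an options object for the face task, failing early if credentials are absent.
function buildFaceOptions(env: Env, path: string, faceGroupIds: string[]): FaceTaskOptions {
  const missing = missingTencentVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    tasks: ["face"],
    path,
    provider: "ollama", // placeholder: the README marks the provider as required
    tagTags: ["Subject", "TagsList", "Keywords"], // face results are written to tagTags
    faceGroupIds,
  };
}
```

In a Node.js program, one would presumably pass `process.env` as `env` and hand the returned object to `execute` from `exif-ai`; that last step is omitted here so the sketch runs without the package installed.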
diff --git a/README.zh-CN.md b/README.zh-CN.md
@@ -0,0 +1,205 @@
+# Exif AI
+
+[![NPM Downloads](https://img.shields.io/npm/dw/exif-ai)](https://www.npmjs.com/package/exif-ai)
+
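+## About
+
+_Exif AI_ is a powerful CLI tool designed to write AI-generated image descriptions and/or tags directly into the metadata of image files. This tool leverages advanced AI models to analyze image content and generate descriptive metadata, enhancing the usability and searchability of your images.
+
+## Usage Example
+
+### CLI
+
+#### Without Installation
+
+If you do not want to install Exif AI globally, you can run it directly with npx.
+
+```bash
+npx exif-ai -i example.jpeg -a ollama
+```
+
+#### With Installation
+
+If you have installed Exif AI globally, you can run it directly from the command line.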
+
+```bash
+exif-ai -i example.jpeg -a ollama
+```
+
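+#### Options
+
+Required options:
+
+- `-a, --api-provider <value>`: Name of the AI provider to use (`ollama` for Ollama, `zhipu` for ZhipuAI, `google` for Google Gemini).
+
+Optional options:
+
+- `-T, --tasks <tasks...>`: List of tasks to perform (`description` to generate a description, `tag` to generate tags, `face` for face recognition).
+- `-i, --input <file>`: Path to the input image file.
+- `-p, --description-prompt <text>`: Custom prompt for the AI provider to generate description. Defaults to a generic image description prompt.
+- `--tag-prompt <text>`: Custom prompt for the AI provider to generate tags. Defaults to a generic image tagging prompt.
+- `-m, --model <name>`: Specific AI model to use, if supported by the AI provider.
+- `-t, --description-tags <tags...>`: List of EXIF tags to write the description to. Defaults to common description tags.
+- `--tag-tags <tags...>`: List of EXIF tags to write the tags to. Defaults to common tags.
+- `-v, --verbose`: Enable verbose output for debugging.
+- `-d, --dry-run`: Preview AI-generated content without writing to the image file.
+- `--exif-tool-write-args <args...>`: Additional ExifTool arguments for writing metadata.
+- `--provider-args <args...>`: Additional arguments for the AI provider.
+- `-w, --watch <path>`: Watch a directory for new files to process.
+- `--avoid-overwrite`: Avoid overwriting if EXIF tags already exist in the file.
+- `--ext <extensions...>`: File extensions to watch. Only files with these extensions will be processed.
+- `--concurrency <number>`: The number of files to process concurrently in watch mode.
+- `--face-group-ids <group...>`: List of face group IDs to use for face recognition.
+
+Example usage:
+
+```bash
+exif-ai -i example.jpg -a ollama -p "Describe this image"
+```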
+
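+### Use as a Library
+
+To use Exif AI as a library in your project, import it and use the provided functions:
+
+```typescript
+import { execute } from "exif-ai";
+
+const options = {
+  tasks: ["description"], // List of tasks to perform
+  input: "example.jpg", // Path to the input image file
+  provider: "ollama", // Name of the AI provider to use
+  descriptionTags: ["Description"], // List of EXIF tags to write the description to
+  tagTags: ["TagsList"], // List of EXIF tags to write the tags to
+  descriptionPrompt: "Describe this image", // Custom prompt for the AI provider to generate description
+  tagPrompt: "Tag this image based on subject, object, event, place", // Custom prompt for the AI provider to generate tags
+  verbose: true, // Enable verbose output for debugging
+  dry: false, // Preview AI-generated content without writing to the image file
+  writeArgs: [], // Additional ExifTool arguments for writing metadata
+  providerArgs: [], // Additional arguments for the AI provider
+  avoidOverwrite: false, // Avoid overwriting if EXIF tags already exist in the file
+  doNotEndExifTool: false, // Do not end ExifTool process after writing metadata
+};
+
+execute(options)
+  .then((result) => {
+    console.log(result); // Handle the result
+  })
+  .catch((error) => {
+    console.error(error); // Handle the error
+  });
+```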
+
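+## Installation
+
+To install Exif AI globally, use the following command:
+
+```bash
+npm install -g exif-ai
+```
+
+## Tasks
+
+### Description
+
+The `description` task generates a description of the image using the AI provider. The description is written to the EXIF tags defined in `descriptionTags`.
+
+### Tag
+
+The `tag` task generates tags for the image using the AI provider. The tags are written to the EXIF tags defined in `tagTags`.
+
+### Face Recognition
+
+The `face` task performs face recognition on the image using the Tencent Cloud API. The face recognition results are written to the EXIF tags defined in `tagTags`.
+
+Currently, the `face` task requires a Tencent Cloud API key and the Tencent Cloud face recognition service. If you do not have a Tencent Cloud account, register one first and enable the face recognition service.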
+
+```bash
+export TENCENTCLOUD_SECRET_ID=your_tencentcloud_secret_id
+export TENCENTCLOUD_SECRET_KEY=your_tencentcloud_secret_key
+```
+
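+### Note
+
+Please ensure that you securely manage your API keys. Do not expose them in public repositories or other public forums.
+
+## API Providers
+
+Exif AI relies on API providers to generate image descriptions and tags. Currently, we support three providers: ZhipuAI, Ollama and Google Gemini.
+
+### Supported Providers
+
+- ZhipuAI: A leading AI service provider. Requires an API key.
+- Ollama: A local AI service that runs on your machine, with no API key required.
+- Google Gemini: A powerful AI service provided by Google.
+
+### Custom Providers
+
+You can also develop your own custom provider by implementing the provider interface. This allows you to integrate with other AI services or customize the description generation process.
+
+## Configuration
+
+### Setting API Keys (for ZhipuAI)
+
+To use [ZhipuAI](https://open.bigmodel.cn/usercenter/apikeys), you need to set an API key. You can do this by setting an environment variable: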
+
+```bash
+export ZHIPUAI_API_KEY=your_zhipuai_api_key
+```
+
+### Google Gemini
+
+To use [Google Gemini](https://ai.google.dev/), you need to set an API key. You can do this by setting an environment variable:
+
+```bash
+export API_KEY=your_google_api_key
+```
+
+### Ollama Configuration
+
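+Ollama runs locally and does not require an API key. Ensure that Ollama is installed and properly configured on your machine. Refer to [Ollama](https://github.com/ollama/ollama) for installation and setup instructions.
+
+To use a remote Ollama service, define its URL in `providerArgs`:
+
+```bash
+exif-ai --provider-args "http://ollama.example.com:8080" -a ollama -i image.jpg
+```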
+
+```js
+providerArgs: ["http://ollama.example.com:8080"],
+```
+
+## Develop
+
+### Prerequisites
+
+- Node.js >=16
+- pnpm
+
+### Clone the Repository
+
+First, clone the Exif AI repository to your local machine:
+
+```bash
+git clone https://github.com/tychenjiajun/exif-ai.git
+cd exif-ai
+```
+
+### Install Dependencies
+
+Next, install the required dependencies using pnpm.
+
+```bash
+pnpm install
+```
+
+### Build
+
+```bash
+pnpm run build
+```
+
+### Watch
+
+```bash
+pnpm run watch
+```
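The READMEs in this commit document the same settings twice, once as CLI flags and once as library option names. The sketch below spells out how the two appear to correspond, judging only from their matching descriptions. The table and the `toLibraryOptions` helper are hypothetical and are not part of exif-ai; in particular, mapping `--input` to `path` follows the English README example, while the Chinese README example uses `input` for the same field.

```typescript
// Inferred correspondence between CLI flags and library option names.
// This mapping is an assumption based on the README descriptions, not exif-ai source code.
const FLAG_TO_OPTION: Record<string, string> = {
  "--tasks": "tasks",
  "--input": "path",
  "--api-provider": "provider",
  "--description-prompt": "descriptionPrompt",
  "--tag-prompt": "tagPrompt",
  "--description-tags": "descriptionTags",
  "--tag-tags": "tagTags",
  "--verbose": "verbose",
  "--dry-run": "dry",
  "--exif-tool-write-args": "writeArgs",
  "--provider-args": "providerArgs",
  "--avoid-overwrite": "avoidOverwrite",
  "--face-group-ids": "faceGroupIds",
};

// Convert a flag-keyed object into an option-keyed object, rejecting unknown flags.
function toLibraryOptions(flags: Record<string, unknown>): Record<string, unknown> {
  const options: Record<string, unknown> = {};
  for (const [flag, value] of Object.entries(flags)) {
    const key = FLAG_TO_OPTION[flag];
    if (key === undefined) {
      throw new Error(`Unknown flag: ${flag}`);
    }
    options[key] = value;
  }
  return options;
}
```

For example, `toLibraryOptions({ "--dry-run": true, "--face-group-ids": ["friends"] })` yields an object with `dry` and `faceGroupIds` keys, where `"friends"` is a made-up group ID.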
diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "exif-ai",
-  "version": "3.0.9",
+  "version": "3.1.0",
   "description": "A Node.js CLI and library that uses Ollama, ZhipuAI or Google Gemini to intelligently write image description and/or tags to exif metadata by its content.",
   "homepage": "https://github.com/tychenjiajun/exif-ai",
   "repository": {
@@ -60,6 +60,7 @@
     "ollama": "^0.5.8",
     "p-limit": "^6.1.0",
     "sharp": "0.32.6",
+    "tencentcloud-sdk-nodejs-iai": "^4.0.918",
     "ts-extras": "^0.13.0",
     "xhr2": "^0.2.1"
   },