[Model] Add new model: LMDeploy api #688

Merged: 6 commits, Dec 25, 2024
27 changes: 27 additions & 0 deletions docs/en/EvalByLMDeploy.md
@@ -0,0 +1,27 @@
# Using LMDeploy to Accelerate Evaluation and Inference

VLMEvalKit supports testing VLM models deployed by LMDeploy. Below, we use InternVL2-8B as an example to show how to test the model.

## Step 0: Install LMDeploy

```bash
pip install lmdeploy
```
For other installation methods, you can refer to LMDeploy's [documentation](https://github.com/InternLM/lmdeploy).

## Step 1: Start the Inference Service

```bash
lmdeploy serve api_server OpenGVLab/InternVL2-8B --model-name InternVL2-8B
```
> [!IMPORTANT]
> Since models in VLMEvalKit may have custom behaviors when building prompts for different datasets, such as InternVL2's handling of HallusionBench, `--model-name` must be specified when starting the server. This allows VLMEvalKit to select the appropriate prompt-construction strategy based on the name when using the LMDeploy API.
>
> If `--server-port` is specified, the corresponding environment variable `LMDEPLOY_API_BASE` needs to be set accordingly.
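
For example, when running the server on a non-default port, the environment variable has to point at the matching endpoint. The sketch below is one plausible setup; the exact URL format that VLMEvalKit expects for `LMDEPLOY_API_BASE` (host, port, and path) should be checked against the VLMEvalKit source, and the port `12345` here is just an illustration:

```bash
# Start the server on a custom port (lmdeploy's default is 23333)
lmdeploy serve api_server OpenGVLab/InternVL2-8B \
    --model-name InternVL2-8B \
    --server-port 12345

# Point VLMEvalKit at the matching OpenAI-compatible chat endpoint
export LMDEPLOY_API_BASE=http://0.0.0.0:12345/v1/chat/completions
```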


## Step 2: Evaluation

```bash
python run.py --data MMStar --model lmdeploy --verbose --nproc 64
```
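
Since LMDeploy serves an OpenAI-compatible endpoint, it can help to know the shape of the chat-completion request that gets sent during evaluation. The sketch below only builds an illustrative request body; the image URL and question are made up, and the actual request is constructed inside VLMEvalKit's LMDeploy wrapper:

```python
import json


def build_chat_payload(model: str, image_url: str, question: str) -> dict:
    # OpenAI-compatible chat completion body with one mixed image + text turn
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


payload = build_chat_payload(
    "InternVL2-8B", "https://example.com/cat.png", "Describe the image."
)
print(json.dumps(payload, indent=2))
```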
7 changes: 7 additions & 0 deletions docs/en/Quickstart.md
@@ -44,6 +44,8 @@ To infer with API models (GPT-4v, Gemini-Pro-V, etc.) or use LLM APIs as the **j
# Hunyuan-Vision API
HUNYUAN_SECRET_KEY=
HUNYUAN_SECRET_ID=
# LMDeploy API
LMDEPLOY_API_BASE=
# You can also set a proxy for calling api models during the evaluation stage
EVAL_PROXY=
```
@@ -146,3 +148,8 @@ CUDA_VISIBLE_DEVICES=1,2,3 torchrun --nproc-per-node=3 run.py --data HallusionBe
```
- If the local judge LLM is not good enough at following instructions, the evaluation may fail. Please report such failures (e.g., via issues).
- It's possible to deploy the judge LLM in different ways, e.g., use a private LLM (not from HuggingFace) or use a quantized LLM. Please refer to the [LMDeploy doc](https://lmdeploy.readthedocs.io/en/latest/serving/api_server.html). You can use any other deployment framework as long as it supports the OpenAI API.


### Using LMDeploy to Accelerate Evaluation and Inference

You can refer to this [doc](/docs/en/EvalByLMDeploy.md).
28 changes: 28 additions & 0 deletions docs/zh-CN/EvalByLMDeploy.md
@@ -0,0 +1,28 @@
# Using LMDeploy to Accelerate Evaluation and Inference

VLMEvalKit supports testing VLM models deployed with LMDeploy. Below, we use InternVL2-8B as an example to show how to test a model.

## Step 0: Install LMDeploy

```bash
pip install lmdeploy
```

For other installation methods, you can refer to LMDeploy's [documentation](https://github.com/InternLM/lmdeploy).

## Step 1: Start the Inference Service

```bash
lmdeploy serve api_server OpenGVLab/InternVL2-8B --model-name InternVL2-8B
```
> [!IMPORTANT]
> Since models in VLMEvalKit may have custom behaviors when building prompts for different datasets, such as InternVL2's handling of HallusionBench, `--model-name` must be specified when starting the server. This allows VLMEvalKit to select the appropriate prompt-construction strategy based on the name when using the LMDeploy API.
>
> If `--server-port` is specified, the corresponding environment variable `LMDEPLOY_API_BASE` needs to be set accordingly.


## Step 2: Evaluation

```bash
python run.py --data MMStar --model InternVL2-8B --verbose --nproc 64
```
6 changes: 6 additions & 0 deletions docs/zh-CN/Quickstart.md
@@ -43,6 +43,8 @@ pip install -e .
# Hunyuan-Vision API
HUNYUAN_SECRET_KEY=
HUNYUAN_SECRET_ID=
# LMDeploy API
LMDEPLOY_API_BASE=
# You can also set a proxy for calling API models during the evaluation stage
EVAL_PROXY=
```
@@ -145,3 +147,7 @@ CUDA_VISIBLE_DEVICES=1,2,3 torchrun --nproc-per-node=3 run.py --data HallusionBe
```
- If the local judge LLM is not good enough at following instructions, the evaluation may fail. Please report such failures via issues.
- The judge LLM can be deployed in different ways, e.g., using a private LLM (not from HuggingFace) or a quantized LLM. Please refer to the [LMDeploy doc](https://lmdeploy.readthedocs.io/en/latest/serving/api_server.html). You can use any other deployment framework as long as it supports the OpenAI API.

### Using LMDeploy to Accelerate Model Inference

You can refer to this [doc](/docs/zh-CN/EvalByLMDeploy.md).
3 changes: 2 additions & 1 deletion vlmeval/api/__init__.py
@@ -14,6 +14,7 @@
from .bluelm_v_api import BlueLMWrapper, BlueLM_V_API
from .jt_vl_chat import JTVLChatAPI
from .taiyi import TaiyiAPI
from .lmdeploy import LMDeployAPI
from .taichu import TaichuVLAPI


@@ -23,6 +24,6 @@
'Claude3V', 'Claude_Wrapper', 'Reka', 'GLMVisionAPI',
'CWWrapper', 'SenseChatVisionAPI', 'HunyuanVision', 'Qwen2VLAPI',
'BlueLMWrapper', 'BlueLM_V_API', 'JTVLChatAPI', 'bailingMMAPI',
'TaiyiAPI', 'TeleMMAPI', 'SiliconFlowAPI',
'TaiyiAPI', 'TeleMMAPI', 'SiliconFlowAPI', 'LMDeployAPI',
'TaichuVLAPI'
]