[MMSIG-176] Add GLIP demo to Inference.md (#10472)
xin-li-67 committed Jun 28, 2023
1 parent ebdebdf commit 9e72ec4
Showing 3 changed files with 201 additions and 2 deletions.
7 changes: 5 additions & 2 deletions configs/glip/README.md
@@ -27,9 +27,12 @@ mim install mmdet[multimodal]
```shell
cd $MMDETROOT

-python demo/multimodal_demo.py demo/demo.jpg "bench . car . " \
+wget https://download.openmmlab.com/mmdetection/v3.0/glip/glip_tiny_a_mmdet-b3654169.pth
+
+python demo/image_demo.py demo/demo.jpg \
configs/glip/glip_atss_swin-t_a_fpn_dyhead_pretrain_obj365.py \
-https://download.openmmlab.com/mmdetection/v3.0/glip/glip_tiny_a_mmdet-b3654169.pth
+glip_tiny_a_mmdet-b3654169.pth \
+--texts 'bench . car .'
```

<div align=center>
98 changes: 98 additions & 0 deletions docs/en/user_guides/inference.md
@@ -186,3 +186,101 @@ python demo/video_gpuaccel_demo.py demo/demo.mp4 \
checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth \
--nvdecode --out result.mp4
```

## Multi-modal algorithm inference demo and evaluation

As multimodal vision-language algorithms continue to evolve, MMDetection has added support for them as well. This section uses the GLIP algorithm and its models as an example to show how to run the demo and evaluation scripts for multimodal algorithms. MMDetection also ships a [gradio_demo project](../../../projects/gradio_demo/) that lets developers quickly try out every image-input task supported by MMDetection on their local machine; see the [document](../../../projects/gradio_demo/README.md) for details.

### Preparation

Please first make sure that you have the correct dependencies installed:

```shell
# if MMDetection was installed from source
pip install -r requirements/multimodal.txt

# if MMDetection was installed as a wheel (e.g. via mim/pip)
mim install mmdet[multimodal]
```
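
A quick way to confirm that the installation succeeded is to import the packages from Python. This is only a minimal sanity check, and it assumes that `transformers` is among the dependencies pulled in by `requirements/multimodal.txt` (GLIP uses its text tokenizer); adjust the imports if your environment differs.

```python
# Minimal installation sanity check.
# Assumption: `transformers` is one of the multimodal dependencies.
import mmdet
import transformers

print('mmdet:', mmdet.__version__)
print('transformers:', transformers.__version__)
```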

MMDetection already implements the GLIP algorithm and provides pre-trained weights, which can be downloaded directly:

```shell
cd mmdetection
wget https://download.openmmlab.com/mmdetection/v3.0/glip/glip_tiny_a_mmdet-b3654169.pth
```

### Inference

Once the model has been downloaded, you can run inference with the `demo/image_demo.py` script.

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts bench
```

The demo result will be similar to this:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234547841-266476c8-f987-4832-8642-34357be621c6.png" height="300"/>
</div>
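
The same call can also be made from Python through the `DetInferencer` API that `demo/image_demo.py` wraps. The snippet below is only a sketch: it assumes the GLIP config path shown in `configs/glip/README.md` and that your MMDetection version forwards a `texts` argument to the inferencer call, as the demo script does.

```python
# Sketch of the Python equivalent of the CLI demo above.
# Assumptions: the GLIP config path below and a `texts` keyword on the call.
from mmdet.apis import DetInferencer

inferencer = DetInferencer(
    model='configs/glip/glip_atss_swin-t_a_fpn_dyhead_pretrain_obj365.py',
    weights='glip_tiny_a_mmdet-b3654169.pth')

# Open-vocabulary detection with a text prompt; visualizations go to out_dir.
inferencer('demo/demo.jpg', texts='bench .', out_dir='outputs/')
```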

To detect multiple targets, declare them after `--texts` in the format `xx . xx .`, i.e. category names separated by ` . `.

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'bench . car .'
```

The result will look like this:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234548156-ef9bbc2e-7605-4867-abe6-048b8578893d.png" height="300"/>
</div>
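
The ` . ` separator is simply a convention for packing several category names into a single prompt. As a rough illustration (not MMDetection's internal parsing code), the prompt decomposes into individual class names like this:

```python
# Illustration only: how a multi-category GLIP prompt maps to class names.
prompt = 'bench . car .'
classes = [name.strip() for name in prompt.split('.') if name.strip()]
print(classes)  # ['bench', 'car']
```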

You can also use a sentence as the input prompt for the `--texts` field, for example:

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'There are a lot of cars here.'
```

The result will be similar to this:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234548490-d2e0a16d-1aad-4708-aea0-c829634fabbd.png" height="300"/>
</div>

### Evaluation

The GLIP implementation in MMDetection has no performance degradation compared with the official one; our benchmark is as follows:

| Model | official mAP | mmdet mAP |
| ----------------------- | :----------: | :-------: |
| glip_A_Swin_T_O365.yaml | 42.9 | 43.0 |
| glip_Swin_T_O365.yaml | 44.9 | 44.9 |
| glip_Swin_L.yaml | 51.4 | 51.3 |

Users can also run evaluation with the provided test scripts. Here is a basic example:

```shell
# single GPU
python tools/test.py configs/glip/glip_atss_swin-t_fpn_dyhead_pretrain_obj365.py glip_tiny_a_mmdet-b3654169.pth

# 8 GPUs
./tools/dist_test.sh configs/glip/glip_atss_swin-t_fpn_dyhead_pretrain_obj365.py glip_tiny_a_mmdet-b3654169.pth 8
```

The result will be similar to this:

```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.428
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.594
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.466
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.300
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.477
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.534
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.473
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.690
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.789
```
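
`tools/test.py` is a thin wrapper around MMEngine's `Runner`, so the same evaluation can be driven from Python. The following is a minimal sketch, assuming the config used above and a COCO val2017 dataset already placed where that config expects it:

```python
# Programmatic evaluation sketch (roughly what tools/test.py does).
# Assumptions: the GLIP config below and a prepared COCO val2017 dataset.
from mmengine.config import Config
from mmengine.runner import Runner

cfg = Config.fromfile(
    'configs/glip/glip_atss_swin-t_fpn_dyhead_pretrain_obj365.py')
cfg.load_from = 'glip_tiny_a_mmdet-b3654169.pth'  # checkpoint to evaluate
cfg.work_dir = './work_dirs/glip_eval'            # logs and metrics go here

runner = Runner.from_cfg(cfg)
metrics = runner.test()  # runs the test loop and prints COCO-style metrics
print(metrics)
```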
98 changes: 98 additions & 0 deletions docs/zh_cn/user_guides/inference.md
@@ -185,3 +185,101 @@ python demo/video_gpuaccel_demo.py demo/demo.mp4 \
checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth \
--nvdecode --out result.mp4
```

## Multi-modal algorithm inference and evaluation

As multimodal vision algorithms continue to evolve, MMDetection has added support for them as well. This section uses the GLIP algorithm and its models to demonstrate how to use the corresponding multimodal demo and eval scripts. MMDetection also provides a [gradio_demo project](../../../projects/gradio_demo/) under `projects`; following its [document](../../../projects/gradio_demo/README.md), users can quickly try out the various image-input tasks supported by MMDetection locally.

### Model preparation

First, install the multimodal dependencies:

```shell
# if MMDetection was installed from source
pip install -r requirements/multimodal.txt

# if MMDetection was installed as a wheel (e.g. via mim/pip)
mim install mmdet[multimodal]
```

MMDetection has already integrated the GLIP algorithm and models; the weights can be downloaded directly from the link:

```shell
cd mmdetection
wget https://download.openmmlab.com/mmdetection/v3.0/glip/glip_tiny_a_mmdet-b3654169.pth
```
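
After the download finishes, you can optionally verify that the checkpoint file loads cleanly before moving on. This is just a small sketch; it assumes the usual MMDetection checkpoint layout with `state_dict` and `meta` entries.

```python
# Optional integrity check on the downloaded checkpoint.
# Assumption: the checkpoint stores 'state_dict' / 'meta' entries as usual.
import torch

ckpt = torch.load('glip_tiny_a_mmdet-b3654169.pth', map_location='cpu')
print('top-level keys:', list(ckpt.keys()))
print('parameter tensors:', len(ckpt.get('state_dict', {})))
```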

### Inference demo

Once the download finishes, we can run inference with the multimodal inference script under `demo`:

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts bench
```

The demo result is shown in the image below:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234547841-266476c8-f987-4832-8642-34357be621c6.png" height="300"/>
</div>

To detect multiple categories, declare the target types after the `--texts` field in the format `xx . xx .`:

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'bench . car .'
```

The result is shown below:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234548156-ef9bbc2e-7605-4867-abe6-048b8578893d.png" height="300"/>
</div>
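
When the underlying `DetInferencer` is called from Python instead of the shell script, the predictions can be inspected directly. A minimal sketch; the result keys used below (`predictions`, `labels`, `scores`, `bboxes`) follow the inferencer's usual output format but are assumptions that may differ between versions.

```python
# Sketch: inspect GLIP predictions from Python rather than the CLI.
# The result keys below are assumptions and may vary across versions.
from mmdet.apis import DetInferencer

inferencer = DetInferencer(
    model='configs/glip/glip_atss_swin-t_a_fpn_dyhead_pretrain_obj365.py',
    weights='glip_tiny_a_mmdet-b3654169.pth')

result = inferencer('demo/demo.jpg', texts='bench . car .')
pred = result['predictions'][0]
for label, score, bbox in zip(pred['labels'], pred['scores'], pred['bboxes']):
    if score > 0.3:
        print(label, round(score, 3), bbox)
```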

The inference script also accepts a full sentence as the input to the `--texts` field:

```shell
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'There are a lot of cars here.'
```

The result will be similar to the image below:

<div align=center>
<img src="https://user-images.githubusercontent.com/17425982/234548490-d2e0a16d-1aad-4708-aea0-c829634fabbd.png" height="300"/>
</div>

### Evaluation demo

The GLIP implementation in MMDetection has no accuracy loss compared with the official version; the benchmark is as follows:

| Model | official mAP | mmdet mAP |
| ----------------------- | :----------: | :-------: |
| glip_A_Swin_T_O365.yaml | 42.9 | 43.0 |
| glip_Swin_T_O365.yaml | 44.9 | 44.9 |
| glip_Swin_L.yaml | 51.4 | 51.3 |

Users can evaluate model accuracy with the `test.py` script, as shown below:

```shell
# single GPU
python tools/test.py configs/glip/glip_atss_swin-t_fpn_dyhead_pretrain_obj365.py glip_tiny_a_mmdet-b3654169.pth

# 8 GPUs
./tools/dist_test.sh configs/glip/glip_atss_swin-t_fpn_dyhead_pretrain_obj365.py glip_tiny_a_mmdet-b3654169.pth 8
```

The evaluation result will be roughly as follows:

```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.428
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.594
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.466
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.300
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.477
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.534
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.473
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.690
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.789
```
