Commit 74be603

add multi lang ocr (PaddlePaddle#1702)
1 parent d53f641 commit 74be603

76 files changed, with 3314 additions and 6447 deletions.

Diff for: README.md

+1
Original file line number | Diff line number | Diff line change
@@ -231,3 +231,4 @@ We welcome you to contribute code to PaddleHub, and thank you for your feedback.
231231
* Many thanks to [zl1271](https://github.com/zl1271) for fixing serving docs typo
232232
* Many thanks to [AK391](https://github.com/AK391) for adding the webdemo of UGATIT and deoldify models in Hugging Face spaces
233233
* Many thanks to [itegel](https://github.com/itegel) for fixing quick start docs typo
234+
* Many thanks to [AK391](https://github.com/AK391) for adding the webdemo of Photo2Cartoon model in Hugging Face spaces

Diff for: README_ch.md

+1
@@ -247,3 +247,4 @@ print(results)
247247
* Many thanks to [zl1271](https://github.com/zl1271) for fixing typos in the serving docs
248248
* Many thanks to [AK391](https://github.com/AK391) for adding the web demo of the UGATIT and deoldify models in Hugging Face spaces
249249
* Many thanks to [itegel](https://github.com/itegel) for fixing typos in the quick start docs
250+
* Many thanks to [AK391](https://github.com/AK391) for adding the web demo of the Photo2Cartoon model in Hugging Face spaces

Diff for: docs/docs_en/visualization.md

+2
@@ -50,6 +50,8 @@
5050

5151
**UGATIT Selfie2anime Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/U-GAT-IT-selfie2anime)
5252

53+
**Photo2Cartoon Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/photo2cartoon)
54+
5355

5456
### Object Detection
5557
- Pedestrian detection, vehicle detection, and more industrial-grade ultra-large-scale pretrained models are provided.
@@ -0,0 +1,173 @@
# arabic_ocr_db_crnn_mobile

|Module Name|arabic_ocr_db_crnn_mobile|
| :--- | :---: |
|Category|Image - Text Recognition|
|Network|Differentiable Binarization+CRNN|
|Dataset|icdar2015|
|Fine-tuning supported or not||
|Latest update date|2021-12-2|
|Data indicators|-|
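
## I. Basic Information

- ### Module Introduction

  - The arabic_ocr_db_crnn_mobile Module recognizes Arabic-script text in images, including Arabic, Persian, and Uyghur. It takes the text boxes detected by multi_languages_ocr_db_crnn and recognizes the Arabic-script text inside them. Recognition uses CRNN (Convolutional Recurrent Neural Network), a combination of a DCNN and an RNN designed for recognizing sequence-like objects in images. Used together with CTC loss, it learns directly from word-level or line-level text annotations and needs no detailed character-level annotations. This Module is a lightweight OCR model for Arabic-script text and supports direct prediction.

  - For more information, please refer to:
    - [Real-time Scene Text Detection with Differentiable Binarization](https://arxiv.org/pdf/1911.08947.pdf)
    - [An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)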
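
The role of CTC in the description above can be illustrated with a minimal greedy decoder. This is a sketch under stated assumptions, not the Module's actual decoding code: the blank index (`0`), the toy two-letter alphabet, and the name `ctc_greedy_decode` are all hypothetical. It shows why character-level annotations are not needed: the network emits one class per frame, and decoding merges repeated classes and drops blanks.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring class for every time step.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]

    decoded = []
    previous = BLANK
    for index in best_path:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != BLANK:
            decoded.append(alphabet[index - 1])  # class k maps to alphabet[k - 1]
        previous = index
    return "".join(decoded)


# Toy input: classes are [blank, 'a', 'b']; the frames read "a a _ b b _ a".
scores = [
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
]
print(ctc_greedy_decode(scores, "ab"))  # prints "aba"
```

A blank between two identical classes keeps both characters, which is how doubled letters survive the merge step.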

## II. Installation

- ### 1. Environmental Dependence

  - PaddlePaddle >= 2.0.2

  - Python >= 3.6

  - PaddleOCR >= 2.0.1 | [How to install PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_ch/quickstart.md#1)

  - PaddleHub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

  - Paddle2Onnx >= 0.9.0 | [How to install paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md)

  - shapely

  - pyclipper

  - ```shell
    $ pip3.6 install "paddleocr==2.3.0.2"
    $ pip3.6 install shapely -i https://pypi.tuna.tsinghua.edu.cn/simple
    $ pip3.6 install pyclipper -i https://pypi.tuna.tsinghua.edu.cn/simple
    ```
  - **This Module depends on the third-party libraries shapely and pyclipper. Install both before using the Module.**
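
The minimum versions listed above can be checked before installing the Module. The following is a rough sketch, not part of PaddleHub: the helper names are hypothetical, it relies on `importlib.metadata` (Python 3.8 or later, which is newer than the Python 3.6 minimum above), and it assumes the pip distribution names shown in the table (GPU builds of PaddlePaddle are usually published under a different name).

```python
from importlib import metadata
from itertools import takewhile

# Minimum versions from the dependency list; None means "any version".
# The distribution names are assumptions about how the packages are published.
REQUIREMENTS = {
    "paddlepaddle": "2.0.2",
    "paddleocr": "2.0.1",
    "paddlehub": "2.0.0",
    "paddle2onnx": "0.9.0",
    "shapely": None,
    "pyclipper": None,
}


def version_tuple(version: str) -> tuple:
    """Turn '2.3.0.2' into (2, 3, 0, 2); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed: str, minimum: str) -> bool:
    """Compare versions numerically, padding with zeros so 2.3 equals 2.3.0."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_requirements(requirements: dict) -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if minimum and is_older(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements(REQUIREMENTS):
        print(problem)
```

The comparison is deliberately simple and ignores pre-release suffixes; a project that already depends on `packaging` could use its version parser instead.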

- ### 2. Installation

  - ```shell
    $ hub install arabic_ocr_db_crnn_mobile
    ```
  - In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux_Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API Prediction

- ### 1. Command line Prediction

  - ```shell
    $ hub run arabic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
    $ hub run arabic_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" --det True --rec True --use_angle_cls True --box_thresh 0.7 --angle_classification_thresh 0.8 --visualization True
    ```
  - This invokes the text recognition model from the command line. For more, see [PaddleHub command line instructions](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
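
The second command passes booleans as the strings `True` and `False`. The sketch below shows one way such flags can be parsed with `argparse`; it is illustrative only and is not taken from the Module's own `run_cmd`. The names `str2bool` and `build_parser` are hypothetical, and the defaults mirror the constructor defaults documented in this README. A plain `type=bool` would not work here, because any non-empty string, including `"False"`, is truthy.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse 'True'/'False'-style strings; plain bool('False') would be True."""
    lowered = value.lower()
    if lowered in ("true", "t", "1", "yes"):
        return True
    if lowered in ("false", "f", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that accepts the flags used in the commands above."""
    parser = argparse.ArgumentParser(prog="hub run arabic_ocr_db_crnn_mobile")
    parser.add_argument("--input_path", type=str, required=True)
    parser.add_argument("--det", type=str2bool, default=True)
    parser.add_argument("--rec", type=str2bool, default=True)
    parser.add_argument("--use_angle_cls", type=str2bool, default=False)
    parser.add_argument("--box_thresh", type=float, default=0.6)
    parser.add_argument("--angle_classification_thresh", type=float, default=0.9)
    parser.add_argument("--visualization", type=str2bool, default=False)
    return parser


args = build_parser().parse_args(
    ["--input_path", "/PATH/TO/IMAGE", "--use_angle_cls", "True", "--box_thresh", "0.7"]
)
print(args.use_angle_cls, args.box_thresh, args.visualization)  # True 0.7 False
```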

- ### 2. Prediction Code Example

  - ```python
    import paddlehub as hub
    import cv2

    ocr = hub.Module(name="arabic_ocr_db_crnn_mobile", enable_mkldnn=True)       # MKL-DNN acceleration only takes effect on CPU
    result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])

    # or
    # result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
    ```
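
- ### 3. API

  - ```python
    def __init__(self,
                 det=True,
                 rec=True,
                 use_angle_cls=False,
                 enable_mkldnn=False,
                 use_gpu=False,
                 box_thresh=0.6,
                 angle_classification_thresh=0.9)
    ```

    - Constructs the ArabicOCRDBCRNNMobile object

    - **Parameters**
      - det(bool): Whether to enable text detection. Defaults to True.
      - rec(bool): Whether to enable text recognition. Defaults to True.
      - use_angle_cls(bool): Whether to enable the orientation classifier, which is used to recognize text rotated by 180 degrees. Defaults to False.
      - enable_mkldnn(bool): Whether to enable MKL-DNN to accelerate CPU computation. This parameter only takes effect when running on CPU. Defaults to False.
      - use\_gpu (bool): Whether to use the GPU. Defaults to False. **If you use the GPU, set the CUDA_VISIBLE_DEVICES environment variable first.**
      - box\_thresh (float): Confidence threshold for detected text boxes. Defaults to 0.6.
      - angle_classification_thresh(float): Confidence threshold for text orientation classification. Defaults to 0.9.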
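
  - ```python
    def recognize_text(images=[],
                       paths=[],
                       output_dir='ocr_result',
                       visualization=False)
    ```

    - Prediction API: detects the position of all text in the input images and returns the recognized text.

    - **Parameters**

      - paths (list\[str\]): Image paths;
      - images (list\[numpy.ndarray\]): Image data; ndarray.shape is \[H, W, C\], in BGR format;
      - output\_dir (str): Directory where result images are saved. Defaults to ocr\_result;
      - visualization (bool): Whether to save the recognition results as image files. Only effective when detection is enabled. Defaults to False;

    - **Return**

      - res (list\[dict\]): List of recognition results. Each element is a dict with the fields:
        - data (list\[dict\]): Recognized text results. Each element is a dict with the fields:
          - text(str): The recognized text
          - confidence(float): Confidence of the recognized text
          - text_box_position(list): Pixel coordinates of the text box in the original image, a 4*2 matrix giving the lower-left, lower-right, upper-right, and upper-left vertices in that order. If nothing is recognized, data is \[\]
          - orientation(str): The classified orientation. Output only when orientation classification alone is enabled
          - score(float): The classification score. Output only when orientation classification alone is enabled
        - save_path (str, optional): Path where the result image is saved; save_path is '' if no image is saved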
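
The return structure described above can be traversed as in the following sketch. It runs on a hand-written stand-in for the return value, not on real model output, so the sample strings, coordinates, and confidences are made up. The helper name `extract_lines` and its `min_confidence` filter are hypothetical additions for illustration and are not part of the Module's API.

```python
from typing import Dict, List


def extract_lines(results: List[Dict], min_confidence: float = 0.5) -> List[List[str]]:
    """Collect the recognized strings per image, keeping confident ones only."""
    lines_per_image = []
    for image_result in results:  # one dict per input image
        kept = [
            item["text"]
            for item in image_result.get("data", [])  # data is [] when nothing was recognized
            if item.get("confidence", 0.0) >= min_confidence
        ]
        lines_per_image.append(kept)
    return lines_per_image


# Hand-written stand-in shaped like the documented return value.
mock_results = [
    {
        "save_path": "",
        "data": [
            {
                "text": "first line",
                "confidence": 0.97,
                "text_box_position": [[10, 40], [120, 40], [120, 10], [10, 10]],
            },
            {
                "text": "noise",
                "confidence": 0.21,
                "text_box_position": [[10, 90], [60, 90], [60, 60], [10, 60]],
            },
        ],
    },
    {"save_path": "", "data": []},  # an image with no recognized text
]

print(extract_lines(mock_results))  # [['first line'], []]
```

With real output, `results` would be the value returned by `recognize_text`, assuming detection and recognition are both enabled so that each entry carries `text` and `confidence`.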

## IV. Server Deployment

- PaddleHub Serving can deploy an online text recognition service.

- ### Step 1: Start PaddleHub Serving

  - Run the startup command:
  - ```shell
    $ hub serving start -m arabic_ocr_db_crnn_mobile
    ```

  - This deploys the text recognition service API. The default port is 8866.

  - **NOTE:** If the GPU is used for prediction, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.

- ### Step 2: Send a prediction request

  - With the server configured, the following few lines of code send a prediction request and fetch the result

  - ```python
    import requests
    import json
    import cv2
    import base64

    def cv2_to_base64(image):
        # Encode the BGR image as JPEG, then base64-encode the raw bytes
        data = cv2.imencode('.jpg', image)[1]
        # tobytes() replaces the deprecated ndarray.tostring()
        return base64.b64encode(data.tobytes()).decode('utf8')

    # Send the HTTP request
    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/arabic_ocr_db_crnn_mobile"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction results
    print(r.json()["results"])
    ```
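
The request body is a JSON object whose `images` field holds base64 strings. The sketch below builds the same body with the standard library only, reading already-encoded image files instead of decoding and re-encoding them with OpenCV. It rests on an assumption: that the server-side `base64_to_cv2` helper decodes whatever encoded image bytes it receives, so JPEG or PNG file contents can be sent as they are. The function names are hypothetical.

```python
import base64
import json
from typing import Iterable


def read_image_bytes(path: str) -> bytes:
    """Read an encoded image file (e.g. JPEG or PNG) without decoding it."""
    with open(path, "rb") as handle:
        return handle.read()


def build_payload(encoded_images: Iterable[bytes]) -> str:
    """Build the JSON request body from encoded image bytes."""
    images = [base64.b64encode(blob).decode("utf8") for blob in encoded_images]
    return json.dumps({"images": images})


# Placeholder bytes stand in for read_image_bytes("/PATH/TO/IMAGE") here.
body = build_payload([b"\xff\xd8 placeholder, not a real JPEG \xff\xd9"])
print(sorted(json.loads(body).keys()))  # ['images']
```

The resulting string can be posted as the `data` argument of the request shown above, with the same `Content-type` header.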

Diff for: modules/image/text_recognition/arabic_ocr_db_crnn_mobile/__init__.py

Whitespace-only changes.
@@ -0,0 +1,87 @@
import paddlehub as hub
from paddleocr.ppocr.utils.logging import get_logger
from paddleocr.tools.infer.utility import base64_to_cv2
from paddlehub.module.module import moduleinfo, runnable, serving


@moduleinfo(
    name="arabic_ocr_db_crnn_mobile",
    version="1.1.0",
    summary="ocr service",
    author="PaddlePaddle",
    type="cv/text_recognition")
class ArabicOCRDBCRNNMobile:
    def __init__(self,
                 det=True,
                 rec=True,
                 use_angle_cls=False,
                 enable_mkldnn=False,
                 use_gpu=False,
                 box_thresh=0.6,
                 angle_classification_thresh=0.9):
        """
        initialize with the necessary elements
        Args:
            det(bool): Whether to use text detector.
            rec(bool): Whether to use text recognizer.
            use_angle_cls(bool): Whether to use text orientation classifier.
            enable_mkldnn(bool): Whether to enable mkldnn.
            use_gpu (bool): Whether to use gpu.
            box_thresh(float): the threshold of the detected text box's confidence
            angle_classification_thresh(float): the threshold of the angle classification confidence
        """
        self.logger = get_logger()
        self.model = hub.Module(
            name="multi_languages_ocr_db_crnn",
            lang="arabic",
            det=det,
            rec=rec,
            use_angle_cls=use_angle_cls,
            enable_mkldnn=enable_mkldnn,
            use_gpu=use_gpu,
            box_thresh=box_thresh,
            angle_classification_thresh=angle_classification_thresh)
        self.model.name = self.name

    def recognize_text(self, images=[], paths=[], output_dir='ocr_result', visualization=False):
        """
        Get the text in the predicted images.
        Args:
            images (list(numpy.ndarray)): images data, shape of each is [H, W, C]. If images not paths
            paths (list[str]): The paths of images. If paths not images
            output_dir (str): The directory to store output images.
            visualization (bool): Whether to save image or not.
        Returns:
            res (list): The result of text detection box and save path of images.
        """
        all_results = self.model.recognize_text(
            images=images, paths=paths, output_dir=output_dir, visualization=visualization)
        return all_results

    @serving
    def serving_method(self, images, **kwargs):
        """
        Run as a service.
        """
        images_decode = [base64_to_cv2(image) for image in images]
        results = self.recognize_text(images_decode, **kwargs)
        return results

    @runnable
    def run_cmd(self, argvs):
        """
        Run as a command
        """
        results = self.model.run_cmd(argvs)
        return results

    def export_onnx_model(self, dirname: str, input_shape_dict=None, opset_version=10):
        '''
        Export the model to ONNX format.

        Args:
            dirname(str): The directory to save the onnx model.
            input_shape_dict: dictionary ``{ input_name: input_value }, eg. {'x': [-1, 3, -1, -1]}``
            opset_version(int): operator set
        '''
        self.model.export_onnx_model(dirname=dirname, input_shape_dict=input_shape_dict, opset_version=opset_version)
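
One detail of the module above is worth a note: `recognize_text` declares `images=[]` and `paths=[]` as defaults. Python evaluates default values once, at function definition, so a mutable default is shared by every call that omits the argument. The committed method only forwards these lists to the wrapped module, so nothing in the code shown here depends on that sharing. The sketch below illustrates the general pitfall and the common `None`-default idiom; both function names are hypothetical and unrelated to PaddleHub.

```python
from typing import List, Optional


def collect_shared(path: str, paths: List[str] = []) -> List[str]:
    """Pitfall: the default list is created once and reused by every call."""
    paths.append(path)
    return paths


def collect_fresh(path: str, paths: Optional[List[str]] = None) -> List[str]:
    """Idiom: use None as the default and create a new list per call."""
    if paths is None:
        paths = []
    paths.append(path)
    return paths


collect_shared("a.jpg")
print(len(collect_shared("b.jpg")))  # 2: the first call's entry is still there

collect_fresh("a.jpg")
print(len(collect_fresh("b.jpg")))  # 1: each call starts from an empty list
```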
@@ -0,0 +1,4 @@
paddleocr>=2.3.0.2
paddle2onnx>=0.9.0
shapely
pyclipper
