😋 Optical Character Recognition (OCR)

Check the CHANGELOG file for an overview of the latest modifications ! 😋

Project structure

├── custom_architectures
│   ├── crnn_arch.py        : defines the CRNN main architecture for OCR (with CTC decoding)
│   ├── generation_utils.py : inference methods for CRNN with attention model *
│   ├── east_arch.py        : defines EAST text detector architecture
│   └── yolo_arch.py        : defines the YOLOv2 architecture
├── custom_layers
├── custom_train_objects
├── loggers
├── models
│   ├── detection           : used to detect texts in images (with the EAST detector)
│   ├── ocr
│   │   ├── base_ocr.py     : abstract class for OCR models
│   │   └── crnn.py         : main CRNN class (OCR)
├── pretrained_models
│   └── yolo_backend        : directory where to save the yolo_backend weights
├── unitests
├── utils
├── example_crnn.ipynb
└── ocr.ipynb

* This architecture is still experimental. Pretrained models / examples will be provided in the next update

Check the main project for more information about the modules that are not expanded above, the overall structure, and the main classes.

Check the detection project for more information about the detection module and the EAST Scene-Text Detection model.

Available features

OCR (module models.ocr) :

Feature	Function / class	Description
OCR	`ocr`	Performs OCR on the given image(s)

You can check the ocr notebook for a concrete demonstration

Available models

Model architectures

Available architectures :

detection :
- EAST
OCR :
- CRNN
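
The project structure above mentions that the CRNN architecture is used with CTC decoding. As a rough illustration of what that step does, here is a minimal greedy CTC decoder in NumPy. It is a sketch, not the code from `crnn_arch.py`: the function name, the `vocab` layout, and the choice of index 0 for the blank token are all assumptions made for this example.

```python
import numpy as np

def ctc_greedy_decode(scores, vocab, blank_index=0):
    """Greedy CTC decoding for a single sequence (illustrative sketch).

    scores: array of shape (time_steps, n_classes) with per-step class scores.
    vocab:  sequence indexed by class id; the entry at `blank_index` is unused.
    blank_index: id of the CTC blank token (assumed to be 0 here).
    """
    best = np.argmax(scores, axis=-1)  # most likely class at each time step
    decoded = []
    previous = None
    for idx in best:
        idx = int(idx)
        # Keep a symbol only if it differs from the previous step and is not blank.
        if idx != previous and idx != blank_index:
            decoded.append(vocab[idx])
        previous = idx
    return "".join(decoded)

# One-hot scores following the path: a a <blank> a b b <blank>
vocab = ["", "a", "b"]
path = [1, 1, 0, 1, 2, 2, 0]
scores = np.eye(len(vocab))[path]
print(ctc_greedy_decode(scores, vocab))
```

Repeated symbols are collapsed first and blanks are dropped afterwards, which is why the blank between the two `a` runs lets both survive.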

Model weights

Classes	Dataset	Architecture	Trainer	Weights

Models must be unzipped in the pretrained_models/ directory !

The pretrained CRNN models come from the EasyOCR library. Weights are automatically downloaded given the language or the model name, and converted to keras ! The easyocr package is therefore not required, but pytorch is required to load the weights before conversion.
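
To give an idea of what converting PyTorch weights to Keras involves, the sketch below shows the axis reordering that is typically needed for convolution and dense kernels. It is not the conversion code of this repository: the helper names are hypothetical, and plain NumPy arrays stand in for weights that would normally be loaded with PyTorch.

```python
import numpy as np

def torch_conv2d_to_keras(kernel):
    """Reorder a conv kernel from the PyTorch layout to the Keras layout.

    PyTorch Conv2d: (out_channels, in_channels, kernel_h, kernel_w)
    Keras Conv2D:   (kernel_h, kernel_w, in_channels, out_channels)
    """
    return np.transpose(kernel, (2, 3, 1, 0))

def torch_linear_to_keras(weight):
    """Reorder a dense kernel: (out_features, in_features) -> (in_features, out_features)."""
    return np.transpose(weight, (1, 0))

# Dummy arrays standing in for weights read from a PyTorch checkpoint.
conv = np.zeros((64, 3, 5, 7), dtype=np.float32)
dense = np.zeros((10, 256), dtype=np.float32)

print(torch_conv2d_to_keras(conv).shape)   # layout is now (kernel_h, kernel_w, in, out)
print(torch_linear_to_keras(dense).shape)  # layout is now (in_features, out_features)
```

Biases are one-dimensional in both frameworks and need no reordering. Recurrent layers use different gate orderings and bias conventions, so they need more care than this sketch covers.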

The pretrained version of EAST can be downloaded from this project. It should be placed in pretrained_models/pretrained_weights/east_vgg16.pth (torch is required to convert the weights : pip install torch).
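
A quick way to verify that the EAST weights are where the project expects them is a small path check like the one below. The helper name `find_east_weights` is hypothetical and only exists for this example; the path itself is the one given above.

```python
from pathlib import Path

# Relative location stated in this README for the pretrained EAST weights.
EAST_WEIGHTS = Path("pretrained_models") / "pretrained_weights" / "east_vgg16.pth"

def find_east_weights(repo_root="."):
    """Return the weights path if the file exists under `repo_root`, else None."""
    path = Path(repo_root) / EAST_WEIGHTS
    return path if path.is_file() else None

if __name__ == "__main__":
    found = find_east_weights()
    print(found if found else f"Missing: expected weights at {EAST_WEIGHTS}")
```

Run it from the root of the repository, or pass the repository path as `repo_root`.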

Installation and usage

Clone this repository : git clone https://github.com/yui-mhcp/ocr.git
Go to the root of this repository : cd ocr
Install requirements : pip install -r requirements.txt
Open ocr notebook and follow the instructions !

TO-DO list :

Make the TO-DO list
Convert the CRNN architecture / weights from the easyocr library to tensorflow
Convert the CRNN + attention architecture from this repo to tensorflow
Add examples to initialize pretrained models (both EAST and CRNN)
Add an example to perform OCR on image (with text detection)
Add an example to perform OCR on camera
Allow to combine texts in lines / paragraphs (as EAST detects individual words)
Take into account the text rotation in the combination procedure
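
To make the "combine texts in lines" item more concrete, here is a minimal sketch of one possible approach: grouping word boxes whose vertical extents overlap, then sorting each group from left to right. It is only an illustration, not the implementation used in this project. The box format `(x_min, y_min, x_max, y_max, text)`, the function name, and the `min_overlap` threshold are assumptions, and text rotation is ignored.

```python
def group_words_into_lines(boxes, min_overlap=0.5):
    """Group axis-aligned word boxes into text lines (illustrative sketch).

    boxes: list of (x_min, y_min, x_max, y_max, text) tuples, one per word.
    min_overlap: fraction of the smaller height that must overlap vertically
                 for a word to join an existing line.
    """
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # process words top to bottom
        _, y0, _, y1, _ = box
        for line in lines:
            top, bottom = line["y_min"], line["y_max"]
            overlap = min(y1, bottom) - max(y0, top)
            if overlap >= min_overlap * min(y1 - y0, bottom - top):
                line["words"].append(box)
                line["y_min"] = min(top, y0)
                line["y_max"] = max(bottom, y1)
                break
        else:
            # No existing line overlaps enough: start a new one.
            lines.append({"y_min": y0, "y_max": y1, "words": [box]})
    # Within each line, read the words from left to right.
    return [
        " ".join(word[4] for word in sorted(line["words"], key=lambda b: b[0]))
        for line in lines
    ]

words = [
    (60, 10, 100, 30, "world"),
    (0, 12, 50, 32, "hello"),
    (0, 50, 40, 70, "second"),
    (50, 52, 90, 71, "line"),
]
print(group_words_into_lines(words))  # ['hello world', 'second line']
```

Because each line's vertical extent grows as words are added, strongly slanted text can merge neighbouring lines, which is one reason rotation has to be handled separately.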

Contacts and licence

Contacts :

Mail : [email protected]
Discord : yui0732

Terms of use

The goal of these projects is to support and advance education and research in Deep Learning technology. To facilitate this, all associated code is made available under the GNU Affero General Public License (AGPL) v3, supplemented by a clause that prohibits commercial use (cf the LICENCE file).

These projects are released as "free software", allowing you to freely use, modify, deploy, and share the software, provided you adhere to the terms of the license. While the software is freely available, it is not public domain and retains copyright protection. The license conditions are designed to ensure that every user can utilize and modify any version of the code for their own educational and research projects.

If you wish to use this project in a proprietary commercial endeavor, you must obtain a separate license. For further details on this process, please contact me directly.

For my protection, it is important to note that all projects are available on an "As Is" basis, without any warranties or conditions of any kind, either explicit or implied. However, do not hesitate to report issues on the project's repository, or open a Pull Request to fix them 😄

Citation

If you find this project useful in your work, please add this citation to give it more visibility ! 😋

@misc{yui-mhcp,
    author  = {yui},
    title   = {A Deep Learning projects centralization},
    year    = {2021},
    publisher   = {GitHub},
    howpublished    = {\url{https://github.com/yui-mhcp}}
}

Notes and references

The code for the CRNN architecture is heavily inspired by the easyocr repo :

EasyOCR library : official repo of the easyocr library

The code for the EAST part of this project is heavily inspired by this repo :

SakuraRiven pytorch implementation : pytorch implementation of the EAST paper.

Papers and tutorials :

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition : the original CRNN paper
What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis : a great benchmark of OCR models + an open-source repository with pretrained models and datasets
U-Net: Convolutional Networks for Biomedical Image Segmentation : U-net original paper
EAST: An Efficient and Accurate Scene Text Detector : text detection (with possibly rotated bounding-boxes) with a segmentation model (U-Net).

Datasets :

COCO Text dataset : an extension of COCO for text detection
