Commit 198a33c

committed
Update Readme
1 parent 077e54c commit 198a33c

1 file changed

+42
-1
lines changed

README.md

+42-1
@@ -1,2 +1,43 @@
11
# saram - Image/PDF OCR conversion
2-
Get OCR in txt form from an image or pdf extension supporting multiple files from directory using pytesseract
2+
Extract OCR text as `.txt` files from images or PDFs, with support for multiple files in a directory, using `pytesseract`. Input with the wrong orientation can be rotated.
3+
4+
**Currently in alpha state**
5+
6+
[![Srijana features](https://i.imgur.com/FDGpiwp.gif)](https://youtu.be/Cpj3XVdsK_g)
7+
8+
**Note:**
9+
Make sure you have an OCR tool such as `tesseract` and its language data for recognition, e.g. `tesseract-data-eng`, along with `Pillow` and `Wand` for image conversion and loading.
10+
11+
## Installation
12+
13+
Clone the source locally:
14+
```
15+
$ git clone https://github.com/aryaminus/saram
16+
$ cd saram
17+
$ git checkout py-module
18+
$ python main.py <dirname>
19+
```
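
A rough sketch of what running `python main.py <dirname>` over a directory of images could look like. This is not the actual content of `main.py`: the names `ocr_directory`, `tesseract_ocr` and `IMAGE_EXTENSIONS` are hypothetical, the extension set is an assumption, and rotation handling is left out. `pytesseract` and `Pillow` are imported lazily so the directory logic can be tried with any stand-in OCR function.

```python
import os

# Assumed set of image extensions; the project may accept others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def tesseract_ocr(path):
    """OCR a single image file with pytesseract (needs tesseract installed)."""
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)


def ocr_directory(dirname, ocr_fn=tesseract_ocr):
    """Write <name>.txt next to every image in dirname; return the written paths."""
    written = []
    for name in sorted(os.listdir(dirname)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTENSIONS:
            continue  # skip anything that is not a known image type
        text = ocr_fn(os.path.join(dirname, name))
        out_path = os.path.join(dirname, stem + ".txt")
        with open(out_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        written.append(out_path)
    return written
```

Calling `ocr_directory("scans")` would then leave one `.txt` file beside each image in `scans`.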
20+
21+
## Todo
22+
- [x] Add support for PDF by PDF -> image -> txt with converted image deletion after processing
23+
- [x] Double check for orientation in case of image and PDF
24+
- [ ] Add NLP to process the most frequently repeated characters to filter content
25+
- [ ] Add Cloud Vision support for effective character recognition
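
The PDF -> image -> txt step with image cleanup from the first item could be sketched as follows. This is an illustration under assumptions, not the project's code: `pdf_to_text` and `wand_pdf_to_images` are hypothetical names, the 300 dpi resolution and PNG format are arbitrary choices, and the OCR function is passed in by the caller. The temporary directory takes care of deleting the converted images.

```python
import os
import tempfile


def wand_pdf_to_images(pdf_path, out_dir):
    """Render each PDF page to a PNG with Wand (lazy import; needs ImageMagick)."""
    from wand.image import Image as WandImage

    paths = []
    with WandImage(filename=pdf_path, resolution=300) as pdf:
        for index, page in enumerate(pdf.sequence):
            with WandImage(image=page) as single:
                single.format = "png"
                path = os.path.join(out_dir, "page-%d.png" % index)
                single.save(filename=path)
                paths.append(path)
    return paths


def pdf_to_text(pdf_path, ocr_fn, render_fn=wand_pdf_to_images):
    """PDF -> page images -> text; the page images are deleted afterwards."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        page_paths = render_fn(pdf_path, tmp_dir)
        pages = [ocr_fn(path) for path in page_paths]
    # tmp_dir and every rendered image are removed once the with-block exits
    return "\n".join(pages)
```

Any function that maps an image path to a string can serve as `ocr_fn`, for example a thin wrapper around `pytesseract.image_to_string`.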
26+
27+
## Reference
28+
1. <a href="https://github.com/lucab85/PDFtoTXT" target="_blank">PDFtoTXT</a>
29+
2. <a href="https://github.com/prabhakar267/ocr-convert-image-to-text" target="_blank">ocr-convert-image-to-text</a>
30+
3. <a href="https://pastebin.com/QFMpp28T" target="_blank">Fix-image-rotation</a>
31+
32+
33+
-----------------------------------------------------------------------------------------------------------
34+
35+
## Contributing
36+
37+
1. Fork it (<https://github.com/aryaminus/saram/fork>)
38+
2. Create your feature branch (`git checkout -b feature/fooBar`)
39+
3. Commit your changes (`git commit -am 'Add some fooBar'`)
40+
4. Push to the branch (`git push origin feature/fooBar`)
41+
5. Create a new Pull Request
42+
43+
**Enjoy!**
