Text Grabber

Extract text from PDFs and images using Tesseract. Tesseract is an optical character recognition (OCR) engine available for various operating systems. It is free software, released under the Apache License, Version 2.0.

Packages

  • Django
  • Tesseract
  • pdf2image
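
To illustrate how these packages could fit together, here is a minimal sketch of the OCR step. It is not taken from this repository: it assumes the `pytesseract` wrapper and Pillow are installed alongside `pdf2image`, and the names `classify_input`, `extract_text`, and `IMAGE_SUFFIXES` are hypothetical.

```python
from pathlib import Path

# Hypothetical set of image types to accept; adjust to what the app supports.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def classify_input(path):
    """Return "pdf" or "image" based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def extract_text(path, lang="eng"):
    """Run Tesseract OCR over an image, or over every page of a PDF."""
    # Imported lazily so classify_input works without the OCR dependencies.
    # Assumption: pytesseract and Pillow are available in the environment.
    import pytesseract
    from PIL import Image

    if classify_input(path) == "pdf":
        from pdf2image import convert_from_path

        # pdf2image renders each PDF page to a PIL image (requires Poppler).
        pages = convert_from_path(str(path))
    else:
        pages = [Image.open(path)]

    # OCR each page and join the results with a newline between pages.
    return "\n".join(pytesseract.image_to_string(page, lang=lang) for page in pages)
```

In a Django view, a function like `extract_text` would typically be called on the uploaded file after saving it to a temporary path.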

Usage

Prerequisites:

virtualenv venv

source venv/bin/activate

cd text_grabber

pip install -r requirements.txt
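
Python OCR wrappers generally call the Tesseract engine as an external program, so `pip install` alone may not be enough. Assuming the app relies on a `tesseract` executable on the `PATH` (this README does not say), a quick check could look like the sketch below; `tesseract_version` is a hypothetical helper, not part of the project.

```python
import shutil
import subprocess


def tesseract_version(binary="tesseract"):
    """Return the first line of `tesseract --version`, or None if not found."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract releases print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version is not None else "tesseract not found on PATH")
```

If the check reports that Tesseract is missing, install it with your operating system's package manager before starting the app.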

Run the Django app:

python3 manage.py runserver 0.0.0.0:8000