Translate scanned/ camera-captured documents to overcome language barriers.

kapitsa2811/OCRTranslator

OCR Translator (Linux OS)

Keywords: OCR, Tesseract-OCR, Google Translate, Shell Script, Linux

1. Introduction: OCR Translator

Immigrants often struggle to understand letters they receive by mail in a foreign language. OCR Translator aims to overcome this language barrier by using Tesseract-OCR to extract the text and Google Translate to translate it.

2. Workflow

Note: the preferred input is a document scanned with a flatbed scanner. Camera-based functionality will be added in future releases.

3. Config

  1. Install Tesseract OCR. At the time of writing, tesseract 4.0.0-beta.1 was used as the OCR engine.

  2. Install the dependencies (using a conda environment):

    # navigate to ./anaconda
    cd ./anaconda
    conda env create --file environment.yml

    # activate OCR_Translator_env
    # (conda 4.4 and later prefer: conda activate OCR_Translator_env)
    source activate OCR_Translator_env
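
Before running the two setup steps above, it can help to confirm that the required tools are on the `PATH`. The following is a minimal sketch, not part of this repository; the function name `require_cmd` is hypothetical and chosen only for illustration.

```shell
# Hypothetical pre-flight check (not part of the repository): succeed if the
# named command-line tool is available on PATH, otherwise print a message.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required command: $1" >&2
    return 1
}
```

For example, `require_cmd tesseract && require_cmd conda` succeeds only if both tools are installed, and `tesseract --version` then shows which Tesseract release is in use.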

Notes:

  • currently supported file types: PDF, PNG
  • one page only (multi-page PDFs won't work)
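
The file-type restriction above can be checked before a document is handed to the OCR step. This is a minimal sketch under stated assumptions, not code from this repository: the function name `is_supported_input` is hypothetical, and it only inspects the file extension, so it does not detect multi-page PDFs.

```shell
# Hypothetical helper (not part of the repository): succeed only for the
# file types listed as supported (PDF and PNG), judged by file extension.
is_supported_input() {
    local file="$1"
    local ext

    # A name without any dot has no extension and is rejected.
    case "$file" in
        *.*) ;;
        *)   return 1 ;;
    esac

    # Lowercase the extension so "SCAN.PNG" and "scan.png" are treated alike.
    ext="$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')"

    case "$ext" in
        pdf|png) return 0 ;;
        *)       return 1 ;;
    esac
}
```

A call such as `is_supported_input "letter.pdf" || echo "unsupported file type"` would let a wrapper script stop early instead of failing inside the OCR step.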

License

OCR_Translator_license
