Simple-OCR

Simple-OCR provides a more convenient way of reading PDFs and images using the Tesseract engine.

Installation Instructions

Install Tesseract.
Install ImageMagick.
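
Before using the gem, it can help to confirm that both tools are actually reachable from Ruby. The following is a minimal sketch, not part of Simple-OCR. It assumes the binaries are named `tesseract` and `convert` (ImageMagick 7 may install `magick` instead), and the helper names `find_executable` and `missing_tools` are hypothetical.

```ruby
# Return the full path of an executable found on PATH, or nil if it is missing.
def find_executable(name, path_env = ENV.fetch("PATH", ""))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Assumed binary names: "tesseract" for Tesseract, "convert" for ImageMagick.
# Returns the names that could not be found on PATH.
def missing_tools(names = %w[tesseract convert], path_env = ENV.fetch("PATH", ""))
  names.reject { |name| find_executable(name, path_env) }
end

missing = missing_tools
if missing.empty?
  puts "All required tools were found."
else
  puts "Missing tools: #{missing.join(', ')}"
end
```

If a tool is reported as missing, install it with your platform's package manager and make sure its directory is on `PATH`.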

Example Usage

It's very simple to use Simple-OCR:

# Specify the path of your source image or PDF.
img = OCR::Image.new("source.png")

# Specify the output file name, called "destination" here.
img.scan("destination", "-l eng", :pdf)

You can also pass custom Tesseract command-line options as the second argument.

img.scan("destination", "-l eng -psm 1...", :pdf)

The third argument specifies the output file type, which can be one of:

pdf
txt
hocr

img.scan("destination", "-l eng", :txt)
img.scan("destination", "-l eng", :hocr)
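
To make the three arguments easier to reason about, here is a rough sketch of how they might map onto a Tesseract command line. This is an assumption for illustration only: the README does not describe how Simple-OCR invokes Tesseract internally, and `tesseract_command` is a hypothetical helper, not part of the gem. It relies on the Tesseract CLI form `tesseract input outputbase [options] [configfile]`, where `pdf` and `hocr` are config names; whether a `txt` config is available depends on the Tesseract version.

```ruby
require "shellwords"

# Output types listed in this README.
FORMATS = %i[pdf txt hocr].freeze

# Hypothetical helper (not Simple-OCR's API): build a Tesseract command string
# from a source path, an output base name, raw options, and an output type.
def tesseract_command(source, destination, options, format)
  raise ArgumentError, "unsupported format: #{format.inspect}" unless FORMATS.include?(format)

  [
    "tesseract",
    Shellwords.escape(source),       # escape paths that contain spaces etc.
    Shellwords.escape(destination),
    options,                         # passed through verbatim, e.g. "-l eng"
    format.to_s                      # assumed to be used as the config name
  ].reject(&:empty?).join(" ")
end

puts tesseract_command("source.png", "destination", "-l eng", :pdf)
# prints: tesseract source.png destination -l eng pdf
```

The options string is passed through unescaped on purpose, since it is expected to contain several space-separated flags. Only pass option strings you trust.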

About

Simple-OCR is maintained and funded by Skcript. The names and logos for Skcript are the property of Skcript.

We love open source, and we contribute regularly to the community. Take a look at our contributions here. We also encourage the people around us to get involved in community operations. Join us if you'd like to see the world change from our HQ.
