This repository has been archived by the owner on Dec 16, 2023. It is now read-only.

Simple-OCR

Simple-OCR provides a more convenient way of reading PDFs and images using the Tesseract engine.

Installation Instructions

  1. Install Tesseract.
  2. Install ImageMagick.

Example Usage

It's very simple to use Simple-OCR:

# Specify the path of your source image or PDF.
img = OCR::Image.new("source.png")

# Specify the output file name, called "destination" here.
img.scan("destination", "-l eng", :pdf)

You can also pass custom command-line options in the second argument.

img.scan("destination", "-l eng -psm 1...", :pdf)

It is also possible to specify the output file type, which can be one of:

  • pdf
  • txt
  • hocr

img.scan("destination", "-l eng", :txt)
img.scan("destination", "-l eng", :hocr)
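To make the three arguments of `scan` more concrete, here is a minimal sketch of how such a call could map onto a Tesseract command line of the form `tesseract imagename outputbase [options...] [configfile...]`. The helper `tesseract_command` and the constant `OUTPUT_TYPES` are hypothetical names introduced for illustration; this is not Simple-OCR's actual implementation, which may build its command differently (for example, by converting the input with ImageMagick first).

```ruby
require "shellwords"

# Hypothetical helper, NOT part of Simple-OCR: shows one plausible way the
# arguments of img.scan(destination, options, type) could be turned into a
# Tesseract invocation.
OUTPUT_TYPES = %i[pdf txt hocr].freeze

def tesseract_command(source, destination, options, type)
  # Reject anything other than the three output types listed above.
  unless OUTPUT_TYPES.include?(type)
    raise ArgumentError, "unsupported output type: #{type.inspect}"
  end

  # Split the option string the way a shell would, then append the output
  # type as the trailing Tesseract config name.
  ["tesseract", source, destination, *Shellwords.split(options), type.to_s]
end

puts tesseract_command("source.png", "destination", "-l eng", :pdf).join(" ")
# prints "tesseract source.png destination -l eng pdf"
```

Returning the command as an array rather than a single string means it can be passed to `system(*command)` without further shell quoting.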

About

Skcript

Simple-OCR is maintained and funded by Skcript. The names and logos for Skcript are the property of Skcript.

We love open source and have contributed quite a bit to the community. Take a look at our contributions here. We also encourage people around us to get involved in community operations. Join us if you'd like to see the world change from our HQ.