
Getting Started With OCR4all

As the name suggests, one of the main goals of OCR4all is to enable any user to independently perform OCR on a wide variety of historical printings and obtain high-quality results with reasonable time expenditure. OCR4all is therefore explicitly designed to be usable even by users with no technical background.

This repository contains two guides. The first covers the installation process; the second gives a brief overview of the functionality, using two historical books (also available in this repository) as hands-on examples.

To get started, we recommend downloading the entire repository ("Clone or download" -> "Download ZIP"), installing OCR4all by following the setup guide, and then getting some hands-on experience by working through the examples covered in the short user guide.

Both guides will be continuously improved and refined. Therefore, user feedback is always welcome.

Mailing List

OCR4all is under active development, so frequent releases containing bug fixes and new functionality can be expected. To stay up to date, we highly recommend subscribing to our mailing list, where we announce notable enhancements.

Current Developments

Plans for the (very) near future:

  • Enabling a second project management approach based solely on PageXML, allowing for a more flexible workflow.
  • Integrating Tesseract for recognition.
  • Many minor bug fixes and improvements.
