tabulator

Two methods for extracting data from tables contained in images. Both methods attempt to preserve the table structure by using Tesseract-generated bounding boxes. Method 1 is a preliminary approach that works only for tables with a clearly defined structure and single-line headers. Method 2 may work for any table structure, but it depends entirely on the OCR output being accurate.
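To illustrate the general idea of recovering table structure from word-level bounding boxes, here is a minimal sketch. It is not the code of Method 1 or Method 2; it only assumes that each recognized word comes with `left`, `top`, `width`, and `height` values, as in Tesseract's TSV output. The names `WordBox`, `group_into_rows`, `split_into_cells`, and `boxes_to_table`, as well as the `row_tolerance` and `gap` thresholds, are hypothetical and chosen for this example.

```python
from typing import List, NamedTuple


class WordBox(NamedTuple):
    """One OCR word and its bounding box in pixels (hypothetical container)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def group_into_rows(boxes: List[WordBox], row_tolerance: int = 10) -> List[List[WordBox]]:
    """Cluster words into rows by vertical centre, then sort each row left to right."""
    rows = []
    for box in sorted(boxes, key=lambda b: b.top + b.height / 2):
        centre = box.top + box.height / 2
        # A word joins the current row if its centre is close to that row's first word.
        if rows and abs(centre - rows[-1]["centre"]) <= row_tolerance:
            rows[-1]["boxes"].append(box)
        else:
            rows.append({"centre": centre, "boxes": [box]})
    return [sorted(r["boxes"], key=lambda b: b.left) for r in rows]


def split_into_cells(row: List[WordBox], gap: int = 25) -> List[str]:
    """Start a new cell whenever the horizontal gap between adjacent words exceeds `gap`."""
    cells = [[row[0].text]]
    for prev, cur in zip(row, row[1:]):
        if cur.left - (prev.left + prev.width) > gap:
            cells.append([cur.text])
        else:
            cells[-1].append(cur.text)
    return [" ".join(words) for words in cells]


def boxes_to_table(boxes: List[WordBox], row_tolerance: int = 10, gap: int = 25) -> List[List[str]]:
    """Turn a flat list of word boxes into a list of rows, each a list of cell strings."""
    return [split_into_cells(row, gap) for row in group_into_rows(boxes, row_tolerance)]


# Made-up sample input: a two-column table with a two-word header cell.
sample = [
    WordBox("Apple", 10, 40, 55, 12),
    WordBox("Price", 195, 11, 45, 12),
    WordBox("Name", 10, 10, 50, 12),
    WordBox("1.20", 150, 41, 35, 12),
    WordBox("Unit", 150, 11, 40, 12),
]

table = boxes_to_table(sample)
```

The fixed pixel thresholds are a simplification. A sketch like this breaks down when rows are skewed or when the OCR misplaces or drops a box, which is consistent with the dependence on OCR accuracy noted above.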