ALTO supports a `@BASELINE` attribute that can define a polyline on which the text rests. hOCR also supports this information. These values could be used to estimate the font size and text position more accurately when rendering the SVG.
Unfortunately I don't have access to any samples of OCR data with this information at the moment.
The test fixtures now include an hOCR file (generated by Tesseract 4) that has baseline information. Since both hOCR and ALTO define baselines as polynomials, an hOCR-based implementation should (hopefully) carry over to ALTO with minimal modifications.
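As a starting point for discussion, here is a minimal sketch of how the hOCR baseline could be evaluated. It assumes the `baseline` property in a line's `title` attribute holds polynomial coefficients from highest to lowest degree (Tesseract emits two: slope and offset), taken relative to the bottom-left corner of the line's `bbox` in image coordinates. The function names `parse_hocr_title` and `baseline_y` are hypothetical and not part of this project.

```python
def parse_hocr_title(title):
    """Split an hOCR title attribute into {property_name: [float, ...]}.

    Properties whose values are not purely numeric (e.g. `image "x.png"`)
    are skipped, since only bbox and baseline are needed here.
    """
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        name, values = tokens[0], tokens[1:]
        try:
            props[name] = [float(v) for v in values]
        except ValueError:
            continue
    return props


def baseline_y(title, x):
    """Return the absolute y coordinate of the baseline at absolute x.

    Assumption: the polynomial's origin is the bottom-left corner of the
    bbox, so the result is bbox bottom + p(x - bbox left).
    """
    props = parse_hocr_title(title)
    x0, _y0, _x1, y1 = props["bbox"]
    # Fall back to a flat baseline on the bbox bottom if none is given.
    coeffs = props.get("baseline", [0.0])
    dx = x - x0
    dy = 0.0
    for c in coeffs:  # Horner's scheme, highest degree first
        dy = dy * dx + c
    return y1 + dy


# Made-up example values, not taken from the test fixtures.
title = "bbox 100 200 500 240; baseline 0.01 -8"
left_y = baseline_y(title, 100)
right_y = baseline_y(title, 500)
```

The two resulting points could then position the text in the SVG, and the distance between the baseline and the top of the bbox could feed the font size estimate. Whether the same code works for ALTO depends on how the `@BASELINE` values are encoded there.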