OCR results mismatch between mokuro file and web reader #27

Open
kha-white opened this issue Jul 28, 2024 · 1 comment

Comments

@kha-white
Collaborator

Originally reported in kha-white/mokuro#106, but this is actually an issue with mokuro reader. A .mokuro file contains a list of pages, and each page has an img_path key specifying which image that page's OCR results correspond to. The web reader seems to ignore this value and instead relies on the pages being sorted the same way as the images. This works most of the time, but sometimes the two sort orders are inconsistent, which causes the mismatch.

Mokuro reader should use the img_path field to keep the association between the OCR results and image files.
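To illustrate the suggested fix, here is a minimal sketch in Python (not the reader's actual code). It assumes the parsed .mokuro data is a list of page dicts that each carry an `img_path` string, as described above. The function name `match_pages_to_images`, the helper `normalize`, and the `blocks` values in the example are hypothetical and used only for illustration.

```python
from pathlib import PurePosixPath


def normalize(path: str) -> str:
    """Normalize separators so 'ch1\\001.jpg' and './ch1/001.jpg' compare equal."""
    return PurePosixPath(path.replace("\\", "/")).as_posix()


def match_pages_to_images(pages, image_paths):
    """Pair each OCR page with its image by img_path, not by list position.

    `pages` is assumed to be the page list from a .mokuro file;
    `image_paths` is whatever order the reader received the images in.
    """
    by_path = {normalize(p): p for p in image_paths}
    matched = []
    for page in pages:
        key = normalize(page["img_path"])
        if key not in by_path:
            # Fail loudly instead of silently showing the wrong image.
            raise KeyError(f"No image found for img_path {page['img_path']!r}")
        matched.append((by_path[key], page))
    return matched


# Illustrative data: the images arrive in a different order than the pages.
pages = [
    {"img_path": "ch1/001.jpg", "blocks": ["chapter 1 text"]},
    {"img_path": "ch2/001.jpg", "blocks": ["chapter 2 text"]},
]
uploaded = ["ch2/001.jpg", "ch1/001.jpg"]

for image, page in match_pages_to_images(pages, uploaded):
    print(image, page["blocks"])
```

Keying the lookup on the full relative path rather than the bare file name matters when chapters live in subfolders, since two chapters can both contain a `001.jpg`.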

@JaiWWW

JaiWWW commented Jan 6, 2025

One addition to this (I was also about to report it on the mokuro repo, but then I realised it's a reader issue):

This seems to happen particularly when the volume folder contains a subfolder for each chapter. The HTML generated by mokuro (and indeed the .mokuro file) still shows all the pages in the correct order and with the correct text. The reader, however, correctly uses the first chapter's boxes and highlight text at the start but displays the images from the last chapter.

Please fix!
