Improve parsing of custom attributes #17

Open
wants to merge 2 commits into
base: master



@marijnkoolen marijnkoolen commented May 23, 2024

I've kept the structure returned by parse_custom_metadata unchanged so that the existing tests pass (which hopefully keeps things backwards compatible). In addition to the metadata keys that were already set, the full list of custom attributes is now returned under metadata['custom_attributes'].

This makes metadata['custom_tags'] obsolete, but before removing it, I want to know if @LvanWissen is okay with it (since he's likely to be the only one using it at the moment).

Ideally, the custom attributes end up directly as a property of the corresponding TextLine, TextRegion or Word object. Let me know if you think that's better.

- Add functionality to parse arbitrary custom attributes
- Add referenced text to attributes with an offset and length (and text)
- Keep `reading_order` and `text_style` for backwards compatibility but use camelCased version in `custom_attributes`
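
As a rough illustration of what parsing arbitrary custom attributes involves (a minimal sketch, not the code in this PR), the snippet below splits a PageXML-style `custom` string into one dict per attribute. The function name `parse_custom_attributes`, the `type` key and the sample string are assumptions made for the example; values are left as strings.

```python
import re

# Hypothetical helper for illustration only; not the function changed in this PR.
# Matches "tagName {key:value; key:value;}" groups in a custom attribute string.
ATTRIBUTE_PATTERN = re.compile(r"(\w+)\s*\{([^}]*)\}")


def parse_custom_attributes(custom: str) -> list[dict]:
    """Parse a PageXML-style custom string into a list of attribute dicts."""
    attributes = []
    for tag_name, body in ATTRIBUTE_PATTERN.findall(custom):
        # "type" is an assumed key name for the tag; the PR may use another.
        attribute = {"type": tag_name}
        for field in body.split(";"):
            if ":" not in field:
                continue  # skip the empty segment after the last ';'
            key, value = field.split(":", 1)
            attribute[key.strip()] = value.strip()
        attributes.append(attribute)
    return attributes


# Assumed sample input with camelCased tag names.
custom = "readingOrder {index:0;} textStyle {offset:0; length:5; bold:true;}"
attributes = parse_custom_attributes(custom)
```

Under these assumptions, `attributes` holds two dicts, one for `readingOrder` and one for `textStyle`, which is roughly the kind of list that would sit under `metadata['custom_attributes']`.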
@marijnkoolen
Collaborator Author

Also, for custom attributes that have an offset and length, I've added the referenced text to the attribute as well.
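
A minimal sketch of that idea, assuming the attribute is a dict with string-valued `offset` and `length` fields and that the text of the enclosing line is available. The helper name `add_referenced_text`, the `text` key and the sample data are hypothetical, not taken from the PR.

```python
def add_referenced_text(attribute: dict, line_text: str) -> dict:
    """Return a copy of the attribute with the text span it refers to, if any."""
    enriched = dict(attribute)
    if "offset" in attribute and "length" in attribute:
        offset = int(attribute["offset"])
        length = int(attribute["length"])
        # Slice the line text at the span the attribute points to.
        enriched["text"] = line_text[offset:offset + length]
    return enriched


# Assumed sample data: a tag covering 5 characters starting at position 4.
attribute = {"type": "person", "offset": "4", "length": "5"}
enriched = add_referenced_text(attribute, "Jan Claes van Delft")
```

Attributes without an offset and length (such as a reading order index) pass through unchanged in this sketch.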
