CzarX86 commented Aug 19, 2025

This PR adds validation guards that prevent crashes in the Marker PDF library when a table has no text lines or no cells.

…sorList errors

- Add validation to skip table processing when table_text_lines is empty
- Guard table recognition with try/except to prevent crashes
- Skip empty tables before calling matrix_intersection_area
- Add validation for table_cells before processing
- Prevent 'stack expects a non-empty TensorList' errors

This fix addresses the issue where tables without text lines or cells
cause crashes in the Surya table recognition model.
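
As a rough illustration of the guard pattern described above (not the actual diff), here is a minimal, self-contained sketch. The `Table` class, `recognize_tables`, and `fake_recognizer` are hypothetical stand-ins, not Marker or Surya APIs; only the field names `table_text_lines` and `table_cells` and the error message come from this PR. It assumes the failure surfaces as a `RuntimeError`, which is what `torch.stack` raises for an empty list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Table:
    """Hypothetical stand-in for a table block; not Marker's actual class."""
    table_id: int
    table_text_lines: List[str] = field(default_factory=list)
    table_cells: Optional[List[str]] = None


def recognize_tables(
    tables: List[Table],
    recognizer: Callable[[Table], List[str]],
) -> List[Table]:
    """Run `recognizer` on each table, skipping tables with empty data."""
    processed = []
    for table in tables:
        # Guard 1: no text lines, so skip before calling the model at all.
        if not table.table_text_lines:
            continue
        # Guard 2: a failure on one table should not abort the whole run.
        try:
            cells = recognizer(table)
        except RuntimeError:
            continue
        # Guard 3: recognition returned no cells, so there is nothing to process.
        if not cells:
            continue
        table.table_cells = cells
        processed.append(table)
    return processed


def fake_recognizer(table: Table) -> List[str]:
    """Hypothetical recognizer that fails on table 2 to simulate the crash."""
    if table.table_id == 2:
        raise RuntimeError("stack expects a non-empty TensorList")
    # Split each text line on whitespace to produce "cells".
    return [cell for line in table.table_text_lines for cell in line.split()]


tables = [
    Table(0, ["a b"]),   # normal table
    Table(1, []),        # no text lines: skipped by guard 1
    Table(2, ["x"]),     # recognizer raises: skipped by guard 2
    Table(3, ["   "]),   # yields no cells: skipped by guard 3
]
result = recognize_tables(tables, fake_recognizer)
print([t.table_id for t in result])
```

Only the first table survives all three guards; the others are skipped instead of raising.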

Fixes: #XXX (table processing crashes with empty data)
@github-actions
Contributor

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting "recheck" in this pull request.

n0kovo commented Aug 20, 2025

This should be handled in surya IMO (datalab-to/surya#435)
