The current implementation of the Export multimodal Docling Example (examples/export_multimodal.py) has two issues:
UnboundLocalError: When no documents are successfully converted, the rows list is never assigned, so the later step that normalizes the data into a DataFrame raises an UnboundLocalError.
Loss of data from multiple documents: The rows list is reinitialized inside the loop that processes each document. This causes the data from previous documents to be discarded, keeping only the data from the last converted document.
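Both problems can be reproduced without Docling. The sketch below is a minimal stand-in, not the example script itself: `buggy_collect` and the `(ok, pages)` tuples are hypothetical names that only mimic the control flow described above.

```python
def buggy_collect(docs):
    """Mimic the reported control flow; each doc is an (ok, pages) tuple."""
    for ok, pages in docs:
        if not ok:
            continue  # failed conversions are skipped
        rows = []  # re-created for every document, discarding earlier rows
        for page in pages:
            rows.append(page)
    return rows  # unbound if no document got past the status check


# Issue 2: only the last successful document survives.
print(buggy_collect([(True, ["a", "b"]), (True, ["c"])]))

# Issue 1: nothing converted, so rows was never assigned.
try:
    buggy_collect([(False, ["a"])])
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```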
Expected Behavior:
The rows list should accumulate the data from all successfully converted documents.
If no documents are successfully converted, the script should handle this gracefully and not raise an UnboundLocalError.
Suggested Fix:
Move the initialization of the rows list outside the loop so that it collects data from all documents.
Add a check before normalizing the rows into a DataFrame to ensure that the list is not empty.
Original code:
    rows = []  # This is inside the document loop
    for (
        content_text,
        content_md,
        content_dt,
        page_cells,
        page_segments,
        page,
    ) in generate_multimodal_pages(doc):
        # Rows are appended here, but this only keeps data for the current document
        ...
Suggested Fix:
# Initialize rows before the loop
rows = []
for doc in converted_docs:
    if doc.status != ConversionStatus.SUCCESS:
        continue  # Log failures
    for (
        content_text,
        content_md,
        content_dt,
        page_cells,
        page_segments,
        page,
    ) in generate_multimodal_pages(doc):
        rows.append( ... )  # Now rows accumulate data from all documents
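The snippet above covers the accumulation, but not the empty-list check from the second suggestion. A self-contained sketch of both together might look like the following. `FakeDoc`, `collect_rows`, and `export` are hypothetical stand-ins for illustration and are not part of Docling; in the real script the guard would sit just before the rows are normalized into a DataFrame.

```python
import logging
from dataclasses import dataclass, field

_log = logging.getLogger(__name__)


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a conversion result."""
    ok: bool
    pages: list = field(default_factory=list)


def collect_rows(docs):
    rows = []  # initialized once, before the document loop
    for doc in docs:
        if not doc.ok:
            _log.warning("Skipping a document that failed to convert.")
            continue
        for page in doc.pages:
            rows.append({"page": page})  # accumulates across documents
    return rows


def export(docs):
    rows = collect_rows(docs)
    if not rows:
        # Guard: nothing to normalize, so stop before building a DataFrame.
        _log.warning("No documents were converted; nothing to export.")
        return None
    return rows  # the real script would normalize these rows here


docs = [FakeDoc(True, ["a", "b"]), FakeDoc(False, ["z"]), FakeDoc(True, ["c"])]
print(export(docs))
print(export([FakeDoc(False, ["z"])]))
```

Returning early on an empty list keeps the failure mode explicit: the caller gets `None` and a log message instead of an exception from the DataFrame step.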