Hi,
I have been experimenting with DocOwl1.5 for bounding box prediction. The prompt I am using is query=f'Predict the bounding box of the text <ocr> {text} </ocr> in the image.' The results are decent, but I have a few questions.
What if the same word appears several times in one document image? For example, if 'water' occurs 4 times in the image, is there a way to extract the bounding boxes for all of the occurrences, or does the model need to be fine-tuned to add this ability?
For some images the model hallucinates: instead of a bounding box, it returns a description of the image. For example, I tried to predict the bounding box for TESCO, the vendor name on a receipt, but the model only returned:
a receipt for a purchase of £ 0.95 <ocr> alamy TESCO metro TEL 0845 6779218 FRESH MILK 0.89 MJESLI a 2.29 DARK CHOCOLATE * 2 @ £0.95 1.90 TOTAL ala 5.08 MASTERCARD SALE 5.08 AID : A0000000041010 NUMBER ****************0938 ICC PAN SEQ NO : 02 AUTH CODE : 036017 MERCHANT : 1833431 START : 10/10 EXPIRY : 11/13 Cardholder PIN Verified CHANGE DUE 0.00 CLUBCARD STATEMENT ================= ================= ================= [the ================= token keeps repeating until the output is cut off]
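In case it helps to reproduce this, here is a rough sketch of how the prompt can be built and how a response like the one above can be flagged automatically. The helper names (`build_query`, `parse_bboxes`, `looks_degenerate`) are hypothetical and not part of DocOwl1.5, and the `<bbox> x1, y1, x2, y2 </bbox>` response format is an assumption on my side that should be adjusted to whatever the model really emits.

```python
import re
from typing import List, Tuple

# Prompt template taken from the query used in this issue.
PROMPT_TEMPLATE = "Predict the bounding box of the text <ocr> {text} </ocr> in the image."

# ASSUMPTION: a successful answer contains "<bbox> x1, y1, x2, y2 </bbox>".
# This format is not confirmed here; change the pattern if the output differs.
BBOX_RE = re.compile(
    r"<bbox>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</bbox>"
)


def build_query(text: str) -> str:
    """Fill the grounding prompt with the text to locate."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_bboxes(response: str) -> List[Tuple[int, int, int, int]]:
    """Return every box found in the response (empty list if none)."""
    return [tuple(int(v) for v in m.groups()) for m in BBOX_RE.finditer(response)]


def looks_degenerate(response: str, min_repeats: int = 10) -> bool:
    """True if one whitespace-separated token repeats min_repeats times in a row."""
    tokens = response.split()
    run = 1
    for prev, cur in zip(tokens, tokens[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_repeats:
            return True
    return False
```

With these helpers, a response with no parsed boxes, or one for which `looks_degenerate` returns True, can be logged or retried instead of being treated as a valid prediction.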
Please let me know if you have any suggestions. Thanks in advance!