Issues for Bounding box prediction #107

Open
jackiexue1993 opened this issue Sep 4, 2024 · 0 comments


Hi,
I have been experimenting with DocOwl1.5 for bounding box prediction. The prompt I am using is
query=f'Predict the bounding box of the text <ocr> {text} </ocr> in the image.'
The results are decent, but I have a few questions.

  1. What if the same word appears several times in one document image? For example, if 'water' appears 4 times in the image, is there a way to extract the bounding boxes for all of them, or would the model need to be fine-tuned to add this ability?
  2. For some images the model hallucinates: it returns a description of the image instead of a bounding box. For example, I asked for the bounding box of TESCO, the vendor name on the following receipt, but it only returned
 a receipt for a purchase of £ 0.95 <ocr> alamy TESCO metro TEL 0845 6779218 FRESH MILK 0.89 MJESLI a 2.29 DARK CHOCOLATE * 2 @ £0.95 1.90 TOTAL ala 5.08 MASTERCARD SALE 5.08 AID : A0000000041010 NUMBER ****************0938 ICC PAN SEQ NO : 02 AUTH CODE : 036017 MERCHANT : 1833431 START : 10/10 EXPIRY : 11/13 Cardholder PIN Verified CHANGE DUE 0.00 CLUBCARD STATEMENT ================= ================= ================= ... (the ================= token keeps repeating for the rest of the output, which ends mid-token)
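
For reference, here is a minimal sketch of how the query above could be built and how a response could be checked for a box at all, which is one way to flag the kind of description-only output shown in question 2. The `<bbox>x1,y1,x2,y2</bbox>` response format is an assumption about what DocOwl1.5 returns and is not confirmed in this thread; `build_query` and `parse_bbox` are hypothetical helper names, not part of the DocOwl1.5 code base.

```python
import re
from typing import Optional, Tuple

# ASSUMPTION: a successful response contains one tag of the form
# <bbox>x1,y1,x2,y2</bbox> with integer coordinates. Adjust the pattern
# if the model's actual output format differs.
BBOX_RE = re.compile(
    r"<bbox>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</bbox>"
)


def build_query(text: str) -> str:
    """Build the grounding prompt quoted in this issue (hypothetical helper)."""
    return f"Predict the bounding box of the text <ocr> {text} </ocr> in the image."


def parse_bbox(response: str) -> Optional[Tuple[int, int, int, int]]:
    """Return the first box found in the response, or None if there is none.

    A None result means the model answered with something other than a box,
    for example a description of the image.
    """
    match = BBOX_RE.search(response)
    if match is None:
        return None
    x1, y1, x2, y2 = (int(g) for g in match.groups())
    return (x1, y1, x2, y2)
```

A `None` from `parse_bbox` only detects that no box came back; it does not address returning several boxes for repeated words, which is the open question in item 1.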

Please let me know if you have any suggestions. Thanks in advance!
[Attached image: tesco-shopping-receipt-CNTYDX]
