Can inference be done without boxes and transcripts using PICK-pytorch? #105
The model accepts both the image and the bounding boxes with their corresponding transcripts as input; you can't rely on the image alone.
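For context, the per-image boxes_and_transcripts annotation is a plain-text file with one segment per line, roughly index,x1,y1,x2,y2,x3,y3,x4,y4,transcript (check the repo's data README for the exact columns); the rows below are made-up examples:
'''
1,84,26,412,26,412,64,84,64,Invoice No: 0001
2,84,80,260,80,260,110,84,110,Date: 2020-09-24
'''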
Hi ziodos, thanks for replying. I partially agree with your input: if a user wants to extract a new field that the PICK-pytorch model was not trained on, its box_transcripts would indeed have to be passed to test.py. But the question remains: how can box predictions be obtained from the trained PICK-pytorch model? Please let me know if there is any way I can modify the script to obtain the predicted bounding boxes after training the model. Hoping for your reply at the earliest.
Hi ziodos, I did not get your input on the question I asked above. Please let me know if any possibility exists; I am even willing to write the code myself if it does. Pushpa.
Hi authors,
For OCR (including detection and recognition):
For layout analysis (e.g., PICK):
Hi tengerye,
Hi mrtranducdung, the PICK model does not run inference automatically from just an image as input; along with the image, the corresponding bbox and text-transcript annotations must be provided at prediction time. There is no need to give class labels, as the model predicts them automatically. The only possibility, if you need to predict from an input image alone, is to apply OCR techniques first. Please refer to the notebook below for an auto-inference implementation; in particular, see its inference code section.
1) https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Fine_tuning_LayoutLMForTokenClassification_on_FUNSD.ipynb#scrollTo=vm3sGnBsL64o
I am also sending the code I implemented; it follows most of the logic from the notebook above to perform automatic inference with the LayoutLM transformer model, so you can adapt the same logic for the PICK transformer model. regards,
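In outline, the OCR step looks something like the rough sketch below (not the original attachment): run pytesseract on the image and write a PICK-style boxes_and_transcripts file. The index,x1,y1,...,x4,y4,transcript column order is assumed from the PICK-pytorch data format and should be verified against the repo.
'''
# Rough sketch: OCR an image with pytesseract and write a PICK-style
# boxes_and_transcripts tsv. Column order is assumed from the PICK-pytorch
# data format; verify against the repo's data README before use.
import pytesseract
from pytesseract import Output
from PIL import Image

def image_to_pick_tsv(image_path: str, tsv_path: str, min_conf: float = 30.0) -> None:
    """OCR one image and dump word-level boxes/transcripts in PICK's tsv layout."""
    data = pytesseract.image_to_data(Image.open(image_path), output_type=Output.DICT)
    rows = []
    for i, word in enumerate(data["text"]):
        word = word.strip()
        # Drop empty tokens and low-confidence detections (conf is -1 on non-word rows).
        if not word or float(data["conf"][i]) < min_conf:
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        # Tesseract gives axis-aligned boxes; expand to the clockwise 4-point
        # polygon (x1,y1,...,x4,y4) that PICK's annotation files use.
        box = (x, y, x + w, y, x + w, y + h, x, y + h)
        rows.append(f"{len(rows) + 1}," + ",".join(map(str, box)) + f",{word}")
    with open(tsv_path, "w", encoding="utf-8") as f:
        f.write("\n".join(rows))
'''
test.py can then be pointed at the folder of generated tsv files exactly as in a normal run.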
Hi pushpalatha1405, |
Hi mrtranducdung, can you give a bit more detail on the OCR model you are using to get the (bbox, transcripts) pairs at prediction stage, if you are okay sharing the details? regards,
Hi pushpalatha,
Got it! Thanks very much, mrtranducdung. regards,
Hi mrtranducdung, I am revisiting this issue, where you use the Tesseract OCR model to extract bboxes and transcripts and then convert them to the PICK annotation format for automatic model inference. My question is:
a) What about document complexity? What if the document has a very complex structure, like utility bills? Can PICK really predict the fields appropriately if the above logic is used? If annotations are created in some form for these n utility bills (word-wise or sentence-wise, given the complex structure and the huge number of documents), would you still recommend the above solution of applying an OCR model, extracting boxes and transcripts, converting them to the PICK annotation format, and performing automatic model inference? How well does it work?
Please share your experience on this; I need your input because I am building a fully fledged, robust auto-inference pipeline using pytesseract or an EasyOCR model, but I am stuck in a dilemma due to the complex document structure. If any other solution exists to build auto inference using PICK, please share it. Awaiting your reply. regards,
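For the word-wise vs sentence-wise question, one option I am weighing is grouping Tesseract's word boxes by its (block, paragraph, line) indices so each PICK segment is a whole text line rather than a single word. A rough sketch follows; the grouping keys come from pytesseract's image_to_data output, and everything else is an assumption to tune per document type.
'''
# Sketch: group Tesseract word boxes into line-level segments by the
# (block_num, par_num, line_num) indices that image_to_data returns.
import pytesseract
from pytesseract import Output
from PIL import Image

def words_to_lines(image_path: str):
    """Return [(box, transcript)] with one entry per detected text line."""
    data = pytesseract.image_to_data(Image.open(image_path), output_type=Output.DICT)
    groups = {}
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        groups.setdefault(key, []).append(i)
    lines = []
    for key in sorted(groups):
        idxs = groups[key]
        # Union of the word boxes on this line.
        x1 = min(data["left"][i] for i in idxs)
        y1 = min(data["top"][i] for i in idxs)
        x2 = max(data["left"][i] + data["width"][i] for i in idxs)
        y2 = max(data["top"][i] + data["height"][i] for i in idxs)
        transcript = " ".join(data["text"][i].strip() for i in idxs)
        lines.append(((x1, y1, x2, y2), transcript))
    return lines
'''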
Hi wenwenyu,
I prepared my custom dataset in the PICK-pytorch format and trained it with the models used in PICK-pytorch. The training score after 100 epochs is around 69%, and the test score is around mEF 0.7150, using the test.py script below (from PICK-pytorch):
'''
python test.py \
  --checkpoint /datadrive/PICK-pytorch/saved/models/PICK_Default/test_0924_145754/checkpoint-epoch100.pth \
  --boxes_transcripts /datadrive/PICK-pytorch/predictions/boxes_and_transcripts \
  --images_path /datadrive/PICK-pytorch/predictions/images \
  --output_folder /datadrive/PICK-pytorch/output_pred \
  --batch_size 1 --gpu 0
'''
Now my question: when I build an end-to-end inference pipeline, I should only need to provide the image and the checkpoint-epoch100.pth file, and then get the corresponding entity extractions in the form of a json/txt file together with the bounding-box coordinates.
But why do I need to provide the boxes_and_transcripts annotations again during inference?
Is there any way I can use PICK-pytorch for inference by giving only the checkpoint file and image path, and getting the predictions back as text and images with bounding boxes?
Please let me know if any solution exists. I want to use the PICK-pytorch model in our product (after all this progress, having trained and tested on my custom dataset), but the barrier is having to pass boxes_and_transcripts to test.py; what I am after is something like the sketch below.
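For concreteness, a minimal sketch of the wrapper I have in mind. The OCR helper image_to_pick_tsv is hypothetical (e.g., the pytesseract sketch earlier in this thread) and is exactly the missing piece; everything else reuses the existing test.py and the paths from my command above.
'''
# Sketch of the end-to-end wrapper: OCR each image into a PICK-style tsv,
# then shell out to the existing PICK-pytorch test.py.
import subprocess
from pathlib import Path

from ocr_to_pick import image_to_pick_tsv  # hypothetical module wrapping the OCR step

def run_pick_inference(images_dir: str, tsv_dir: str, checkpoint: str, output_dir: str) -> None:
    # 1) OCR every image into a boxes_and_transcripts tsv (adjust the glob to your extensions).
    for image_path in sorted(Path(images_dir).glob("*.jpg")):
        image_to_pick_tsv(str(image_path), str(Path(tsv_dir) / f"{image_path.stem}.tsv"))
    # 2) Run the existing PICK-pytorch test script on the generated annotations.
    subprocess.run([
        "python", "test.py",
        "--checkpoint", checkpoint,
        "--boxes_transcripts", tsv_dir,
        "--images_path", images_dir,
        "--output_folder", output_dir,
        "--batch_size", "1",
        "--gpu", "0",
    ], check=True)

run_pick_inference(
    "/datadrive/PICK-pytorch/predictions/images",
    "/datadrive/PICK-pytorch/predictions/boxes_and_transcripts",
    "/datadrive/PICK-pytorch/saved/models/PICK_Default/test_0924_145754/checkpoint-epoch100.pth",
    "/datadrive/PICK-pytorch/output_pred",
)
'''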
Hoping for the reply at the earliest
regards,
Pushpalatha M