👀 Integrate with Amazon Textract #6
Comments
I think we need this ASAP, because Google Vision is not working as expected for any complex website. I am working on this.
@shubhamofbce let me know if you need support!
bump; very interested in testing this library out using Textract output
@plamb-viso happy to take a PR! It should be fairly straightforward as we have this somewhat abstracted. We'd also really like to test out Azure OCR as we've heard it's the most performant. (Will make a separate issue for this.)
Any luck, @shubhamofbce?
@asim-shrestha Sorry, I don't have an update.
No worries @shubhamofbce, did you still want to tackle this?
Sorry, but I will not be able to work on it due to time constraints. @asim-shrestha
I think I should be able to tackle this next week.
Hey @Loeing, let me know if you need any support on this one.
@awtkns sorry, this past week has been busier than anticipated. Have been playing around with Tarsier. Should be able to make some progress by the end of next week.
@Loeing I'm super interested in the ability to integrate with Amazon Textract. Have you made any progress on this? Is there any chance I can be of some assistance?
Howdy! I pulled down the code and tried my hand at integrating with AWS Textract. I ran into a small problem: Textract only returns normalized geometry data (values between 0 and 1), which differs from GCP & Azure. This seems to cause an issue with this line of the
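For anyone picking this up, here is a rough sketch of one way to handle the normalized geometry: scale each Textract `BoundingBox` by the page dimensions to get pixel coordinates. The `Blocks` / `Geometry` / `BoundingBox` field names come from the Textract response format; the function name `textract_words_to_pixels` and the output dict shape are made up for illustration and are not Tarsier's actual annotation format. It also assumes you already know the screenshot's width and height in pixels.

```python
from typing import Any, Dict, List


def textract_words_to_pixels(
    blocks: List[Dict[str, Any]],
    page_width: int,
    page_height: int,
) -> List[Dict[str, Any]]:
    """Scale Textract's normalized WORD boxes (0..1) to pixel coordinates.

    Hypothetical helper: the returned dict layout is an assumption,
    not the format Tarsier expects internally.
    """
    words = []
    for block in blocks:
        # Textract also returns PAGE and LINE blocks; keep only words here.
        if block.get("BlockType") != "WORD":
            continue
        box = block["Geometry"]["BoundingBox"]
        words.append(
            {
                "text": block.get("Text", ""),
                # Left/Width are ratios of page width,
                # Top/Height are ratios of page height.
                "left": box["Left"] * page_width,
                "top": box["Top"] * page_height,
                "width": box["Width"] * page_width,
                "height": box["Height"] * page_height,
            }
        )
    return words
```

The `blocks` argument would be the `Blocks` list from a Textract `detect_document_text` response, and the page size would be that of the screenshot sent to Textract.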
Does anyone have any published WIP branches available to look at?
Currently, the only OCR service Tarsier supports is Google Cloud Vision. It would be good to add another OCR service so that Amazon Textract can be used.