Fast setup of an OCR Lambda function using Tesseract 5 and a custom OCR engine (here, the ONNX version of PaddleOCR)
- Clone the repo.
- Create an ECR repository in your AWS account, copy its URI, and add it to zip_fct.sh (lines 27/28).
- Log in to ECR if you have not already done so:
  aws ecr get-login-password --region yourREGION | docker login --username AWS --password-stdin yourURI
- Run:
  cd lambda-tesseract-api/ && bash zip_fct.sh
Done! Your ECR image is ready to be deployed from your Lambda function (you can use example.json to test it).
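If you prefer to run that test from a terminal rather than the console, `aws lambda invoke` is one option. The sketch below only assembles and prints the command (a dry run, nothing is sent to AWS). It assumes AWS CLI v2 and that example.json is in the current directory. The function name `ocr-lambda` and the output file `response.json` are placeholders, not names taken from this repo.

```shell
# Dry-run sketch: print the AWS CLI command that would invoke the Lambda
# with example.json as the event payload. Nothing is sent to AWS.
# "ocr-lambda" and "response.json" below are hypothetical placeholders.

build_invoke_cmd() {
  # $1: Lambda function name, $2: payload file, $3: file to write the response to
  # fileb:// makes AWS CLI v2 read the payload file as raw bytes
  printf 'aws lambda invoke --function-name %s --payload fileb://%s %s\n' \
    "$1" "$2" "$3"
}

build_invoke_cmd "ocr-lambda" "example.json" "response.json"
```

Copy the printed command, replace the function name with your own, and run it once you are logged in. The Lambda's reply is written to the output file.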
Notes:
- Docker must be installed. Tested on Ubuntu 20.04.
- Only the recognition part is done here. You can edit the OCR functions in lambda_function.py to suit your needs.
See the Medium link for how to set up the Lambda and the API in the AWS console. It is not up to date: the Lambda setup is easier now, since you only need to select the image from ECR.
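As an alternative to the console, a Lambda function can also be created from an ECR image with the AWS CLI. The sketch below is a dry run that only prints the command. The function name `ocr-lambda`, the `latest` tag, and `yourROLE_ARN` (an existing Lambda execution role) are assumptions for illustration; check which tag zip_fct.sh actually pushes.

```shell
# Dry-run sketch: print the AWS CLI command that would create a Lambda
# function from a container image stored in ECR. Nothing is sent to AWS.
# All three arguments in the example call are hypothetical placeholders.

build_create_cmd() {
  # $1: function name, $2: image URI including tag, $3: execution role ARN
  printf 'aws lambda create-function --function-name %s --package-type Image --code ImageUri=%s --role %s\n' \
    "$1" "$2" "$3"
}

build_create_cmd "ocr-lambda" "yourURI:latest" "yourROLE_ARN"
```

OCR workloads may need a larger timeout and more memory than the Lambda defaults; both can be adjusted afterwards in the function's configuration.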