FYI: What I considered when building a REST API server for machine learning inference with Amazon SageMaker (article in Japanese)
This is a template for deploying a FastAPI endpoint on AWS SageMaker.
This is mainly based on Seamless Integration: Deploying FastAPI ML Inference Code with SageMaker BYOC + Nginx.
Nginx serves as a high-performance reverse proxy, receiving client requests and forwarding them to the backend application server, in this case FastAPI, which also enables load balancing across backends.
This setup makes it possible to handle a large number of simultaneous connections efficiently: thanks to its asynchronous, event-driven architecture, Nginx can accept and queue far more concurrent connections than the application server could on its own.
This leads to an overall improvement in performance and responsiveness.
Combining Gunicorn and Uvicorn parallelizes the application across worker processes, leveraging multi-core CPUs to handle more requests.
Gunicorn, a WSGI application server, isn't directly compatible with FastAPI, which is an ASGI application.
However, it can act as a process manager by using Uvicorn's Gunicorn-compatible worker class.
Gunicorn listens on the specified IP and port and forwards connections to the Uvicorn worker processes.
FYI: Server Workers - Gunicorn with Uvicorn
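As a minimal sketch of this wiring (assuming a Gunicorn config file written in Python; the bind address and worker count are illustrative, not the template's actual settings):

```python
# gunicorn.conf.py -- illustrative sketch, not the template's actual config.
# Nginx, listening on the public port, would proxy requests to the bind address below.
import multiprocessing

# Address Gunicorn listens on; Nginx forwards requests here.
bind = "127.0.0.1:8000"

# One worker per CPU core is a common starting point.
workers = multiprocessing.cpu_count()

# Uvicorn's Gunicorn-compatible worker class lets Gunicorn
# manage an ASGI app such as FastAPI.
worker_class = "uvicorn.workers.UvicornWorker"
```

You would then start the server with something like `gunicorn -c gunicorn.conf.py app.main:app`, where `app.main:app` is a hypothetical module path for the FastAPI instance.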
- A tool to run Docker, such as Docker Desktop
- I highly recommend using OrbStack
```sh
git clone https://github.com/tamtam-fitness/fastapi-sagemaker-endpoint-template.git <new-project>
cd <new-project>
rm -rf .git
```
This template uses the pretrained word2vec model created by Hironsan.
You can download the model from the link above, then unzip it and put it in the following directory:
```sh
mkdir -p opt/ml/model/
mv ~/Downloads/vector_neologd.zip opt/ml/model/
unzip opt/ml/model/vector_neologd.zip -d opt/ml/model/
```
The model file (opt/ml/model/model.vec) is supposed to be located in that directory.
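For reference, a text-format word2vec file like this is typically loaded with gensim along the following lines (a sketch; the template's actual loading code may differ):

```python
# Illustrative sketch of loading the word vectors with gensim;
# the template's actual inference code may load them differently.
from gensim.models import KeyedVectors

# model.vec is a plain-text word2vec file, hence binary=False.
model = KeyedVectors.load_word2vec_format("opt/ml/model/model.vec", binary=False)

# Example query: the five words most similar to a given token.
print(model.most_similar("Python", topn=5))
```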
To start development, run the following commands:

```sh
make setup
make enter_container
```

To lint and format the code:

```sh
make lint
make format
```
If you want to run all tests, you can run the following command:

```sh
make test
```

If you want to run a specific test, you can run the following commands:

```sh
make enter_container
poetry shell
poe test tests/{file or directory you want to test}
```
After serving the endpoint locally via `make setup`, you can run a couple of tests.
- To view the Swagger UI, enter http://0.0.0.0:8080/docs into your browser's address bar.
- To test the /ping endpoint, enter `curl http://0.0.0.0:8080/ping` into the console.
- To test the /invocations endpoint, send a POST request with curl, as in the example below.

```sh
# Here is an example for testing /invocations
curl -X 'POST' \
  'http://localhost:8080/invocations' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"word": "Python"}'
```
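For orientation, the two routes SageMaker expects from a BYOC container might be shaped roughly like the FastAPI sketch below; the handler bodies, request model, and response fields are assumptions for illustration, not the template's actual code.

```python
# Illustrative sketch of the /ping and /invocations handlers a SageMaker
# BYOC container must expose; response shapes here are placeholders.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class InvocationRequest(BaseModel):
    word: str  # matches the {"word": "..."} payload in the curl example


@app.get("/ping")
def ping() -> dict:
    # SageMaker calls this as a health check; any HTTP 200 means "healthy".
    return {"status": "ok"}


@app.post("/invocations")
def invocations(req: InvocationRequest) -> dict:
    # The real handler would query the word2vec model with req.word;
    # the response shape below is a placeholder.
    return {"word": req.word, "similar_words": []}
```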
※ You are supposed to set up the model in advance.

```sh
sh serve_local.sh
```
If you want to deploy the model file to AWS SageMaker, you need to compress it as a tar.gz archive (opt/ml/model/model.tar.gz) and upload it to S3.
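As a sketch of that step (the bucket name and object key below are placeholders, not values from the template):

```python
# Illustrative sketch: compress the model and upload the archive to S3.
# The bucket name and object key are placeholders.
import tarfile

import boto3

# SageMaker expects model artifacts packaged as a gzipped tarball.
with tarfile.open("opt/ml/model/model.tar.gz", "w:gz") as tar:
    tar.add("opt/ml/model/model.vec", arcname="model.vec")

s3 = boto3.client("s3")
s3.upload_file(
    "opt/ml/model/model.tar.gz",
    "your-bucket",          # placeholder bucket name
    "models/model.tar.gz",  # placeholder object key
)
```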
In addition, you need to set environment variables when creating the Model:

```sh
# You have to set prod.yml in opt/program/common/yaml_configs if you want to deploy it as a WebAPI
ENV=prod
```
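With boto3, for example, that variable could be passed at Model creation roughly as follows (the model name, image URI, S3 path, and role ARN are all placeholders):

```python
# Illustrative sketch of creating a SageMaker Model with ENV=prod;
# every name, URI, and ARN below is a placeholder.
import boto3

sagemaker = boto3.client("sagemaker")
sagemaker.create_model(
    ModelName="fastapi-endpoint-model",  # placeholder name
    PrimaryContainer={
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/<repo>:latest",  # placeholder image
        "ModelDataUrl": "s3://your-bucket/models/model.tar.gz",  # placeholder S3 path
        "Environment": {"ENV": "prod"},  # makes the container pick up prod.yml
    },
    ExecutionRoleArn="arn:aws:iam::<account>:role/<sagemaker-role>",  # placeholder role
)
```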
Finally, if you want to publish it as a WebAPI, you may want to read Creating a machine learning-powered REST API with Amazon API Gateway mapping templates and Amazon SageMaker.