This is an end-to-end tuberculosis classification tool that runs on Google Cloud.

It contains:
- a model training pipeline built on a DenseNet121 architecture; the trained model is dockerized for deployment on GCP
- a simple front-end where chest X-rays are given as input
- a back-end, consisting of:
  - a Cloud Run service that calls the model with the input image (see the example request below)
  - the dockerized model, which the Cloud Run service queries for predictions
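Once deployed, the Cloud Run service can be exercised with a plain HTTP request. A hypothetical example follows; the service URL, route, and form field name are illustrative placeholders, not the actual deployment:

```
# Hypothetical request to the deployed Cloud Run service.
# URL, route, and form field name are placeholders, not the real deployment.
curl -X POST \
  -F "file=@chest_xray.png" \
  https://<your-cloud-run-service>.run.app/predict
```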
Demo video: `demo3.mov`
First, install pyenv, e.g. via the pyenv-installer: https://github.com/pyenv/pyenv-installer

On macOS you can use brew, but you may need to grab the --HEAD version for the latest:

```
brew install pyenv --HEAD
```

or

```
curl https://pyenv.run | bash
```
Then check the local `.python-version` file or `.envrc` and install the correct version, which will be the basis for the local virtual environment. If `.python-version` exists, you can run:

```
pyenv install
```
This will show a message like the following if you already have the right version, and you can just respond with N (No) to cancel the re-install:

```
pyenv: ~/.pyenv/versions/3.8.6 already exists
continue with installation? (y/N) N
```
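As a quick sanity check, you can also inspect the pinned version directly; the file contents shown below are an example matching the message above:

```
cat .python-version
# 3.8.6
```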
Next, install direnv: https://direnv.net/docs/installation.html

```
curl -sfL https://direnv.net/install.sh | bash
```
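For reference, a minimal `.envrc` could look like the following; this is an assumed example, and the repo's actual file may differ:

```
# Assumed minimal .envrc; the repo's actual file may differ.
# `layout python` creates and activates a virtualenv for the pyenv-selected interpreter.
layout python
```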
If you are a new developer to this package and need to develop, test, or build, please run the following to create a developer-ready local virtual environment:

```
direnv allow
python --version
pip install --upgrade pip
pip install poetry
poetry install
```
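Optionally, a couple of sanity checks confirm that the environment was created correctly:

```
# Optional sanity checks on the freshly created environment
poetry run python --version
poetry env info
```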
In a more professional setting, this repo should be split into multiple pieces. A good split might be:
- a model development repo
- an inference + dockerization + deployment repo
- a frontend repo

This is because model development dependencies make the inference pipeline unnecessarily large. Similarly, frontend dependencies (currently `flask` and `flask-cors`) make the Docker image larger. To mitigate this, I made all the model development libraries `dev` dependencies, so that they are not included in the Docker image. But this is just a workaround, not the preferred approach.
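Concretely, with Poetry this means registering training-only libraries in the dev group and installing only the main group when building the image. A sketch, where the package names are illustrative and the exact flags depend on your Poetry version:

```
# Put model-development libraries in the dev group
# (package names are examples, not the repo's exact dependencies):
poetry add --group dev jupyter matplotlib

# In the Docker build, install only the runtime (main) dependencies
# (Poetry >= 1.2; older versions use `poetry install --no-dev`):
poetry install --only main
```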
The data for training this model was obtained from Kaggle:
https://www.kaggle.com/datasets/tawsifurrahman/tuberculosis-tb-chest-xray-dataset

Kudos to the team for their amazing work collecting such a useful dataset!