Commit b04de6b: Updated Readme
MatsMoll committed May 22, 2024
1 parent 4154132
Showing 2 changed files with 32 additions and 19 deletions.
47 changes: 30 additions & 17 deletions README.md
- Data quality management using Aligned
- Data annotation using Aligned

## Software Development Capabilities
- Complete local development
- Containerized development for easier deployment
- Hot reload updates to orchestration pipelines
- Hot reload model serving on promotion
- CI with unit and integration tests on PRs
- CI warning for potential drift detection

## Intended Development Flow

When starting a new AI project, here is the intended development flow.

### 1. Spin up the infra
First, we need to start the infrastructure needed to experiment and run our projects.

Run `make infra-up`; this spins up the following:
- `prefect` as the orchestrator that triggers training pipelines. [http://127.0.0.1:4201](http://127.0.0.1:4201)
- `prefect-worker` as local workers that run the pipelines, reloading on file saves.
- `mlflow` as the model registry and the experiment tracker. [http://127.0.0.1:7050](http://127.0.0.1:7050)
- `aligned` as the data catalog and data/ML monitoring. [http://127.0.0.1:9000](http://127.0.0.1:9000)

You are now ready to develop a new model.

### 2. Formulate the AI problem, and the expected AI output.

E.g. if we want to predict whether a review is positive or negative, we can express that as a `model_contract`, such as a `MovieReviewIsNegative` class.
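The real contract uses the `aligned` DSL, whose exact API is collapsed in this view; as a rough stand-in, the shape of such a contract can be sketched with plain dataclasses (all names and fields here are illustrative, not the actual aligned API):

```python
from dataclasses import dataclass

# Hypothetical stand-in for an aligned model contract: the real
# `model_contract` ties input features to a predicted output, and
# later also to where the model is exposed and predictions are stored.

@dataclass
class MovieReview:
    review_id: str
    text: str

@dataclass
class MovieReviewIsNegative:
    review_id: str
    is_negative_pred: bool  # the model's predicted label
```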


### 3. Find relevant features.
We can use the aligned UI to find which features could be interesting for our ML use-case.

![Find features in Aligned UI](assets/find-features.png)
The selected features can then be referenced in a model contract, such as a `WineIsHighQuality` class.

### 4. Create a training pipeline

Create your own pipeline logic, or reuse the generic classification pipeline, `classifier_from_train_test_set`, located in `src/pipelines/train.py`.

Remember to add the pipeline to `src/pipelines/available.py` to make it visible in the Prefect UI.
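The actual `classifier_from_train_test_set` implementation lives in `src/pipelines/train.py`; as a hedged sketch of what such a generic pipeline does (the real function's signature and internals will differ), it fits a classifier on a train set and reports a metric on a test set:

```python
from typing import Callable, Sequence, Tuple

Features = Sequence[float]

def classifier_from_train_test_set_sketch(
    train: Sequence[Tuple[Features, int]],
    test: Sequence[Tuple[Features, int]],
    fit: Callable[[Sequence[Tuple[Features, int]]], Callable[[Features], int]],
) -> Tuple[Callable[[Features], int], float]:
    """Fit a classifier on `train`, then report accuracy on `test`."""
    model = fit(train)
    correct = sum(1 for features, label in test if model(features) == label)
    accuracy = correct / len(test) if test else 0.0
    return model, accuracy

def majority_fit(train):
    """Toy `fit` step: always predict the most common training label."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority

train = [([1.0], 1), ([2.0], 1), ([3.0], 0)]
test = [([4.0], 1), ([5.0], 0)]
model, accuracy = classifier_from_train_test_set_sketch(train, test, majority_fit)
# The majority class (1) matches one of the two test labels, so accuracy is 0.5
```

In the real pipeline the `fit` step would be a proper classifier and the metrics would be logged to MLflow.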

### 5. Train the model

Train the model by using the Prefect UI.

![Prefect UI training run](assets/prefect-train-model.png)

### 6. Manage the Models

View training runs, and manage which models should be deployed, through MLflow.

![Track Models](assets/track-training-runs.png)

### 7. Spin up a serving endpoint

Figure out the model's URL and serve the model.

View the `wine-model` in `docker-compose.yaml` for an example.
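As a rough sketch of what such a service could look like (the actual `wine-model` definition in `docker-compose.yaml` may differ; the model URI, port, and flags below are illustrative):

```yaml
  wine-model:
    image: project-base
    # Serve a registered model over HTTP; adjust the model URI and port.
    command: "mlflow models serve -m 'models:/wine-model/Production' --host 0.0.0.0 --port 8080"
    ports:
      - 8080:8080
```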

### 8. Use the model

To use the model, update the `model_contract` once more with where the model is exposed and where we want to store the predictions.

Make sure you have started the models with `make models-up`.
This will reload the models when you promote a new model.

Then we can predict over different datasets, or manually inputted data.
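For manually inputted data, an MLflow scoring server can also be called directly over HTTP. A minimal sketch, assuming an MLflow 2.x `/invocations` endpoint (the column names, values, and URL are made up for illustration):

```python
import json

# Illustrative request body for an MLflow 2.x scoring server, which
# accepts JSON such as {"dataframe_split": ...} on its /invocations route.
payload = {
    "dataframe_split": {
        "columns": ["fixed_acidity", "volatile_acidity"],
        "data": [[7.4, 0.7]],
    }
}
body = json.dumps(payload).encode()

# To actually call a locally served model (URL is illustrative):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:8080/invocations",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read())
```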

![Predict over data](assets/predict-over-data.png)

### 9. Evaluate Online Predictions
Lastly, we can start evaluating online predictions whenever we receive new ground truth values.

This can also be done through the aligned UI in the evaluation tab.
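Conceptually, this evaluation joins the stored predictions with the ground truth values that have arrived so far and computes a metric over the overlap. A minimal sketch (function and metric are illustrative, not aligned's actual evaluation logic):

```python
def evaluate_online_predictions(predictions: dict, ground_truth: dict) -> float:
    """Accuracy over the ids that have both a stored prediction
    and a received ground-truth label."""
    shared = predictions.keys() & ground_truth.keys()
    if not shared:
        return 0.0
    correct = sum(1 for i in shared if predictions[i] == ground_truth[i])
    return correct / len(shared)

# E.g. three stored predictions, ground truth received for two of them:
preds = {"r1": True, "r2": False, "r3": True}
truth = {"r1": True, "r2": True}
accuracy = evaluate_online_predictions(preds, truth)  # 1 of 2 shared ids correct
```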

![Evaluate Models](assets/evaluate-model.png)

## Other make commands

This project contains a simple `Makefile` to simplify development.

### `make models-up`
Spins up the different trained MLflow models.

4 changes: 2 additions & 2 deletions docker-compose.yaml

The MLflow server's host port mapping changes from `7999:8000` to `7050:8000`:

```yaml
    image: project-base
    command: "mlflow server --backend-store-uri file:///app/mlflow-server/experiments --artifacts-destination file:///app/mlflow-server/artifacts --host 0.0.0.0 --port 8000"
    ports:
      - 7050:8000
    volumes:
      - ./src:/app/src
      - ./mlflow:/app/mlflow-server
```

The data catalog's mapping changes from `8503:8501` to `9000:8501`:

```yaml
    volumes:
      - ./data:/app/data
      - ./load_contracts.py:/app/custom_store.py
    ports:
      - 9000:8501
    extra_hosts:
      - host.docker.internal:host-gateway
```
