Embark on your journey with this custom storage initializer by exploring a simple hello-world example. Learn how to seamlessly integrate and leverage the power of this tool in just a few steps.
- Install Kind (Kubernetes in Docker) to run a local Kubernetes cluster with Docker container nodes.
- Install the Kubernetes CLI (kubectl), which allows you to run commands against Kubernetes clusters.
- Install Kustomize, which allows you to customize application configuration.
We assume all prerequisites are satisfied at this point.
- After having kind installed, create a kind cluster with:

  ```bash
  kind create cluster
  ```
- Configure `kubectl` to use the kind context:

  ```bash
  kubectl config use-context kind-kind
  ```
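  Optionally, verify that kubectl now points at the kind cluster:

  ```bash
  # Should print "kind-kind", the context name kind creates by default
  kubectl config current-context
  ```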
- Set up a local deployment of KServe using the provided KServe quick installation script:

  ```bash
  curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.14/hack/quick_install.sh" | bash
  ```
- Install the model registry in the local cluster.

  [Optional] Use a local model registry container image:

  ```bash
  TAG=$(git rev-parse HEAD) && \
  MR_IMG=quay.io/$USER/model-registry:$TAG && \
  make -C ../ IMG_ORG=$USER IMG_VERSION=$TAG image/build && \
  kind load docker-image $MR_IMG
  ```

  or use an existing image from the container registry:

  ```bash
  TAG=... && \
  MR_IMG=kubeflow/model-registry:$TAG
  ```

  then:

  ```bash
  bash ./scripts/install_modelregistry.sh -i $MR_IMG
  ```

  > **Note**: `./scripts/install_modelregistry.sh` will make some changes to `base/kustomization.yaml` that you DON'T need to commit!
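  Before moving on, you can optionally confirm the model registry came up; the install script deploys into the `kubeflow` namespace (the same one the port-forward step below relies on):

  ```bash
  # Wait for the model registry deployment(s) in the kubeflow namespace to become Available
  kubectl wait --for=condition=Available deployment --all -n kubeflow --timeout=300s
  ```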
- [Optional] Use a local CSI container image.

  Either build using the local model-registry library as a dependency:

  ```bash
  IMG=$USER/model-registry-storage-initializer:$(git rev-parse HEAD) && \
  make IMG=$IMG docker-build-dev && \
  kind load docker-image $IMG
  ```

  or build using a fixed version of the model-registry library:

  ```bash
  IMG=$USER/model-registry-storage-initializer:$(git rev-parse HEAD) && \
  make IMG=$IMG docker-build && \
  kind load docker-image $IMG
  ```
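  Optionally, verify the image actually landed in the kind node; `kind-control-plane` is the default node name for a single-node kind cluster:

  ```bash
  # List container images on the kind node and filter for the storage initializer
  docker exec kind-control-plane crictl images | grep model-registry-storage-initializer
  ```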
In this tutorial, you will deploy an InferenceService with a predictor that loads a model indexed in the model registry. The indexed model is a scikit-learn model trained on the iris dataset, which has three output classes: Iris Setosa, Iris Versicolour, and Iris Virginica.

You will then send an inference request to your deployed model in order to get a prediction for the class of iris plant your request corresponds to.

Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the storage location of the model using the `model-registry://` URI format, and it gets some super powers out of the box.
Apply a port-forward to the model registry service in order to be able to interact with it from outside the cluster:

```bash
kubectl port-forward --namespace kubeflow svc/model-registry-service 8080:8080
```
And then (in another terminal):

```bash
export MR_HOSTNAME=localhost:8080
```
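As an optional sanity check, you can query the registry's REST API through the port-forward; this uses the same endpoint the registration calls below POST to:

```bash
# List registered models; a fresh registry returns a listing with no items
curl --silent "$MR_HOSTNAME/api/model_registry/v1alpha3/registered_models"
```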
Then, in the same terminal where you exported `MR_HOSTNAME`, perform the following actions:
- Register an empty `RegisteredModel`:

  ```bash
  curl --silent -X 'POST' \
    "$MR_HOSTNAME/api/model_registry/v1alpha3/registered_models" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "description": "Iris scikit-learn model",
    "name": "iris"
  }'
  ```

  Expected output:

  ```json
  {"createTimeSinceEpoch":"1709287882361","customProperties":{},"description":"Iris scikit-learn model","id":"1","lastUpdateTimeSinceEpoch":"1709287882361","name":"iris"}
  ```
- Register the first `ModelVersion`:

  ```bash
  curl --silent -X 'POST' \
    "$MR_HOSTNAME/api/model_registry/v1alpha3/model_versions" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "description": "Iris model version v1",
    "name": "v1",
    "registeredModelID": "1"
  }'
  ```

  Expected output:

  ```json
  {"createTimeSinceEpoch":"1709287890365","customProperties":{},"description":"Iris model version v1","id":"2","lastUpdateTimeSinceEpoch":"1709287890365","name":"v1"}
  ```
- Register the raw `ModelArtifact`.

  This artifact defines where the actual trained model is stored, i.e., `gs://kfserving-examples/models/sklearn/1.0/model`:

  ```bash
  curl --silent -X 'POST' \
    "$MR_HOSTNAME/api/model_registry/v1alpha3/model_versions/2/artifacts" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "description": "Model artifact for Iris v1",
    "uri": "gs://kfserving-examples/models/sklearn/1.0/model",
    "state": "UNKNOWN",
    "name": "iris-model-v1",
    "modelFormatName": "sklearn",
    "modelFormatVersion": "1",
    "artifactType": "model-artifact"
  }'
  ```

  Expected output:

  ```json
  {"artifactType":"model-artifact","createTimeSinceEpoch":"1709287972637","customProperties":{},"description":"Model artifact for Iris v1","id":"1","lastUpdateTimeSinceEpoch":"1709287972637","modelFormatName":"sklearn","modelFormatVersion":"1","name":"iris-model-v1","state":"UNKNOWN","uri":"gs://kfserving-examples/models/sklearn/1.0/model"}
  ```
> **Note**: Double-check that the returned IDs are the expected ones; the steps above assume the `RegisteredModel` has ID `1` and the `ModelVersion` has ID `2`.
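One way to verify, using the same REST endpoints as above, is to fetch the resources back and inspect their `id` fields:

```bash
# Confirm the RegisteredModel got ID 1 and the ModelVersion got ID 2
curl --silent "$MR_HOSTNAME/api/model_registry/v1alpha3/registered_models/1"
curl --silent "$MR_HOSTNAME/api/model_registry/v1alpha3/model_versions/2"
```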
Retrieve the model registry service and REST port:

```bash
MODEL_REGISTRY_SERVICE=model-registry-service
MODEL_REGISTRY_REST_PORT=$(kubectl get svc/$MODEL_REGISTRY_SERVICE -n kubeflow --output jsonpath='{.spec.ports[0].targetPort}')
```
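Optionally, echo the resolved values to make sure the port lookup succeeded before applying the CR below:

```bash
# Both values are substituted into the ClusterStorageContainer manifest below
echo "service: $MODEL_REGISTRY_SERVICE, REST port: $MODEL_REGISTRY_REST_PORT"
```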
Apply the cluster-scoped `ClusterStorageContainer` CR to configure the model registry storage initializer for `model-registry://` URI formats:
```bash
kubectl apply -f - <<EOF
apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: mr-initializer
spec:
  container:
    name: storage-initializer
    image: $IMG
    env:
    - name: MODEL_REGISTRY_BASE_URL
      value: "$MODEL_REGISTRY_SERVICE.kubeflow.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"
    - name: MODEL_REGISTRY_SCHEME
      value: "http"
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
  - prefix: model-registry://
EOF
```
> **Note**: As `$IMG` you can use either the image created during environment preparation or any other remote image in the container registry.
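To confirm the CR was accepted, you can read it back; the resource name matches the `metadata.name` set above:

```bash
# Inspect the ClusterStorageContainer that maps model-registry:// URIs to the CSI image
kubectl get clusterstoragecontainer mr-initializer -o yaml
```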
- Create a namespace:

  ```bash
  kubectl create namespace kserve-test
  ```
- Create the `InferenceService`:

  ```bash
  kubectl apply -n kserve-test -f - <<EOF
  apiVersion: "serving.kserve.io/v1beta1"
  kind: "InferenceService"
  metadata:
    name: "iris-model"
  spec:
    predictor:
      model:
        modelFormat:
          name: sklearn
        storageUri: "model-registry://iris/v1"
  EOF
  ```
- Check the `InferenceService` status:

  ```bash
  kubectl get inferenceservices iris-model -n kserve-test
  ```
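  If you prefer to block until the predictor is ready instead of polling, one option is:

  ```bash
  # Wait until the InferenceService reports Ready (the first deployment may take a few minutes)
  kubectl wait --for=condition=Ready inferenceservice/iris-model -n kserve-test --timeout=600s
  ```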
- Determine the ingress IP and ports:

  ```bash
  kubectl get svc istio-ingressgateway -n istio-system
  ```

  And then:

  ```bash
  INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
  kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8081:80
  ```

  After that (in another terminal):

  ```bash
  export INGRESS_HOST=localhost
  export INGRESS_PORT=8081
  ```
- Perform the inference request.

  Prepare the input data:

  ```bash
  cat <<EOF > "/tmp/iris-input.json"
  {
    "instances": [
      [6.8, 2.8, 4.8, 1.4],
      [6.0, 3.4, 4.5, 1.6]
    ]
  }
  EOF
  ```

  If you do not have DNS, you can still curl with the ingress gateway external IP using the `Host` header:

  ```bash
  SERVICE_HOSTNAME=$(kubectl get inferenceservice iris-model -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
  curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" \
    "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/iris-model:predict" \
    -d @/tmp/iris-input.json
  ```
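  If everything is wired up correctly, the response contains one predicted class per input instance. Based on the upstream KServe sklearn iris example this model comes from, the output should look similar to the following (class `1` is Iris Versicolour):

  ```json
  {"predictions": [1, 1]}
  ```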