Inference of MNIST using MXNet on Amazon EKS

This document explains how to perform inference with an MNIST model using Apache MXNet Model Server (MMS) on Amazon EKS. MMS is a flexible and easy-to-use tool for serving deep learning models trained with MXNet.

Prerequisite

Create an EKS cluster with GPU nodes.
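
If you do not already have one, a GPU-backed cluster can be created with eksctl. The sketch below is illustrative only: the cluster name, region, instance type, and node count are placeholders to adjust for your account, and GPU worker nodes also need the NVIDIA device plugin deployed in the cluster before GPU pods can be scheduled.

    # Example only; replace the name, region, instance type, and node count as needed.
    eksctl create cluster \
      --name mnist-inference \
      --region us-west-2 \
      --node-type p3.2xlarge \
      --nodes 1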

Run inference using EKS

To run MNIST inference on EKS, we need a Docker image and a Kubernetes manifest that creates an inference service backed by a deployment.

  1. You can either build a Docker image from samples/mnist/inference/mxnet/Dockerfile or use the existing image rgaut/deeplearning-mxnet:inference.

    The MXNet model is bundled with the Docker image.
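
    If you build the image yourself, the commands would look roughly like the following sketch. The registry/tag is a placeholder and the build context is assumed to be the repository root; adjust both as needed, and update the image referenced in samples/mnist/inference/mxnet/mxnet_eks.yaml to point at your image.

    # <registry> is a placeholder for your own container registry.
    docker build -t <registry>/deeplearning-mxnet:inference -f samples/mnist/inference/mxnet/Dockerfile .
    docker push <registry>/deeplearning-mxnet:inference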

  2. Create deployment and service for inference:

    kubectl create -f samples/mnist/inference/mxnet/mxnet_eks.yaml
    

    Wait for the pod to reach the Running state:

    kubectl get pods --selector=app=mnist-service -w
    NAME                             READY   STATUS              RESTARTS   AGE
    mnist-service-7df4759f74-xhj5x   0/1     ContainerCreating   0          29s
    mnist-service-7df4759f74-xhj5x   1/1     Running             0          46s
    
  3. The service is exposed as a ClusterIP. Use port forwarding so that it can be accessed locally:

    kubectl port-forward \
       `kubectl get pods --selector=app=mnist-service -o jsonpath='{.items[0].metadata.name}'` \
       8080:8080 &
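
    Optionally, confirm that the forwarded port is reachable. MMS serves a /ping health-check endpoint on the inference port, so a healthy server should report a healthy status:

    # Health check against the port-forwarded inference endpoint.
    curl localhost:8080/ping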
    
  4. Run the inference:

    curl -X POST localhost:8080/predictions/mnist -T samples/mnist/inference/mxnet/utils/9.png
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
    100  8042  100    56  100  7986   3105   432k --:--:-- --:--:-- --:--:--  458k
    Prediction is [9] with probability of 92.52161979675293%
    

    Run another inference:

    curl -X POST localhost:8080/predictions/mnist -T samples/mnist/inference/mxnet/utils/7.jpg
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   608  100    52  100   556    568   6081 --:--:-- --:--:-- --:--:--  6109
    Prediction is [7] with probability of 99.9999761581%
    

Run inference using MXNet Model Server locally

Install MXNet Model Server

  1. Install Java:

    brew tap caskroom/versions
    brew update
    brew cask install java8
    
  2. Set up a virtual environment:

    pip install virtualenv --user
    export PATH=~/Library/Python/2.7/bin:$PATH
    # create a Python2.7 virtual environment
    virtualenv -p /usr/bin/python /tmp/pyenv2
    # Enter this virtual environment
    source /tmp/pyenv2/bin/activate
    

    The location of the virtualenv binary may be different on your machine. It can be found using the pip show virtualenv command.
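
    For example:

    # Shows where pip installed the virtualenv package.
    pip show virtualenv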

  3. Install MXNet (MKL build) for CPU inference:

    pip install mxnet-mkl
    
  4. Install MXNet Model Server:

    pip install mxnet-model-server
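
    To confirm the installation, check that the mxnet-model-server CLI is available; printing its help output is a simple sanity check:

    mxnet-model-server --help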
    

Prepare model archive

A model archive is an artifact that MMS can consume natively. It is packaged from the trained model artifacts. A copy of this archive is available at samples/mnist/inference/archived_model/mnist_cnn.mar.
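
A .mar file is essentially a zip archive, so as an optional sanity check you can list the contents of the pre-generated archive (typically the symbol and params files, the handler code, and a manifest):

    unzip -l samples/mnist/inference/archived_model/mnist_cnn.mar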

Skip the rest of this section if you are using the pre-generated archive. The following steps explain how to generate an MMS archive from the artifacts produced by model training.

  1. Two artifacts were generated at the end of training: a symbol file (mnist_cnn-symbol.json) and a params file (mnist_cnn-0000.params). These artifacts are provided in the saved_model directory. Copy them to the /tmp/models directory:

    mkdir /tmp/models
    cp samples/mnist/training/mxnet/saved_model/mnist_cnn-* /tmp/models
    
  2. The model-archiver tool is installed as part of the MMS installation. It can also be installed manually:

    pip install model-archiver
    
  3. Create a model-store directory under /tmp:

    mkdir /tmp/model-store
    
  4. Copy samples/mnist/inference/mxnet/mnist_cnn_inference.py to the /tmp/models directory:

    cp samples/mnist/inference/mxnet/mnist_cnn_inference.py /tmp/models
    
  5. Generate model archive:

    model-archiver \
    	--model-name mnist_cnn \
    	--model-path /tmp/models \
    	--export-path /tmp/model-store \
    	--handler mnist_cnn_inference:handle -f
    

    This command creates a model archive called mnist_cnn.mar under /tmp/model-store.
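
    You can verify that the archive was created:

    ls /tmp/model-store
    # mnist_cnn.mar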

Run inference

  1. Update ~/.keras/keras.json so that it looks like:

    {
        "epsilon": 1e-07, 
        "floatx": "float32", 
        "image_data_format": "channels_last", 
        "backend": "mxnet"
    }
    

    This ensures that the backend is mxnet and image_data_format is channels_last.

  2. Run MXNet Model Server:

    mxnet-model-server \
    --start \
    --model-store samples/mnist/inference/mxnet/archived_model \
    --models mnist=mnist_cnn.mar
    

    The above command creates an endpoint called mnist.

    If you generated your own archive at /tmp/model-store, make sure to specify that directory as the --model-store parameter.
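
    MMS also serves a management API, by default on port 8081, which can be used to confirm that the mnist model has been registered:

    # List the models registered with the server (management API).
    curl localhost:8081/models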

  3. In a new terminal, run the inference:

    curl -X POST localhost:8080/predictions/mnist -T samples/mnist/inference/mxnet/utils/9.png
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
    100  8042  100    56  100  7986   3105   432k --:--:-- --:--:-- --:--:--  458k
    Prediction is [9] with probability of 92.52161979675293%
    

    Run another inference:

    curl -X POST localhost:8080/predictions/mnist -T samples/mnist/inference/mxnet/utils/7.jpg
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   608  100    52  100   556    568   6081 --:--:-- --:--:-- --:--:--  6109
    Prediction is [7] with probability of 99.9999761581%
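
When you are done, the model server started above can be stopped:

    mxnet-model-server --stop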