blublinsky/ray-serve

Caveats of using KubeRay cluster for Ray Serve

Here we describe the main caveats of using KubeRay to deploy and run Ray Serve. It is mostly based on this documentation.

Install KubeRay operator

Install the KubeRay operator following the documentation:

kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.6.0&timeout=90s"

Configuration of the cluster itself

Unfortunately, using Ray Serve requires some specific cluster configuration. An example of such a configuration is here. The most important Serve-specific settings there are:

  • Line 20 - defines dashboard-agent-listen-port, which determines the port for the Serve management APIs
  • Lines 51-52 - define the ports for the dashboard agent

With this in place, we can use dashboard-agent-listen-port to access the Serve APIs. We can either port-forward or create an additional route to reach it.
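As a minimal sketch of what talking to the Serve APIs through that port can look like, the Python snippet below builds a status request. It assumes the agent port is 52365 and is already forwarded to localhost (the value used by the port-forward commands in this README), and that the cluster's Ray version exposes the multi-application endpoint /api/serve/applications/ (the path used by the curl example in this README). The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the dashboard agent port is forwarded to localhost:52365.
SERVE_API_BASE = "http://localhost:52365"


def build_status_request(base_url: str = SERVE_API_BASE) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Serve applications endpoint."""
    url = base_url.rstrip("/") + "/api/serve/applications/"
    return urllib.request.Request(url, method="GET")


def fetch_serve_status(base_url: str = SERVE_API_BASE) -> dict:
    """Send the request and decode the JSON reply; requires a reachable cluster."""
    with urllib.request.urlopen(build_status_request(base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_serve_status() only makes sense while the port-forward (or route) is active; build_status_request() can be inspected offline.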

Implementing Serve code

We have two Serve examples here, hello and fruit, borrowed from the Ray documentation.

Deploying code

Once the code is created, we need to build a Serve config file for each application. For our example the commands are as follows:

serve build hello:graph -o hello.yaml
serve build fruit:deployment_graph -o fruit.yaml

These two commands produce the yaml files here and here. The base yaml files presented here can be further enhanced based on the documentation. The most common overrides are the number of replicas and deployment parameters.
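The replica override mentioned above can be sketched in Python as follows. This is only an illustration: it assumes the generated yaml has already been parsed into a dictionary (for example with a YAML library) and that it contains a top-level deployments list whose entries have a name key. The function name and the example dictionary are hypothetical stand-ins, not the actual contents of fruit.yaml.

```python
import copy


def override_num_replicas(config: dict, deployment: str, num_replicas: int) -> dict:
    """Return a copy of a Serve config dict with num_replicas set for one deployment."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for entry in updated.get("deployments", []):
        if entry.get("name") == deployment:
            entry["num_replicas"] = num_replicas
            return updated
    raise KeyError(f"deployment {deployment!r} not found in config")


# Hypothetical stand-in for the parsed contents of a generated config file.
example = {
    "import_path": "fruit:deployment_graph",
    "deployments": [{"name": "FruitMarket"}, {"name": "MangoStand"}],
}
scaled = override_num_replicas(example, "FruitMarket", 2)
```

Editing the yaml file by hand achieves the same result; a helper like this is mainly useful when the override has to be applied repeatedly after each serve build.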

Deploying to Ray cluster

Once the yaml files are in place, we can use serve deploy to deploy them. serve deploy is a thin wrapper over the HTTP APIs, which can also be used directly. Definitions of the REST APIs can be found here.

For our example we first do port-forward:

kubectl port-forward svc/raycluster-heterogeneous-head-svc 52365 -n max

And then use the following commands:

serve deploy hello.yaml
serve deploy fruit.yaml

The newer REST APIs support multiple Serve applications, which makes it possible to deploy both of our Serve applications.

Once an application is installed, you can also see its configuration in the Ray dashboard.

Accessing applications

Following this, do port-forward:

kubectl port-forward svc/raycluster-heterogeneous-head-svc 8000 -n max

And then use these commands:

curl -H "Content-Type: application/json" -d '["PEAR", 2]' "http://localhost:8000/"
curl "http://localhost:8000/?name=Ray"

Undeploying

The only command is:

serve shutdown

Deploying multiple applications

According to the documentation, Ray now supports deploying multiple independent Serve applications.

To try this, we first need to modify fruit and hello so that they listen on different URLs: fruit_url and hello_url.

Once this is done, the following command generates deployment yaml:

serve build --multi-app fruit_url:graph hello_url:graph -o multi_app.yaml

The auto-generated application names default to app1 and app2, so I changed them in the generated yaml. Finally, we need to add the newly created Python files fruit_url and hello_url to the Dockerfile and rebuild our image.

When this is done and the cluster is restarted, we can deploy our applications as follows:
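The renaming step can also be scripted. The sketch below assumes the generated multi-app config has been parsed into a dictionary with an applications list whose entries carry a name key; the function name and the example dictionary are hypothetical, and the target names fruit and greet are the ones used in the HTTP example in this README.

```python
import copy


def rename_applications(config: dict, new_names: dict) -> dict:
    """Return a copy of a multi-app config with application names replaced."""
    updated = copy.deepcopy(config)  # do not mutate the input
    for app in updated.get("applications", []):
        # Keep the current name when no replacement is given.
        app["name"] = new_names.get(app["name"], app["name"])
    return updated


# Hypothetical stand-in for the parsed multi_app.yaml.
generated = {
    "applications": [
        {"name": "app1", "route_prefix": "/fruit", "import_path": "fruit_url:graph"},
        {"name": "app2", "route_prefix": "/greet", "import_path": "hello_url:graph"},
    ]
}
renamed = rename_applications(generated, {"app1": "fruit", "app2": "greet"})
```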

serve deploy multi_app.yaml

Alternatively we can deploy using HTTP:

curl -X PUT http://localhost:52365/api/serve/applications/ -H 'Content-Type: application/json' -d '{"proxy_location": "EveryNode", "http_options": {"host": "0.0.0.0", "port": 8000},
                  "applications": [{"name": "fruit", "route_prefix": "/fruit", "import_path": "fruit_url:graph",
                                    "runtime_env": {},
                                    "deployments": [{"name": "MangoStand", "user_config": {"price": 3}},
                                                    {"name": "OrangeStand", "user_config": {"price": 2}},
                                                    {"name": "PearStand", "user_config": {"price": 4}},
                                                    {"name": "FruitMarket", "num_replicas": 2},
                                                    {"name": "DAGDriver"}]},
                                    {"name": "greet", "route_prefix": "/greet", "import_path": "hello_url:graph",
                                     "runtime_env": {},
                                     "deployments": [{"name": "Doubler"},
                                                     {"name": "HelloDeployment"},
                                                     {"name": "DAGDriver"}]}]}'
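For scripted deployments, the same PUT can be issued from Python. This is a minimal sketch using only the standard library; it assumes the agent port 52365 is forwarded to localhost as above, and it uses a trimmed-down config (deployment overrides omitted) purely for illustration. The helper name is hypothetical.

```python
import json
import urllib.request


def build_deploy_request(config: dict,
                         base_url: str = "http://localhost:52365") -> urllib.request.Request:
    """Build (but do not send) the PUT request that submits a multi-app Serve config."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/serve/applications/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Trimmed-down illustration of the config sent by the curl command.
config = {
    "proxy_location": "EveryNode",
    "http_options": {"host": "0.0.0.0", "port": 8000},
    "applications": [
        {"name": "fruit", "route_prefix": "/fruit", "import_path": "fruit_url:graph"},
        {"name": "greet", "route_prefix": "/greet", "import_path": "hello_url:graph"},
    ],
}
request = build_deploy_request(config)
# To actually deploy (requires the port-forward to be active):
# urllib.request.urlopen(request, timeout=30)
```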

Once deployment is completed, you can port forward:

kubectl port-forward svc/raycluster-heterogeneous-head-svc 8000 -n max

and run:

curl "http://localhost:8000/greet/?name=Ray"
curl -H "Content-Type: application/json" -d '["PEAR", 2]' "http://localhost:8000/fruit/"
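The two curl calls can be mirrored by a small Python client. The sketch below assumes port 8000 is forwarded to localhost and that the applications are reachable under the /greet and /fruit prefixes used above; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: port 8000 of the head service is forwarded to localhost.
BASE_URL = "http://localhost:8000"


def build_greet_request(name: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """GET request for the greet application, passing the name as a query parameter."""
    query = urllib.parse.urlencode({"name": name})
    return urllib.request.Request(f"{base_url}/greet/?{query}", method="GET")


def build_fruit_request(fruit: str, amount: int,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """POST request for the fruit application with a JSON body like ["PEAR", 2]."""
    body = json.dumps([fruit, amount]).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/fruit/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

With the port-forward active, send(build_greet_request("Ray")) and send(build_fruit_request("PEAR", 2)) correspond to the two curl commands.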

Alternatively, you can use POST. For curl, also note this tip.

Instead of port-forwarding, you can create a route exposing port 8000 and use it for invocation.
