RayService Quickstart#
Prerequisites#
This guide covers only the Ray Serve multi-application API, which is available starting from Ray 2.4.0. It mainly describes the behavior of KubeRay v1.1.1 and Ray 2.9.0.
Ray 2.4.0 or newer.
KubeRay 0.6.0, KubeRay nightly, or newer.
What’s a RayService?#
A RayService manages these components:
RayCluster: Manages resources in a Kubernetes cluster.
Ray Serve Applications: Manages users’ applications.
What does the RayService provide?#
Kubernetes-native support for Ray clusters and Ray Serve applications: After using a Kubernetes configuration to define a Ray cluster and its Ray Serve applications, you can use kubectl to create the cluster and its applications.
In-place updating for Ray Serve applications: See RayService for more details.
Zero downtime upgrading for Ray clusters: See RayService for more details.
High-availability services: See RayService high availability for more details.
Example: Serve two simple Ray Serve applications using RayService#
Step 1: Create a Kubernetes cluster with Kind#
kind create cluster --image=kindest/node:v1.26.0
Step 2: Install the KubeRay operator#
Follow this document to install the latest stable KubeRay operator from the Helm repository.
Note that the YAML file in this example uses serveConfigV2 to specify a multi-application Serve configuration, available starting from KubeRay v0.6.0.
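To make the shape of such a resource more concrete, the sketch below writes a trimmed-down RayService manifest to a local file. It is not the sample file applied in Step 3 and is not meant to be applied as-is. The file name rayservice-sketch.yaml, the resource name, the application names, the import paths, and the route prefixes are hypothetical placeholders, and apiVersion ray.io/v1 assumes a KubeRay v1.x operator. The sketch only shows where serveConfigV2 (the Ray Serve applications) and rayClusterConfig (the RayCluster) sit in the resource.

```shell
# Write an illustrative RayService skeleton to a local file.
# Nothing is sent to the Kubernetes cluster.
# Hypothetical placeholders: the file name, metadata.name, both application
# names, both import_path values, and both route prefixes.
# Assumption: apiVersion ray.io/v1 (KubeRay v1.x).
cat > rayservice-sketch.yaml <<'EOF'
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: rayservice-sketch
spec:
  serveConfigV2: |
    applications:
      - name: app_one
        import_path: my_module.app_one
        route_prefix: /one
      - name: app_two
        import_path: my_module.app_two
        route_prefix: /two
  rayClusterConfig:
    rayVersion: '2.9.0'
    headGroupSpec:
      rayStartParams:
        dashboard-host: '0.0.0.0'
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.9.0
EOF

# Print the two fields that correspond to the components a RayService manages.
grep -nE 'serveConfigV2|rayClusterConfig' rayservice-sketch.yaml
```

Each entry under applications describes one Serve application with its own route prefix, which is what lets a single RayService expose several applications at different paths.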
Step 3: Install a RayService#
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.2.2/ray-operator/config/samples/ray-service.sample.yaml
Step 4: Verify the Kubernetes cluster status#
# Step 4.1: List all RayService custom resources in the `default` namespace.
kubectl get rayservice
# [Example output]
# NAME AGE
# rayservice-sample 2m42s
# Step 4.2: List all RayCluster custom resources in the `default` namespace.
kubectl get raycluster
# [Example output]
# NAME DESIRED WORKERS AVAILABLE WORKERS STATUS AGE
# rayservice-sample-raycluster-6mj28 1 1 ready 2m27s
# Step 4.3: List all Ray Pods in the `default` namespace.
kubectl get pods -l=ray.io/is-ray-node=yes
# [Example output]
# rayservice-sample-raycluster-6mj28-worker-small-group-kg4v5 1/1 Running 0 3m52s
# rayservice-sample-raycluster-6mj28-head-x77h4 1/1 Running 0 3m52s
# Step 4.4: List services in the `default` namespace.
kubectl get services
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# ...
# rayservice-sample-head-svc ClusterIP 10.96.34.90 <none> 10001/TCP,8265/TCP,52365/TCP,6379/TCP,8080/TCP,8000/TCP 4m58s
# rayservice-sample-raycluster-6mj28-head-svc ClusterIP 10.96.171.184 <none> 10001/TCP,8265/TCP,52365/TCP,6379/TCP,8080/TCP,8000/TCP 6m21s
# rayservice-sample-serve-svc ClusterIP 10.96.161.84 <none> 8000/TCP 4m58s
When the Ray Serve applications are healthy and ready, KubeRay creates a head service and a Ray Serve service for the RayService custom resource, for example rayservice-sample-head-svc and rayservice-sample-serve-svc in Step 4.4.
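Because these services appear only once the applications are ready, a script may need to wait for them before sending traffic. The following is a minimal sketch of one way to do that by polling kubectl get service. The function name wait_for_service, the 5-second polling interval, and the default 300-second timeout are assumptions made for illustration and are not part of kubectl or KubeRay.

```shell
# Poll until a Kubernetes Service exists or a timeout (in seconds) expires.
# wait_for_service is a hypothetical helper, not part of kubectl or KubeRay.
wait_for_service() {
  svc_name=$1
  timeout_s=${2:-300}   # assumed default timeout
  waited=0
  # `kubectl get service NAME` exits non-zero while the Service is missing.
  until kubectl get service "$svc_name" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "timed out waiting for service $svc_name" >&2
      return 1
    fi
    sleep 5             # assumed polling interval
    waited=$((waited + 5))
  done
  echo "service $svc_name exists"
}
```

For example, wait_for_service rayservice-sample-serve-svc 300 would block until the serve service from Step 4.4 exists, or return a non-zero status after roughly five minutes.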
Step 5: Verify the status of the Serve applications#
# (1) Forward the dashboard port to localhost.
# (2) Check the Serve page in the Ray dashboard at http://localhost:8265/#/serve.
kubectl port-forward svc/rayservice-sample-head-svc 8265:8265
Refer to rayservice-troubleshooting.md for more details on RayService observability. Below is a screenshot example of the Serve page in the Ray dashboard.
Step 6: Send requests to the Serve applications through the Kubernetes serve service#
# Step 6.1: Run a curl Pod.
# If you already have a curl Pod, you can use `kubectl exec -it <curl-pod> -- sh` to access the Pod.
kubectl run curl --image=radial/busyboxplus:curl -i --tty
# Step 6.2: Send a request to the fruit stand app.
curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/fruit/ -d '["MANGO", 2]'
# [Expected output]: 6
# Step 6.3: Send a request to the calculator app.
curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/calc/ -d '["MUL", 3]'
# [Expected output]: "15 pizzas please!"
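The two requests in Steps 6.2 and 6.3 differ only in the route and the JSON body, so they can be wrapped in a small helper when sending several requests from the curl Pod. This is a sketch rather than part of the sample: serve_post and the SERVE_ADDR variable are hypothetical names, and the default address assumes the serve service name and port shown in Step 4.4.

```shell
# serve_post is a hypothetical helper wrapping the curl call from Steps 6.2 and 6.3.
# SERVE_ADDR is a hypothetical override; it defaults to the in-cluster serve
# service name and port used in this guide.
serve_post() {
  route=$1      # route prefix of the application, e.g. /fruit/
  payload=$2    # JSON request body, e.g. '["MANGO", 2]'
  curl -s -X POST -H 'Content-Type: application/json' \
    "${SERVE_ADDR:-rayservice-sample-serve-svc:8000}${route}" -d "$payload"
}
```

Inside the curl Pod, serve_post /fruit/ '["MANGO", 2]' sends the same request as Step 6.2, and serve_post /calc/ '["MUL", 3]' sends the same request as Step 6.3.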
Step 7: Clean up the Kubernetes cluster#
# Delete the RayService.
kubectl delete -f https://raw.githubusercontent.com/ray-project/kuberay/v1.2.2/ray-operator/config/samples/ray-service.sample.yaml
# Uninstall the KubeRay operator.
helm uninstall kuberay-operator
# Delete the curl Pod.
kubectl delete pod curl
Next steps#
See the RayService document for the full list of RayService features, including in-place update, zero downtime upgrade, and high availability.
See the RayService troubleshooting guide if you encounter any issues.
See Examples for more RayService examples. The MobileNet example is a good example to start with because it does not require GPUs and is easy to run on a local machine.