RayService Quickstart

Prerequisites

This guide mainly focuses on the behavior of KubeRay v1.3.0 and Ray 2.41.0.

What’s a RayService?

A RayService manages these components:

  • RayCluster: Manages resources in a Kubernetes cluster.

  • Ray Serve Applications: Manages users’ applications.

What does the RayService provide?

  • Kubernetes-native support for Ray clusters and Ray Serve applications: After using a Kubernetes configuration to define a Ray cluster and its Ray Serve applications, you can use kubectl to create the cluster and its applications.

  • In-place updating for Ray Serve applications: See RayService for more details.

  • Zero downtime upgrading for Ray clusters: See RayService for more details.

  • Highly available services: See RayService high availability for more details.

Example: Serve two simple Ray Serve applications using RayService

Step 1: Create a Kubernetes cluster with Kind

kind create cluster --image=kindest/node:v1.26.0

Step 2: Install the KubeRay operator

Follow this document to install the latest stable KubeRay operator from the Helm repository. Note that the YAML file in this example uses serveConfigV2 to specify a multi-application Serve configuration, available starting from KubeRay v0.6.0.
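For reference, a Helm-based installation usually comes down to three commands. The sketch below wraps them in a shell function. It assumes Helm 3 is installed and that the current kubeconfig points at the Kind cluster from Step 1. It pins the chart to 1.3.0 to match the KubeRay version this guide targets, and it uses the release name `kuberay-operator`, which matches the service name listed in Step 4. The function name `install_kuberay_operator` is illustrative and not part of KubeRay.

```shell
# Hypothetical helper: installs the KubeRay operator from the Helm repository.
# Assumes Helm 3 and a kubeconfig that points at the target cluster.
install_kuberay_operator() {
  version="${1:-1.3.0}"  # chart version; defaults to the release this guide targets
  helm repo add kuberay https://ray-project.github.io/kuberay-helm/ &&
    helm repo update &&
    helm install kuberay-operator kuberay/kuberay-operator --version "$version"
}

# Usage (commented out so that sourcing this snippet has no side effects):
# install_kuberay_operator 1.3.0
```

Chaining the commands with `&&` stops the installation at the first failing step instead of installing from a stale repository index.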

Step 3: Install a RayService

kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-service.sample.yaml

Step 4: Verify the Kubernetes cluster status

# List all RayService custom resources in the `default` namespace.
kubectl get rayservice
NAME                SERVICE STATUS   NUM SERVE ENDPOINTS
rayservice-sample   Running          2
# List all RayCluster custom resources in the `default` namespace.
kubectl get raycluster
NAME                                 DESIRED WORKERS   AVAILABLE WORKERS   CPUS    MEMORY   GPUS   STATUS   AGE
rayservice-sample-raycluster-czjtm   1                 1                   2500m   4Gi      0      ready    4m21s
# List all Ray Pods in the `default` namespace.
kubectl get pods -l=ray.io/is-ray-node=yes
NAME                                                          READY   STATUS    RESTARTS   AGE
rayservice-sample-raycluster-czjtm-head-ldxl7                 1/1     Running   0          4m21s
rayservice-sample-raycluster-czjtm-small-group-worker-pk88k   1/1     Running   0          4m21s
# Check the `Ready` condition of the RayService.
# The RayService is ready to serve requests when the condition is `True`.
# You can also run `kubectl describe rayservices.ray.io rayservice-sample` and check the conditions section.
kubectl get rayservice rayservice-sample -o json | jq -r '.status.conditions[] | select(.type=="Ready") | to_entries[] | "\(.key): \(.value)"'
lastTransitionTime: 2025-04-11T16:17:01Z
message: Number of serve endpoints is greater than 0
observedGeneration: 1
reason: NonZeroServeEndpoints
status: True
type: Ready
# List services in the `default` namespace.
kubectl get services -o json | jq -r '.items[].metadata.name'
kuberay-operator
kubernetes
rayservice-sample-head-svc
rayservice-sample-raycluster-czjtm-head-svc
rayservice-sample-serve-svc

When the Ray Serve applications are healthy and ready, KubeRay creates a head service and a Ray Serve service for the RayService custom resource. In this example, they are rayservice-sample-head-svc and rayservice-sample-serve-svc.
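Because the RayService can take a few minutes to become ready, scripts often poll the `Ready` condition rather than check it once. The following is a minimal sketch of such a loop. The function name `wait_for_rayservice_ready`, the 5-second polling interval, and the 300-second default timeout are assumptions chosen for illustration. The JSONPath filter reads the same `Ready` condition that the `jq` command above prints.

```shell
# Hypothetical helper: poll a RayService until its Ready condition is True.
# Arguments: name (default rayservice-sample), timeout in seconds (default 300).
wait_for_rayservice_ready() {
  name="${1:-rayservice-sample}"
  timeout="${2:-300}"
  interval=5
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # The JSONPath filter selects the status of the condition whose type is Ready.
    status="$(kubectl get rayservice "$name" \
      -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)"
    if [ "$status" = "True" ]; then
      echo "RayService $name is ready"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "Timed out after ${timeout}s waiting for RayService $name" >&2
  return 1
}

# Usage (commented out so that sourcing this snippet has no side effects):
# wait_for_rayservice_ready rayservice-sample 600
```

Depending on the kubectl version, `kubectl wait --for=condition=Ready rayservice/rayservice-sample --timeout=300s` may be a shorter alternative to the explicit loop.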

Step 5: Verify the status of the Serve applications

# (1) Forward the dashboard port to localhost.
# (2) Check the Serve page in the Ray dashboard at http://localhost:8265/#/serve.
kubectl port-forward svc/rayservice-sample-head-svc 8265:8265 > /dev/null &
  • Refer to rayservice-troubleshooting.md for more details on RayService observability.

[Figure: Ray Serve Dashboard, a screenshot of the Serve page in the Ray dashboard]
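While the port-forward from this step is running, the application status can also be read from the command line. The sketch below assumes that the Ray dashboard serves the Serve REST API at `/api/serve/applications/` on port 8265 and that the response contains an `applications` map whose entries carry a `status` field. Verify both assumptions against the Ray version in use. The helper name `serve_app_statuses` is hypothetical.

```shell
# Hypothetical helper: print "<application name>: <status>" for each Serve application.
# Requires curl and jq on the local machine and an active port-forward to port 8265.
serve_app_statuses() {
  dashboard_url="${1:-http://localhost:8265}"
  curl -sS "${dashboard_url}/api/serve/applications/" |
    jq -r '.applications | to_entries[] | "\(.key): \(.value.status)"'
}

# Usage (commented out so that sourcing this snippet has no side effects):
# serve_app_statuses
```

The output has one line per application, which makes it easy to feed into `grep` or a readiness check in a script.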

Step 6: Send requests to the Serve applications through the Kubernetes serve service

# Step 6.1: Run a curl Pod.
# If you already have a curl Pod, you can use `kubectl exec -it <curl-pod> -- sh` to access the Pod.
kubectl run curl --image=radial/busyboxplus:curl --command -- tail -f /dev/null
# Step 6.2: Send a request to the calculator app.
kubectl exec curl -- curl -sS -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/calc/ -d '["MUL", 3]'
15 pizzas please!
# Step 6.3: Send a request to the fruit stand app.
kubectl exec curl -- curl -sS -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/fruit/ -d '["MANGO", 2]'
6
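The two requests above can be turned into a repeatable smoke test. The following is a minimal sketch that assumes the curl Pod from Step 6.1 is running and that the response bodies match the outputs shown above exactly. The helper name `check_serve_app` is hypothetical.

```shell
# Hypothetical helper: POST a JSON payload to a Serve route through the curl Pod
# and compare the response body with the expected text.
check_serve_app() {
  route="$1"     # route prefix, for example /calc/
  payload="$2"   # JSON request body
  expected="$3"  # expected response body
  actual="$(kubectl exec curl -- curl -sS -X POST \
    -H 'Content-Type: application/json' \
    "rayservice-sample-serve-svc:8000${route}" -d "$payload")"
  if [ "$actual" = "$expected" ]; then
    echo "OK   ${route}"
  else
    echo "FAIL ${route}: expected '${expected}', got '${actual}'" >&2
    return 1
  fi
}

# Usage (commented out so that sourcing this snippet has no side effects):
# check_serve_app /calc/  '["MUL", 3]'   '15 pizzas please!'
# check_serve_app /fruit/ '["MANGO", 2]' '6'
```

Returning a non-zero status on a mismatch lets a CI job or a shell script running with `set -e` fail as soon as one application responds unexpectedly.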

Step 7: Clean up the Kubernetes cluster

# Kill the `kubectl port-forward` background job in the earlier step
killall kubectl
kind delete cluster
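Note that `killall kubectl` stops every kubectl process on the machine, not only the port-forward started in Step 5. A narrower alternative is to record the process ID of the background job when it starts and kill only that process. The sketch below shows one way to do this. The function names and the `PORT_FORWARD_PID` variable are hypothetical, and both functions must run in the same shell session.

```shell
# Hypothetical helpers: start the dashboard port-forward in the background,
# remember its process ID, and later stop only that process.
start_port_forward() {
  kubectl port-forward svc/rayservice-sample-head-svc 8265:8265 > /dev/null 2>&1 &
  PORT_FORWARD_PID=$!
}

stop_port_forward() {
  if [ -n "${PORT_FORWARD_PID:-}" ]; then
    kill "$PORT_FORWARD_PID" 2>/dev/null || true
    # Reap the background job so that it does not linger as a zombie process.
    wait "$PORT_FORWARD_PID" 2>/dev/null || true
    PORT_FORWARD_PID=""
  fi
}

# Usage (commented out so that sourcing this snippet has no side effects):
# start_port_forward
# ... open http://localhost:8265/#/serve ...
# stop_port_forward
```

Calling `stop_port_forward` before `kind delete cluster` leaves any other kubectl sessions untouched.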

Next steps

  • See the RayService document for the full list of RayService features, including in-place updates, zero-downtime upgrades, and high availability.

  • See the RayService troubleshooting guide if you encounter any issues.

  • See Examples for more RayService examples. The MobileNet example is a good place to start because it doesn’t require GPUs and is easy to run on a local machine.