RayService Quickstart#
Prerequisites#
This guide mainly focuses on the behavior of KubeRay v1.3.0 and Ray 2.41.0.
What’s a RayService?#
A RayService manages these components:
RayCluster: Manages resources in a Kubernetes cluster.
Ray Serve Applications: Manages users’ applications.
What does the RayService provide?#
Kubernetes-native support for Ray clusters and Ray Serve applications: After using a Kubernetes configuration to define a Ray cluster and its Ray Serve applications, you can use
kubectl
to create the cluster and its applications.In-place updating for Ray Serve applications: See RayService for more details.
Zero downtime upgrading for Ray clusters: See RayService for more details.
High-availabilable services: See RayService high availability for more details.
Example: Serve two simple Ray Serve applications using RayService#
Step 1: Create a Kubernetes cluster with Kind#
kind create cluster --image=kindest/node:v1.26.0
Step 2: Install the KubeRay operator#
Follow this document to install the latest stable KubeRay operator from the Helm repository.
Note that the YAML file in this example uses serveConfigV2
to specify a multi-application Serve configuration, available starting from KubeRay v0.6.0.
Step 3: Install a RayService#
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-service.sample.yaml
Step 4: Verify the Kubernetes cluster status#
# List all RayService custom resources in the `default` namespace.
kubectl get rayservice
NAME SERVICE STATUS NUM SERVE ENDPOINTS
rayservice-sample Running 2
# List all RayCluster custom resources in the `default` namespace.
kubectl get raycluster
NAME DESIRED WORKERS AVAILABLE WORKERS CPUS MEMORY GPUS STATUS AGE
rayservice-sample-raycluster-czjtm 1 1 2500m 4Gi 0 ready 4m21s
# List all Ray Pods in the `default` namespace.
kubectl get pods -l=ray.io/is-ray-node=yes
NAME READY STATUS RESTARTS AGE
rayservice-sample-raycluster-czjtm-head-ldxl7 1/1 Running 0 4m21s
rayservice-sample-raycluster-czjtm-small-group-worker-pk88k 1/1 Running 0 4m21s
# Check the `Ready` condition of the RayService.
# The RayService is ready to serve requests when the condition is `True`.
# Users can also use `kubectl describe rayservices.ray.io rayservice-sample` to check the condition section
kubectl get rayservice rayservice-sample -o json | jq -r '.status.conditions[] | select(.type=="Ready") | to_entries[] | "\(.key): \(.value)"'
lastTransitionTime: 2025-04-11T16:17:01Z
message: Number of serve endpoints is greater than 0
observedGeneration: 1
reason: NonZeroServeEndpoints
status: True
type: Ready
# List services in the `default` namespace.
kubectl get services -o json | jq -r '.items[].metadata.name'
kuberay-operator
kubernetes
rayservice-sample-head-svc
rayservice-sample-raycluster-czjtm-head-svc
rayservice-sample-serve-svc
When the Ray Serve applications are healthy and ready, KubeRay creates a head service and a Ray Serve service for the RayService custom resource. For example, rayservice-sample-head-svc
and rayservice-sample-serve-svc
.
Step 5: Verify the status of the Serve applications#
# (1) Forward the dashboard port to localhost.
# (2) Check the Serve page in the Ray dashboard at http://localhost:8265/#/serve.
kubectl port-forward svc/rayservice-sample-head-svc 8265:8265 > /dev/null &
Refer to rayservice-troubleshooting.md for more details on RayService observability. Below is a screenshot example of the Serve page in the Ray dashboard.
Step 6: Send requests to the Serve applications by the Kubernetes serve service#
# Step 6.1: Run a curl Pod.
# If you already have a curl Pod, you can use `kubectl exec -it <curl-pod> -- sh` to access the Pod.
kubectl run curl --image=radial/busyboxplus:curl --command -- tail -f /dev/null
# Step 6.3: Send a request to the calculator app.
kubectl exec curl -- curl -sS -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/calc/ -d '["MUL", 3]'
15 pizzas please!
# Step 6.2: Send a request to the fruit stand app.
kubectl exec curl -- curl -sS -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc:8000/fruit/ -d '["MANGO", 2]'
6
Step 7: Clean up the Kubernetes cluster#
# Kill the `kubectl port-forward` background job in the earlier step
killall kubectl
kind delete cluster
Next steps#
See RayService document for the full list of RayService features, including in-place update, zero downtime upgrade, and high-availability.
See RayService troubleshooting guide if you encounter any issues.
See Examples for more RayService examples. The MobileNet example is a good example to start with because it doesn’t require GPUs and is easy to run on a local machine.