Deploying on Kubernetes


You can leverage your Kubernetes cluster as a substrate for execution of distributed Ray programs. The Ray Autoscaler spins up and deletes Kubernetes Pods according to the resource demands of the Ray workload. Each Ray node runs in its own Kubernetes Pod.

The Ray Kubernetes Operator

Deployments of Ray on Kubernetes are managed by the Ray Kubernetes Operator. The Ray Operator follows the standard Kubernetes Operator pattern. The main players are:

  • A Custom Resource called a RayCluster, which describes the desired state of the Ray cluster.

  • A Custom Controller, the Ray Operator, which processes RayCluster resources and manages the Ray cluster.

Under the hood, the Operator uses the Ray Autoscaler to launch and scale your Ray cluster.
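As a sketch of the pattern, a RayCluster resource declares the desired shape of the cluster and the Operator reconciles toward it. The field names below are illustrative of the idea only, not the exact schema shipped with the Helm chart:

```yaml
# Hypothetical RayCluster manifest; field names are illustrative,
# not the chart's actual schema.
apiVersion: cluster.ray.io/v1
kind: RayCluster
metadata:
  name: example-cluster
spec:
  # Desired state: one head pod plus autoscaled worker pods.
  podTypes:
    - name: ray-head-type
    - name: ray-worker-type
      minWorkers: 2
      maxWorkers: 3
```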

The rest of this document explains how to launch a small example Ray cluster on Kubernetes.

Installing the Ray Operator with Helm

Ray provides a Helm chart to simplify deployment of the Ray Operator and Ray clusters.

The Ray Helm chart is available as part of the Ray GitHub repository. The chart will be published to a public Helm repository as part of a future Ray release.


To run the default example in this document, make sure your Kubernetes cluster can accommodate additional resource requests of 4 CPU and 2.5Gi memory.


You can install a small Ray cluster with a single Helm command. The default cluster configuration consists of a Ray head pod and two worker pods, with scaling allowed up to three workers.

# Navigate to the directory containing the chart
$ cd ray/deploy/charts

# Install a small Ray cluster with the default configuration
# in a new namespace called "ray". Let's name the Helm release "example-cluster."
$ helm -n ray install example-cluster --create-namespace ./ray
NAME: example-cluster
LAST DEPLOYED: Fri May 14 11:44:06 2021
STATUS: deployed

View the installed resources as follows.

# The custom resource representing the state of the Ray cluster.
$ kubectl -n ray get rayclusters
NAME              STATUS    RESTARTS   AGE
example-cluster   Running   0          53s

# The Ray head node and two Ray worker nodes.
$ kubectl -n ray get pods
NAME                                    READY   STATUS    RESTARTS   AGE
example-cluster-ray-head-type-5926k     1/1     Running   0          57s
example-cluster-ray-worker-type-8gbwx   1/1     Running   0          40s
example-cluster-ray-worker-type-l6cvx   1/1     Running   0          40s

# A service exposing the Ray head node.
$ kubectl -n ray get service
NAME                       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                       AGE
example-cluster-ray-head   ClusterIP   <none>        10001/TCP,8265/TCP,8000/TCP   115s

# The operator deployment.
# By default, the deployment is launched in namespace "default".
$ kubectl get deployment ray-operator
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
ray-operator   1/1     1            1           3m1s

# The single pod of the operator deployment.
$ kubectl get pod -l
NAME                            READY   STATUS    RESTARTS   AGE
ray-operator-84f5d57b7f-xkvtm   1/1     Running   0          3m35s

# The Custom Resource Definition defining a RayCluster.
$ kubectl get crd
NAME                         CREATED AT
                             2021-05-14T18:44:02


To view autoscaling logs, run a kubectl logs command on the operator pod:

# The last 100 lines of logs.
$ kubectl logs \
  $(kubectl get pod -l -o) \
  | tail -n 100

The Ray dashboard can be accessed on the Ray head node at port 8265.

# Forward the relevant port from the service exposing the Ray head.
$ kubectl -n ray port-forward service/example-cluster-ray-head 8265:8265

# The dashboard can now be viewed in a browser at http://localhost:8265

Running Ray programs with Ray Jobs Submission

Ray Job Submission can be used to submit Ray programs to your Ray cluster. To do this, you must be able to access the Ray Dashboard, which runs on the Ray head node on port 8265. One way to do this is to use the Kubernetes port-forwarding command to forward from your local machine to port 8265 on the head node.

$ kubectl -n ray port-forward service/example-cluster-ray-head 8265:8265

Then in a new shell, you can run a job using the CLI:

$ export RAY_ADDRESS=""

$ ray job submit --runtime-env-json='{"working_dir": "./", "pip": ["requests==2.26.0"]}' -- python
2021-12-01 23:04:52,672 INFO -- Creating JobSubmissionClient at address:
2021-12-01 23:04:52,809 INFO -- Uploading package gcs://
2021-12-01 23:04:52,810 INFO -- Creating a file package for local directory './'.
2021-12-01 23:04:52,878 INFO -- Job submitted successfully: raysubmit_RXhvSyEPbxhcXtm6.
2021-12-01 23:04:52,878 INFO -- Query the status of the job using: `ray job status raysubmit_RXhvSyEPbxhcXtm6`.

For more ways to run jobs, including a Python SDK and a REST API, see Ray Job Submission.
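The value passed to --runtime-env-json above is ordinary JSON, so it can also be generated programmatically rather than written inline. A minimal sketch using only the standard library:

```python
import json

# Sketch: build the runtime environment dict and serialize it to the
# JSON string that `ray job submit --runtime-env-json` expects.
runtime_env = {
    "working_dir": "./",          # local directory uploaded to the cluster
    "pip": ["requests==2.26.0"],  # packages installed on the Ray nodes
}
runtime_env_json = json.dumps(runtime_env)
print(runtime_env_json)
```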

Running Ray programs with Ray Client

Ray Client can be used to interactively execute Ray programs on your Ray cluster. The Ray Client server runs on the Ray head node, on port 10001.


Connecting with Ray Client requires matching minor versions of Python (for example, 3.7) on the server and client ends, that is, on the Ray head node and in the environment where ray.init("ray://<host>:<port>") is invoked. Note that the default rayproject/ray images use Python 3.7. The latest official Ray release builds are available for Python 3.6 and 3.8 at the Ray Docker Hub.

Connecting with Ray client also requires matching Ray versions. To connect from a local machine to a cluster running the examples in this document, the latest release version of Ray must be installed locally.
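The Python version requirement above amounts to comparing the major.minor components of the two interpreter versions. A small sketch of that check (the helper name is ours for illustration, not a Ray API):

```python
def versions_match(server_python: str, client_python: str) -> bool:
    """Ray Client needs the same major.minor Python version on both ends.

    Illustrative helper, not part of Ray's API.
    """
    return server_python.split(".")[:2] == client_python.split(".")[:2]

print(versions_match("3.7.7", "3.7.4"))   # both Python 3.7
print(versions_match("3.8.1", "3.7.4"))   # 3.8 vs 3.7: mismatch
```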

Using Ray Client to connect from outside the Kubernetes cluster

One way to connect to the Ray cluster from outside your Kubernetes cluster is to forward the Ray Client server port:

$ kubectl -n ray port-forward service/example-cluster-ray-head 10001:10001

Then open a new shell and try out a sample Ray program:

$ python ray/doc/kubernetes/example_scripts/

The program in this example uses ray.init("ray://") to connect to the Ray cluster. The program waits for three Ray nodes to connect and then tests object transfer between the nodes.
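The "wait for three Ray nodes" step can be sketched as a simple poll loop. The helper below is illustrative, not the example script itself; on a real cluster, get_node_count would be something like lambda: len(ray.nodes()):

```python
import time

def wait_for_nodes(get_node_count, expected=3, timeout=60.0, poll=0.5):
    """Poll until the cluster reports `expected` connected nodes.

    Illustrative sketch: on a real cluster, `get_node_count` would be
    `lambda: len(ray.nodes())`.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_node_count() >= expected:
            return True
        time.sleep(poll)
    return False

# Simulate a cluster whose node count grows as workers join.
counts = iter([1, 2, 3])
print(wait_for_nodes(lambda: next(counts), expected=3, poll=0.0))  # True
```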

Using Ray Client to connect from within the Kubernetes cluster

You can also connect to your Ray cluster from another pod in the same Kubernetes cluster.

For example, you can submit a Ray application to run on the Kubernetes cluster as a Kubernetes Job. The Job runs a single pod that executes the Ray driver program to completion, then terminates the pod while preserving access to its logs.

The following command submits a Job which executes an example Ray program.

$ kubectl -n ray create -f
job.batch/ray-test-job created

The program executed by the job uses the name of the Ray cluster’s head Service to connect: ray.init("ray://example-cluster-ray-head:10001"). The program waits for three Ray nodes to connect and then tests object transfer between the nodes.

To view the output of the Job, first find the name of the pod that ran it, then fetch its logs:

$ kubectl -n ray get pods
NAME                                    READY   STATUS    RESTARTS   AGE
example-cluster-ray-head-type-5926k     1/1     Running   0          21m
example-cluster-ray-worker-type-8gbwx   1/1     Running   0          21m
example-cluster-ray-worker-type-l6cvx   1/1     Running   0          21m
ray-test-job-dl9fv                      1/1     Running   0          3s

# Fetch the logs. You should see repeated output for 10 iterations and then
# 'Success!'
$ kubectl -n ray logs ray-test-job-dl9fv

# Cleanup
$ kubectl -n ray delete job ray-test-job
job.batch "ray-test-job" deleted


Code dependencies for a given Ray task or actor must be installed on each Ray node that might run it. Typically, this means that all Ray nodes need the same dependencies installed. To achieve this, you can build a custom container image, using one of the official Ray images as the base. Alternatively, try out the experimental Runtime Environments API (latest Ray release version recommended).
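A custom image of this kind might be sketched as follows; the base tag and the pinned package are illustrative, not prescribed by Ray:

```dockerfile
# Hypothetical custom image: start from an official Ray base image and
# add the application's dependencies so every Ray node has them.
FROM rayproject/ray:latest
RUN pip install --no-cache-dir requests==2.26.0
```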


To remove a Ray Helm release and the associated API resources, use kubectl delete and helm uninstall. Note the order of the commands below.

# First, delete the RayCluster custom resource.
$ kubectl -n ray delete raycluster example-cluster
"example-cluster" deleted

# Delete the Ray release.
$ helm -n ray uninstall example-cluster
release "example-cluster" uninstalled

# Optionally, delete the namespace created for our Ray release.
$ kubectl delete namespace ray
namespace "ray" deleted

Note that helm uninstall does not delete the RayCluster CRD. If you wish to delete the CRD, make sure all Ray Helm releases have been uninstalled, then run kubectl delete crd.

Questions or Issues?

You can post questions, issues, or feedback through the following channels:

  1. Discussion Board: For questions about Ray usage or feature requests.

  2. GitHub Issues: For bug reports.

  3. Ray Slack: For getting in touch with Ray maintainers.

  4. StackOverflow: Use the [ray] tag for questions about Ray.