Use kubectl plugin (beta)#

Starting with KubeRay v1.3.0, you can use the kubectl ray plugin to simplify common workflows when deploying Ray on Kubernetes. The plugin is especially helpful if you aren't familiar with Kubernetes, because it lets you manage RayClusters and RayJobs without writing YAML files by hand.

Installation#

See the KubeRay kubectl Plugin documentation for detailed installation instructions.

Install the KubeRay kubectl plugin using one of the following methods:

  • Install using the Krew kubectl plugin manager (recommended; see the example after this list)

  • Download from GitHub releases
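
If you prefer Krew, the plugin is published in the Krew index. Assuming that Krew is already set up and that the plugin name is ray, as in the KubeRay docs, installation looks like this:

kubectl krew update
kubectl krew install ray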

Download from GitHub releases#

Go to the releases page and download the binary for your platform.

For example, to install version 1.3.0 of the plugin on Linux amd64:

curl -LO https://github.com/ray-project/kuberay/releases/download/v1.3.0/kubectl-ray_v1.3.0_linux_amd64.tar.gz
tar -xvf kubectl-ray_v1.3.0_linux_amd64.tar.gz
cp kubectl-ray ~/.local/bin

Replace ~/.local/bin with a directory in your PATH.
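
To verify that kubectl can discover the plugin, list the installed plugins:

kubectl plugin list

The output should include the path to the kubectl-ray binary.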

Usage#

After installing the plugin, you can use kubectl ray --help to see the available commands and options.
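
Each subcommand accepts --help as well. For example, to see the flags for the session subcommand:

kubectl ray session --help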

Examples#

These examples assume that you have installed the KubeRay operator. If not, follow the RayCluster Quickstart to install the latest stable KubeRay operator from its Helm repository.
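
For reference, a typical Helm installation of the operator looks like the following; check the quickstart for the current stable chart version:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator --version 1.3.0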

Example 1: RayCluster Management#

The kubectl ray create cluster command allows you to create a valid RayCluster without an existing YAML file. The default values are as follows:

Parameter        Default
ray version      2.41.0
ray image        rayproject/ray:<ray version>
head CPU         2
head memory      4Gi
worker replicas  1
worker CPU       2
worker memory    4Gi
worker GPU       0

$ kubectl ray create cluster raycluster-sample
Created Ray Cluster: raycluster-sample

You can override the default values with flags. For example, to create a RayCluster with 2 worker replicas:

$ kubectl ray create cluster raycluster-sample-2 --worker-replicas 2
Created Ray Cluster: raycluster-sample-2

By default, the command creates only one worker group. You can use kubectl ray create workergroup to add worker groups to an existing RayCluster.

$ kubectl ray create workergroup example-group --ray-cluster raycluster-sample --worker-memory 5Gi

You can use kubectl ray get cluster and kubectl ray get workergroup to get the status of RayClusters and worker groups.

$ kubectl ray get cluster 
NAME                  NAMESPACE   DESIRED WORKERS   AVAILABLE WORKERS   CPUS   GPUS   TPUS   MEMORY   AGE
raycluster-sample     default     2                 2                   6      0      0      13Gi     3m56s
raycluster-sample-2   default     2                 2                   6      0      0      12Gi     3m51s

$ kubectl ray get workergroup
NAME            REPLICAS   CPUS   GPUS   TPUS   MEMORY   CLUSTER
default-group   1/1        2      0      0      4Gi      raycluster-sample
example-group   1/1        2      0      0      5Gi      raycluster-sample
default-group   2/2        4      0      0      8Gi      raycluster-sample-2

The kubectl ray session command forwards local ports to Ray resources, so you don't need to remember which ports Ray resources expose.

$ kubectl ray session raycluster-sample
Forwarding ports to service raycluster-sample-head-svc
Ray Dashboard: http://localhost:8265
Ray Interactive Client: http://localhost:10001

You can then open http://localhost:8265 in your browser to access the dashboard.

The kubectl ray log command downloads logs from a RayCluster to a local directory.

$ kubectl ray log raycluster-sample
No output directory specified, creating dir under current directory using resource name.
Command set to retrieve both head and worker node logs.
Downloading log for Ray Node raycluster-sample-default-group-worker-b2k7h
Downloading log for Ray Node raycluster-sample-example-group-worker-sfdp7
Downloading log for Ray Node raycluster-sample-head-k5pj8

It creates a folder named raycluster-sample in the current directory containing the logs of the RayCluster.

Use the kubectl ray delete command to clean up the resources.

$ kubectl ray delete raycluster-sample
$ kubectl ray delete raycluster-sample-2

Example 2: RayJob Submission#

kubectl ray job submit is a wrapper around the ray job submit command. It automatically forwards ports to the Ray cluster and submits the job. The command can also provision an ephemeral cluster if you don't provide a RayJob YAML file.

Assume that under the current directory, you have a file named sample_code.py.

import ray
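
# Connect to the existing Ray cluster that the job runs on.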
ray.init(address="auto")

@ray.remote
def f(x):
    return x * x

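# Launch four tasks in parallel and collect the results.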
futures = [f.remote(i) for i in range(4)]
print(ray.get(futures)) # [0, 1, 4, 9]

Submit a Ray job without a YAML file#

You can submit a RayJob without specifying a YAML file. The command generates a RayJob with the following default values:

Parameter        Default
ray version      2.41.0
ray image        rayproject/ray:<ray version>
head CPU         2
head memory      4Gi
worker replicas  1
worker CPU       2
worker memory    4Gi
worker GPU       0

$ kubectl ray job submit --name rayjob-sample --working-dir . -- python sample_code.py
Submitted RayJob rayjob-sample.
Waiting for RayCluster
...
2025-01-06 11:53:34,806	INFO worker.py:1634 -- Connecting to existing Ray cluster at address: 10.12.0.9:6379...
2025-01-06 11:53:34,814	INFO worker.py:1810 -- Connected to Ray cluster. View the dashboard at 10.12.0.9:8265 
[0, 1, 4, 9]
2025-01-06 11:53:38,368	SUCC cli.py:63 -- ------------------------------------------
2025-01-06 11:53:38,368	SUCC cli.py:64 -- Job 'raysubmit_9NfCvwcmcyMNFCvX' succeeded
2025-01-06 11:53:38,368	SUCC cli.py:65 -- ------------------------------------------

You can also provide a specific RayJob YAML file to submit a Ray job.

$ wget https://raw.githubusercontent.com/ray-project/kuberay/refs/heads/master/ray-operator/config/samples/ray-job.interactive-mode.yaml

Note that in the RayJob spec, submissionMode is InteractiveMode.
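
The relevant part of the downloaded spec looks roughly like the following excerpt; other fields are omitted:

apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: rayjob-interactive-mode
spec:
  submissionMode: InteractiveMode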

$ kubectl ray job submit -f ray-job.interactive-mode.yaml --working-dir . -- python sample_code.py
Submitted RayJob rayjob-interactive-mode.
Waiting for RayCluster
...
2025-01-06 12:44:43,542	INFO worker.py:1634 -- Connecting to existing Ray cluster at address: 10.12.0.10:6379...
2025-01-06 12:44:43,551	INFO worker.py:1810 -- Connected to Ray cluster. View the dashboard at 10.12.0.10:8265 
[0, 1, 4, 9]
2025-01-06 12:44:47,830	SUCC cli.py:63 -- ------------------------------------------
2025-01-06 12:44:47,830	SUCC cli.py:64 -- Job 'raysubmit_fuBdjGnecFggejhR' succeeded
2025-01-06 12:44:47,830	SUCC cli.py:65 -- ------------------------------------------

Use the kubectl ray delete command to clean up the resources.

$ kubectl ray delete rayjob/rayjob-sample
$ kubectl ray delete rayjob/rayjob-interactive-mode