Ray Clusters Overview

What is a Ray cluster?

Ray works well for multiprocessing on a single machine, but its main strength is scaling that same workload across a cluster of machines for distributed execution.

A Ray cluster is a set of one or more nodes that run Ray and share a common head node. A cluster can have a fixed number of nodes, or it can autoscale (that is, automatically add or remove nodes) according to the demand of the Ray workload.
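To make the distinction between a fixed-size and an autoscaling cluster concrete, here is a toy sketch in plain Python. It is not Ray's autoscaler and calls no Ray API. The function name `desired_node_count`, its parameters, and the CPU-based demand model are all assumptions made for illustration only.

```python
import math


def desired_node_count(pending_cpus, cpus_per_node, min_nodes, max_nodes):
    """Toy model (hypothetical, not Ray's algorithm): nodes wanted for a CPU demand."""
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    # Round up so that a partially used node still counts as a whole node.
    needed = math.ceil(pending_cpus / cpus_per_node)
    # Clamp the result to the configured cluster bounds.
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical CPU demand observed at successive points in time.
demand = [0, 6, 30, 100, 12]
plan = [
    desired_node_count(d, cpus_per_node=8, min_nodes=1, max_nodes=10)
    for d in demand
]
print(plan)  # [1, 1, 4, 10, 2]
```

In this model, a fixed-size cluster is the special case where `min_nodes` equals `max_nodes`: the node count stays the same regardless of demand.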

How can I use Ray clusters?

Ray clusters are officially supported on the following technology stacks, each covered by a guide listed below: cloud providers (AWS and GCP) and Kubernetes.

Advanced users may want to deploy Ray clusters on premises or on infrastructure platforms not listed here by providing a custom node provider.

Where to go from here?

I want to learn key Ray cluster concepts

Understand the key concepts and main ways of interacting with a Ray cluster.

I want to run Ray on a cloud provider

Take a sample application designed to run on a laptop and scale it up in the cloud. Access to an AWS or GCP account is required.

I want to run Ray on Kubernetes

Deploy a Ray application to a Kubernetes cluster. You can run the tutorial on a remote Kubernetes cluster or on your laptop via KinD.