Start an Aliyun ACK cluster with GPUs for KubeRay

This guide provides step-by-step instructions for creating an ACK cluster with GPU nodes specifically configured for KubeRay. The configuration outlined here can be applied to most KubeRay examples found in the documentation.

Step 1: Create a Kubernetes cluster on Aliyun ACK

To create an Aliyun ACK cluster, see Create a cluster. To configure your computer to communicate with the cluster, see Connect to clusters.

Step 2: Create node pools for the Aliyun ACK cluster

See Create a node pool to create node pools.

Manage node labels and taints

If you need to set labels or taints on nodes, see Create and manage node labels and Create and manage node taints. For example, you can add a taint to GPU node pools so that Kubernetes doesn't schedule the Ray head pod on these nodes.
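Once a GPU node pool is tainted, only pods that carry a matching toleration can land on it. The following is a minimal sketch of how such a toleration could be added to a worker pod template before it goes into a RayCluster manifest. The taint `nvidia.com/gpu=present:NoSchedule` and the helper names are assumptions for illustration; substitute the key, value, and effect you actually set on the node pool.

```python
# Sketch: add a toleration to a pod template so worker pods can run on
# tainted GPU nodes. The taint key/value below are hypothetical defaults.

def gpu_toleration(key="nvidia.com/gpu", value="present", effect="NoSchedule"):
    """Return a Kubernetes toleration matching a key=value:effect taint."""
    return {"key": key, "operator": "Equal", "value": value, "effect": effect}


def add_toleration(pod_template, toleration):
    """Append the toleration to pod_template["spec"]["tolerations"] once."""
    spec = pod_template.setdefault("spec", {})
    tolerations = spec.setdefault("tolerations", [])
    if toleration not in tolerations:
        tolerations.append(toleration)
    return pod_template


# Hypothetical worker pod template; the head pod template is left without
# the toleration, so the scheduler keeps it off the tainted GPU nodes.
worker_template = {"spec": {"containers": [{"name": "ray-worker"}]}}
add_toleration(worker_template, gpu_toleration())
print(worker_template["spec"]["tolerations"])
```

Leaving the head pod template untouched is what keeps the head pod off the GPU nodes: a `NoSchedule` taint repels every pod that does not explicitly tolerate it.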

Upgrade drivers on the nodes

If you need to upgrade the NVIDIA drivers on the nodes, see Step 2: Create a node pool and specify an NVIDIA driver version.
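After the node pool and drivers are in place, one way to check that the GPUs are visible to Kubernetes is to look at what each node reports as allocatable. The sketch below assumes the nodes expose GPUs under the `nvidia.com/gpu` resource name (the name used by the NVIDIA device plugin) and that you have saved the output of `kubectl get nodes -o json`; the node names in the sample are hypothetical.

```python
# Sketch: count allocatable GPUs per node from `kubectl get nodes -o json`.

def gpu_counts(node_list):
    """Map node name -> number of allocatable nvidia.com/gpu devices."""
    counts = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports resource quantities as strings, e.g. "1".
        counts[name] = int(allocatable.get("nvidia.com/gpu", "0"))
    return counts


# Hypothetical, trimmed-down sample of the kubectl JSON output.
sample = {
    "items": [
        {"metadata": {"name": "cpu-node-1"},
         "status": {"allocatable": {"cpu": "4"}}},
        {"metadata": {"name": "gpu-node-1"},
         "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "1"}}},
    ]
}
print(gpu_counts(sample))
```

To run it against a real cluster, load the saved file with `json.load` and pass the result to `gpu_counts`. A GPU node that reports zero usually points to a driver or device plugin problem on that node.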

Step 3: Install the KubeRay add-on in the cluster

See Step 2: Install KubeRay-Operator to deploy KubeRay in ACK.