Specify container commands for Ray head/worker Pods

KubeRay generates a ray start command for each Ray Pod. You may want to run commands before or after the ray start command, or define the container’s command yourself. This document shows how to do both.

Part 1: Specify a custom container command, optionally including the generated ray start command

Starting with KubeRay v1.1.0, you can add the annotation ray.io/overwrite-container-cmd: "true" to a RayCluster. KubeRay then uses the container command and args exactly as you provide them and doesn’t add any generated command, neither the ulimit command nor the ray start command. The generated ray start command remains available in the environment variable KUBERAY_GEN_RAY_START_CMD.

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  annotations:
    # If this annotation is set to "true", KubeRay will respect the container `command` and `args`.
    ray.io/overwrite-container-cmd: "true"
  ...
spec:
  headGroupSpec:
    rayStartParams: {}
    # Pod template
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.8.0
          # Because the annotation "ray.io/overwrite-container-cmd" is set to "true",
          # KubeRay uses the `command` and `args` below instead of the generated
          # container command. You therefore need to run the `ulimit` command
          # yourself to avoid Ray scalability issues.
          command: ["/bin/bash", "-lc", "--"]
          # Starting from v1.1.0, KubeRay injects the environment variable `KUBERAY_GEN_RAY_START_CMD`
          # into the Ray container. This variable can be used to retrieve the generated Ray start command.
          # Note that this environment variable does not include the `ulimit` command.
          args: ["ulimit -n 65536; echo head; $KUBERAY_GEN_RAY_START_CMD"]
          ...

The preceding example YAML is part of ray-cluster.overwrite-command.yaml.

  • metadata.annotations.ray.io/overwrite-container-cmd: "true": This annotation tells KubeRay to respect the container command and args as provided by the users, without including any generated command. Refer to Part 2 for the default behavior if you set the annotation to “false” or don’t set it at all.

  • ulimit -n 65536: This command is necessary to avoid Ray scalability issues caused by running out of file descriptors. If you don’t set the annotation, KubeRay automatically injects the ulimit command into the container.

  • $KUBERAY_GEN_RAY_START_CMD: Starting from KubeRay v1.1.0, KubeRay injects the environment variable KUBERAY_GEN_RAY_START_CMD into the Ray container for both head and worker Pods to store the ray start command generated by KubeRay. Note that this environment variable doesn’t include the ulimit command.

    # Example of the environment variable `KUBERAY_GEN_RAY_START_CMD` in the head Pod.
    ray start --head  --dashboard-host=0.0.0.0  --num-cpus=1  --block  --metrics-export-port=8080  --memory=2147483648
    

The head Pod’s command and args look like the following:

Command:
  /bin/bash
  -lc
  --
Args:
  ulimit -n 65536; echo head; $KUBERAY_GEN_RAY_START_CMD
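
The following is a minimal local sketch of how the shell evaluates an args string like the one above; it needs neither Kubernetes nor Ray. It rests on several assumptions: the value assigned to KUBERAY_GEN_RAY_START_CMD is a stand-in (prefixed with echo so the command is printed rather than executed), the function name simulate_head_container is hypothetical, the ulimit step is left out because raising the limit may not be permitted on a local machine, and -c is used instead of -lc so that login profiles don't add output.

```shell
#!/usr/bin/env bash
# Local simulation only: this doesn't talk to Kubernetes or start Ray.

simulate_head_container() {
  # ASSUMPTION: stand-in value. In a real Pod, KubeRay injects the generated
  # `ray start ...` command here. The leading `echo` makes this sketch print
  # the command instead of running it.
  KUBERAY_GEN_RAY_START_CMD='echo ray start --head --block' \
    /bin/bash -c -- 'echo head; $KUBERAY_GEN_RAY_START_CMD'
}

# The inner shell expands the unquoted variable, splits it into words,
# and runs the result as a command after `echo head`.
simulate_head_container
```

Because the variable is expanded unquoted, the shell splits its value into a command name and arguments. Quoting it as "$KUBERAY_GEN_RAY_START_CMD" would make the shell look for a single command with that whole string as its name.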

Part 2: Execute commands before the generated ray start command

If you only want to execute commands before the generated command, you don’t need to set the annotation ray.io/overwrite-container-cmd: "true". Some users employ this method to set up environment variables used by ray start.

# https://github.com/ray-project/kuberay/ray-operator/config/samples/ray-cluster.head-command.yaml
    rayStartParams:
        ...
    #pod template
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.8.0
          resources:
            ...
          ports:
            ...
          # `command` and `args` will become a part of `spec.containers.0.args` in the head Pod.
          command: ["echo 123"]
          args: ["456"]

In the head Pod that KubeRay creates:

  • spec.containers.0.command: KubeRay hard-codes ["/bin/bash", "-lc", "--"] as the container’s command.

  • spec.containers.0.args contains two parts:

    • user-specified command: A string that concatenates headGroupSpec.template.spec.containers.0.command and headGroupSpec.template.spec.containers.0.args.

    • ray start command: KubeRay creates the command based on rayStartParams specified in RayCluster. The command looks like ulimit -n 65536; ray start ....

    • To summarize, spec.containers.0.args is $(user-specified command) && $(ray start command).

  • Example

    # Prerequisite: There is a KubeRay operator in the Kubernetes cluster.
    
    # Download `ray-cluster.head-command.yaml`
    curl -LO https://raw.githubusercontent.com/ray-project/kuberay/v1.2.2/ray-operator/config/samples/ray-cluster.head-command.yaml
    
    # Create a RayCluster
    kubectl apply -f ray-cluster.head-command.yaml
    
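    # Get the head Pod name and store it in ${RAYCLUSTER_HEAD_POD}
    export RAYCLUSTER_HEAD_POD=$(kubectl get pods -l ray.io/node-type=head -o jsonpath='{.items[0].metadata.name}')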
    
    # Check `spec.containers.0.command` and `spec.containers.0.args`.
    kubectl describe pod ${RAYCLUSTER_HEAD_POD}
    
    # Command:
    #   /bin/bash
    #   -lc
    #   --
    # Args:
    #    echo 123  456  && ulimit -n 65536; ray start --head  --dashboard-host=0.0.0.0  --num-cpus=1  --block  --metrics-export-port=8080  --memory=2147483648
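
The composition described in Part 2 can be sketched locally as follows. This is an illustration under stated assumptions, not KubeRay's implementation: compose_args, run_ok, and run_bad are hypothetical helpers, and echo stand-ins replace the real ulimit and ray start commands. The sketch also shows a consequence of standard shell precedence for a string shaped like the Args above: && binds more tightly than ;, so if the user-specified command fails, the step after && is skipped but the command after ; still runs.

```shell
#!/usr/bin/env bash
# Local simulation only: no Kubernetes or Ray needed.

# Hypothetical helper that mirrors the Part 2 description:
# "$(user-specified command) && $(ray start command)".
compose_args() {
  local user_cmd="$1" generated_cmd="$2"
  printf '%s && %s' "$user_cmd" "$generated_cmd"
}

# ASSUMPTION: `echo ulimit; echo ray-start` stands in for the generated
# `ulimit -n 65536; ray start ...` command.
ok="$(compose_args 'echo 123 456' 'echo ulimit; echo ray-start')"
bad="$(compose_args 'false' 'echo ulimit; echo ray-start')"

run_ok()  { /bin/bash -c -- "$ok"; }
run_bad() { /bin/bash -c -- "$bad"; }

echo "--- user command succeeds ---"
run_ok    # all three steps run
echo "--- user command fails ---"
run_bad   # the step after `&&` is skipped; the step after `;` still runs
```

If a failed setup step should stop the container from starting Ray, one option is the approach from Part 1: set the annotation and write the full command yourself with the error handling you need.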