Custom Docker Images#

This section helps you:

  • Extend the official Ray Docker images with your own dependencies

  • Package your Serve application in a custom Docker image instead of a runtime_env

  • Use custom Docker images with KubeRay

To follow this tutorial, make sure to install Docker Desktop and create a Dockerhub account where you can host custom Docker images.

Working example#

Create a Python file called fake.py and save the following Serve application to it:

from faker import Faker

from ray import serve


@serve.deployment
def create_fake_email():
    return Faker().email()


app = create_fake_email.bind()

This app creates and returns a fake email address. It relies on the Faker package to create the fake email address. Install the Faker package locally to run it:

% pip install Faker==18.13.0

...

% serve run fake:app

...

# In another terminal window:
% curl localhost:8000
[email protected]

This tutorial explains how to package and serve this code inside a custom Docker image.

Extending the Ray Docker image#

The rayproject organization maintains Docker images with dependencies needed to run Ray. In fact, the rayproject/ray and rayproject/ray-ml repos host Docker images for this doc. For instance, this RayService config uses the rayproject/ray:2.5.0 image hosted by rayproject/ray.

You can extend these images and add your own dependencies to them by using them as a base layer in a Dockerfile. For instance, the working example application uses Ray 2.5.0 and Faker 18.13.0. You can create a Dockerfile that extends the rayproject/ray:2.5.0 by adding the Faker package:

# File name: Dockerfile
FROM rayproject/ray:2.5.0

RUN pip install Faker==18.13.0

In general, the rayproject/ray images contain only the dependencies needed to import Ray and the Ray libraries. The rayproject/ray-ml images contain additional dependencies (e.g., PyTorch, HuggingFace, etc.) that are useful for machine learning. You can extend images from either of these repos to build your custom images.

Then, you can build this image and push it to your Dockerhub account, so it can be pulled in the future:

% docker build . -t your_dockerhub_username/custom_image_name:latest

...

% docker image push your_dockerhub_username/custom_image_name:latest

...

Make sure to replace your_dockerhub_username with your DockerHub user name and the custom_image_name with the name you want for your image. latest is this image’s version. If you don’t specify a version when you pull the image, then Docker automatically pulls the latest version of the package. You can also replace latest with a specific version if you prefer.

Adding your Serve application to the Docker image#

During development, it’s useful to package your Serve application into a zip file and pull it into your Ray cluster using runtime_envs. During production, it’s more stable to put the Serve application in the Docker image instead of the runtime_env since new nodes won’t need to dynamically pull and install the Serve application code before running it.

Use the WORKDIR and COPY commands inside the Dockerfile to install the example Serve application code in your image:

# File name: Dockerfile
FROM rayproject/ray:2.5.0

RUN pip install Faker==18.13.0

# Set the working dir for the container to /serve_app
WORKDIR /serve_app

# Copies the local `fake.py` file into the WORKDIR
COPY fake.py /serve_app/fake.py

KubeRay starts Ray with the ray start command inside the WORKDIR directory. All the Ray Serve actors are then able to import any dependencies in the directory. By COPYing the Serve file into the WORKDIR, the Serve deployments have access to the Serve code without needing a runtime_env.

For your applications, you can also add any other dependencies needed for your Serve app to the WORKDIR directory.

Build and push this image to Dockerhub. Use the same version as before to overwrite the image stored at that version.

Using custom Docker images in KubeRay#

Run these custom Docker images in KubeRay by adding them to the RayService config. Make the following changes:

  1. Set the rayVersion in the rayClusterConfig to the Ray version used in your custom Docker image.

  2. Set the ray-head container’s image to the custom image’s name on Dockerhub.

  3. Set the ray-worker container’s image to the custom image’s name on Dockerhub.

  4. Update the serveConfigV2 field to remove any runtime_env dependencies that are in the container.

A pre-built version of this image is available at shrekrisanyscale/serve-fake-email-example. Try it out by running this RayService config:

apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: rayservice-fake-emails
spec:
  serviceUnhealthySecondThreshold: 300
  deploymentUnhealthySecondThreshold: 300
  serveConfigV2: |
    applications:
      - name: fake
        import_path: fake:app
        route_prefix: /
  rayClusterConfig:
    rayVersion: '2.5.0' # Should match Ray version in the containers
    headGroupSpec:
      rayStartParams:
        dashboard-host: '0.0.0.0'
      template:
        spec:
          containers:
            - name: ray-head
              image: shrekrisanyscale/serve-fake-email-example:example
              resources:
                limits:
                  cpu: 2
                  memory: 2Gi
                requests:
                  cpu: 2
                  memory: 2Gi
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265 # Ray dashboard
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
    workerGroupSpecs:
      - replicas: 1
        minReplicas: 1
        maxReplicas: 1
        groupName: small-group
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: shrekrisanyscale/serve-fake-email-example:example
                lifecycle:
                  preStop:
                    exec:
                      command: ["/bin/sh","-c","ray stop"]
                resources:
                  limits:
                    cpu: "1"
                    memory: "2Gi"
                  requests:
                    cpu: "500m"
                    memory: "2Gi"