Serve a StableDiffusion text-to-image model on Kubernetes
Note: The Python files for the Ray Serve application and its client are in the ray-project/serve_config_examples repository and the Ray documentation.
Step 1: Create a Kubernetes cluster with GPUs
Follow aws-eks-gpu-cluster.md or gcp-gke-gpu-cluster.md to create a Kubernetes cluster with 1 CPU node and 1 GPU node.
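Before moving on, you can confirm that both nodes registered and that the GPU node actually advertises a GPU. The sketch below assumes the NVIDIA device plugin set up by those guides exposes the nvidia.com/gpu resource; replace the placeholder <your-gpu-node-name> with the name reported by kubectl get nodes.
# List the nodes; one CPU node and one GPU node should be Ready.
kubectl get nodes
# Check that the GPU node reports an allocatable nvidia.com/gpu resource.
kubectl describe node <your-gpu-node-name> | grep -i "nvidia.com/gpu"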
Step 2: Install KubeRay operator
Follow this document to install the latest stable KubeRay operator using the Helm repository.
Note that the YAML file in this example uses serveConfigV2. This feature requires KubeRay v0.6.0 or later.
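If the operator isn't installed yet, a typical Helm installation looks like the following sketch; the linked document remains the authoritative source. Any chart version at or above v0.6.0 satisfies the serveConfigV2 requirement, so pinning a specific version with --version is optional.
# Add the KubeRay Helm repository and install the operator.
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator
# The kuberay-operator Pod should reach the Running state.
kubectl get pods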
Step 3: Install a RayService
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/master/ray-operator/config/samples/ray-service.stable-diffusion.yaml
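Before looking at the configuration details, you can confirm that the custom resource was created:
# The RayService named stable-diffusion should appear shortly after the apply.
kubectl get rayservices.ray.io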
This RayService configuration contains some important settings:
- In the RayService, the head Pod doesn't have any tolerations. Meanwhile, the worker Pods use the following tolerations so the scheduler won't assign the head Pod to the GPU node. The sketch after this list shows one way to add the matching taint to the GPU node.

  # Please add the following taints to the GPU node.
  tolerations:
    - key: "ray.io/node-type"
      operator: "Equal"
      value: "worker"
      effect: "NoSchedule"

- It includes diffusers in runtime_env since this package isn't included by default in the ray-ml image.
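The toleration above only takes effect if the GPU node carries the matching taint. If the cluster guide from Step 1 didn't already apply it when creating the GPU node pool, you can add it manually; <gpu-node-name> below is a placeholder for the GPU node's name.
# Taint the GPU node so only Pods that tolerate ray.io/node-type=worker schedule onto it.
kubectl taint nodes <gpu-node-name> ray.io/node-type=worker:NoSchedule
# Verify placement: the head Pod should run on the CPU node and the worker Pod on the GPU node.
kubectl get pods -o wide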
Step 4: Forward the port of Serve
First get the service name from this command.
kubectl get services
Then, port-forward to the Serve application.
# Wait until the RayService `Ready` condition is `True`. This means the RayService is ready to serve.
kubectl describe rayservices.ray.io stable-diffusion
# [Example output]
# Conditions:
# Last Transition Time: 2025-02-13T07:10:34Z
# Message: Number of serve endpoints is greater than 0
# Observed Generation: 1
# Reason: NonZeroServeEndpoints
# Status: True
# Type: Ready
# Forward the port of Serve
kubectl port-forward svc/stable-diffusion-serve-svc 8000
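As a quick sanity check of the tunnel, you can query the Serve proxy's built-in endpoints; the /-/healthz and /-/routes paths below exist on recent Ray versions, so verify against your Ray version if they return 404.
# Returns "success" once the proxy is healthy.
curl http://localhost:8000/-/healthz
# Lists the HTTP route prefixes exposed by the Serve application.
curl http://localhost:8000/-/routes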
Step 5: Send a request to the text-to-image model
# Step 5.1: Download `stable_diffusion_req.py`
curl -LO https://raw.githubusercontent.com/ray-project/serve_config_examples/master/stable_diffusion/stable_diffusion_req.py
# Step 5.2: Set your `prompt` in `stable_diffusion_req.py`.
# Step 5.3: Send a request to the Stable Diffusion model.
python stable_diffusion_req.py
# Check output.png
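If you'd rather not run the Python client, roughly the same request can be sent with curl. The /imagine route and prompt query parameter below are assumptions based on the sample client's behavior; double-check them against stable_diffusion_req.py in case the example changes.
# URL-encode the prompt as a query parameter and save the returned PNG.
curl -G "http://127.0.0.1:8000/imagine" \
  --data-urlencode "prompt=a cute cat is dancing on the grass" \
  -o output.png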
You can refer to the document “Serving a Stable Diffusion Model” for an example output image.