Serve a Stable Diffusion text-to-image model on Kubernetes
Note: The Python files for the Ray Serve application and its client are in the ray-project/serve_config_examples repo and the Ray documentation.
Step 1: Create a Kubernetes cluster with GPUs
Follow aws-eks-gpu-cluster.md or gcp-gke-gpu-cluster.md to create a Kubernetes cluster with 1 CPU node and 1 GPU node.
Step 2: Install KubeRay operator
Follow this document to install the latest stable KubeRay operator via the Helm repository.
Note that the YAML file in this example uses serveConfigV2, which is supported starting from KubeRay v0.6.0.
Step 3: Install a RayService
# Step 3.1: Download `ray-service.stable-diffusion.yaml`
curl -LO https://raw.githubusercontent.com/ray-project/kuberay/v1.2.2/ray-operator/config/samples/ray-service.stable-diffusion.yaml
# Step 3.2: Create a RayService
kubectl apply -f ray-service.stable-diffusion.yaml
This RayService configuration contains some important settings:
The tolerations for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set nvidia.com/gpu: 1 in the Pod’s resource configurations.
# Please add the following taints to the GPU node.
tolerations:
  - key: "ray.io/node-type"
    operator: "Equal"
    value: "worker"
    effect: "NoSchedule"
It includes diffusers in runtime_env since this package is not included by default in the ray-ml image.
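To make the toleration behavior described above concrete, here is a minimal Python sketch of how Kubernetes matches a toleration against a node taint. The helpers `tolerates` and `can_schedule` are hypothetical names written for illustration only; they are not part of any Kubernetes or KubeRay API, and the sketch models only the NoSchedule effect, not resource requests.

```python
# Illustrative model of Kubernetes taint/toleration matching.
# `tolerates` and `can_schedule` are hypothetical helpers, not a real API.

# The worker toleration from the RayService sample above.
WORKER_TOLERATION = {
    "key": "ray.io/node-type",
    "operator": "Equal",
    "value": "worker",
    "effect": "NoSchedule",
}


def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty toleration effect matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint key.
        return key == "" or key == taint.get("key")
    # Equal: key and value must both match.
    return key == taint.get("key") and toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations, node_taints):
    """A Pod fits a node when every NoSchedule taint on it is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in node_taints
        if taint.get("effect") == "NoSchedule"
    )
```

Under this model, both an untainted node and a node tainted with ray.io/node-type=worker:NoSchedule accept the worker Pod. The nvidia.com/gpu: 1 resource request, which is not modeled here, is what restricts the workers to GPU nodes.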
Step 4: Forward the port of Serve
First, get the service name with this command.
kubectl get services
Then, forward the port to the Serve service.
kubectl port-forward svc/stable-diffusion-serve-svc 8000
Note that the RayService’s Kubernetes service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.
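Because the Kubernetes service appears only after the Serve applications are ready, it can help to wait for it before port forwarding. The following is a minimal sketch, assuming `kubectl` is on your PATH and configured for the cluster. The helper names `service_exists` and `wait_until`, as well as the default timeout and polling interval, are hypothetical choices for illustration.

```python
import subprocess
import time


def service_exists(name):
    """Return True if `kubectl get service <name>` exits successfully."""
    result = subprocess.run(
        ["kubectl", "get", "service", name],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def wait_until(check, timeout_s=300.0, interval_s=5.0):
    """Call `check()` repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_until(lambda: service_exists("stable-diffusion-serve-svc"))` returns True once the service from the port-forward command exists, after which `kubectl port-forward` can be run.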
Step 5: Send a request to the text-to-image model
# Step 5.1: Download `stable_diffusion_req.py`
curl -LO https://raw.githubusercontent.com/ray-project/serve_config_examples/master/stable_diffusion/stable_diffusion_req.py
# Step 5.2: Set your `prompt` in `stable_diffusion_req.py`.
# Step 5.3: Send a request to the Stable Diffusion model.
python stable_diffusion_req.py
# Check output.png
You can refer to the document “Serving a Stable Diffusion Model” for an example output image.
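The downloaded stable_diffusion_req.py is the authoritative client. As a rough illustration of what such a client does, here is a minimal sketch using only the Python standard library. It assumes the Serve application answers HTTP GET requests on a route named `/imagine` with a `prompt` query parameter and returns the raw image bytes; the route, the parameter name, and the helper names `build_url` and `save_image` are assumptions, so check the downloaded script for the actual values.

```python
import urllib.parse
import urllib.request


def build_url(prompt, base="http://127.0.0.1:8000", route="/imagine"):
    """Build the request URL. `route` and the `prompt` parameter are assumptions."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urllib.parse.urlencode({"prompt": prompt}, quote_via=urllib.parse.quote)
    return f"{base}{route}?{query}"


def save_image(prompt, path="output.png", timeout_s=300.0):
    """Send the prompt and write the raw response body to `path`."""
    with urllib.request.urlopen(build_url(prompt), timeout=timeout_s) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

With the port forward from Step 4 still running, calling `save_image("a cute cat is dancing on the grass.")` would write the response to output.png, provided the assumed route matches the deployed application.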