Handling Dependencies

Ray Serve supports serving deployments with different (possibly conflicting) Python dependencies. For example, you can simultaneously serve one deployment that uses legacy TensorFlow 1 and another that uses TensorFlow 2.

This is supported on macOS and Linux using Ray’s runtime environments feature. As with all other Ray actor options, pass the runtime environment via ray_actor_options in your deployment. First run pip install "ray[default]" to ensure the runtime environments feature is installed.

Example:

import requests
from ray import serve

serve.start()


@serve.deployment
def requests_version(request):
    # Report the version of requests importable in the replica's environment.
    return requests.__version__


requests_version.options(
    name="25",
    ray_actor_options={"runtime_env": {"pip": ["requests==2.25.1"]}},
).deploy()
requests_version.options(
    name="26",
    ray_actor_options={"runtime_env": {"pip": ["requests==2.26.0"]}},
).deploy()

assert requests.get("http://127.0.0.1:8000/25").text == "2.25.1"
assert requests.get("http://127.0.0.1:8000/26").text == "2.26.0"
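
When several deployments differ only in their pinned versions, the options dictionaries can be generated instead of written by hand. The following is a minimal sketch, assuming a hypothetical helper named pip_actor_options that is not part of Ray. It only assembles the same dictionary shape passed as ray_actor_options in the example above and does not call Ray itself.

```python
def pip_actor_options(*requirements):
    """Build a ray_actor_options dict with a pip runtime environment.

    Hypothetical helper, not part of Ray: it only assembles the plain
    dictionary shape shown in the example above.
    """
    return {"runtime_env": {"pip": list(requirements)}}


# One options dict per deployment name, each pinning a different version.
options_by_name = {
    "25": pip_actor_options("requests==2.25.1"),
    "26": pip_actor_options("requests==2.26.0"),
}
```

Each value could then be passed as the ray_actor_options argument of options() for the deployment with the matching name.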

Tip

Avoid dynamically installing packages that build from source: installation can be slow and can consume all available resources, causing problems for the Ray cluster. Consider precompiling such packages and hosting them in a private repository or including them in a Docker image.

The dependencies required by a deployment may differ from those installed in the driver program (the one making Serve API calls). In this case, use a delayed import inside the class to avoid importing packages that are unavailable in the driver. This applies even when not using runtime environments.

Example:

from ray import serve

serve.start()


@serve.deployment
class MyDeployment:
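    def __init__(self, model_path):
        # Delayed import: runs inside the replica, not in the driver.
        from my_module import my_model

        self.model = my_model.load(model_path)

    def __call__(self, request):
        # Placeholder handler; a real deployment would use self.model here.
        return "ok"


# Arguments passed to deploy() are forwarded to the constructor.
MyDeployment.deploy("/model_path.pkl")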
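
The effect of a delayed import can be observed without Ray at all. The following is a minimal sketch, assuming a hypothetical module name worker_only_module_xyz that is not installed in the driver's environment. Defining and instantiating the class succeeds, and the import error surfaces only when the method body actually runs.

```python
class LazyModel:
    def load(self):
        # The import runs only when load() is called, not at class definition.
        import worker_only_module_xyz  # hypothetical; assumed absent in the driver

        return worker_only_module_xyz


# Defining and instantiating the class works without the package installed.
instance = LazyModel()

try:
    instance.load()
    import_failed = False
except ModuleNotFoundError:
    # Reached in an environment where the package is not installed.
    import_failed = True
```

Had the import been placed at the top of the file, the failure would have occurred as soon as the driver loaded the module, before any deployment was created.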