ray.serve.get_multiplexed_model_id#

ray.serve.get_multiplexed_model_id() -> str[source]#

Get the multiplexed model ID for the current request.

Call this inside a deployment, typically together with a function decorated with @serve.multiplexed, to determine which model the current request targets.

When called from within a batched function (decorated with @serve.batch), this returns the multiplexed model ID that is common to all requests in the current batch. This works because batches are automatically split by model ID to ensure all requests in a batch target the same model.
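The batch-splitting behavior described above can be illustrated with a small, self-contained sketch. This is a conceptual illustration only, not Ray's actual implementation: it groups queued (model ID, payload) pairs so that each resulting batch targets exactly one model, which is why get_multiplexed_model_id() is unambiguous inside a batched function.

```python
from collections import defaultdict

# Conceptual sketch (not Ray's implementation): requests queued for a
# @serve.batch-decorated function are grouped by multiplexed model ID,
# so every batch handed to the function targets a single model.
def split_batches_by_model_id(requests):
    """Group (model_id, payload) pairs into one batch per model ID."""
    batches = defaultdict(list)
    for model_id, payload in requests:
        batches[model_id].append(payload)
    return dict(batches)

queued = [("model_1", "a"), ("model_2", "b"), ("model_1", "c")]
print(split_batches_by_model_id(queued))
# {'model_1': ['a', 'c'], 'model_2': ['b']}
```

Because every batch is homogeneous in model ID, a single call to get_multiplexed_model_id() describes all requests in that batch.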

import ray
from ray import serve
import requests

# Set the multiplexed model id with the key
# "ray_serve_multiplexed_model_id" in the request
# headers when sending requests to the http proxy.
requests.get("http://localhost:8000",
    headers={"ray_serve_multiplexed_model_id": "model_1"})

# The model ID can also be set when sending requests through a
# `DeploymentHandle` (e.g. the handle returned by `serve.run`).
handle.options(multiplexed_model_id="model_1").remote("blablabla")

# In your deployment code, retrieve the model ID with
# `serve.get_multiplexed_model_id()`.
@serve.deployment
def my_deployment_function(request):
    assert serve.get_multiplexed_model_id() == "model_1"

PublicAPI (beta): This API is in beta and may change before becoming stable.