ray.serve.handle.DeploymentHandle

class ray.serve.handle.DeploymentHandle

A handle used to make requests to a deployment at runtime.

This is primarily used to compose multiple deployments within a single application. It can also be used to make calls to the ingress deployment of an application (e.g., for programmatic testing).

Example:

import ray
from ray import serve
from ray.serve.handle import DeploymentHandle, DeploymentResponse

@serve.deployment
class Downstream:
    def say_hi(self, message: str):
        return f"Hello {message}!"

@serve.deployment
class Ingress:
    def __init__(self, handle: DeploymentHandle):
        self._downstream_handle = handle

    async def __call__(self, name: str) -> str:
        response = self._downstream_handle.say_hi.remote(name)
        return await response

app = Ingress.bind(Downstream.bind())
handle: DeploymentHandle = serve.run(app)
response = handle.remote("world")
assert response.result() == "Hello world!"

PublicAPI (beta): This API is in beta and may change before becoming stable.

options(*, method_name: str | DEFAULT = DEFAULT.VALUE, multiplexed_model_id: str | DEFAULT = DEFAULT.VALUE, stream: bool | DEFAULT = DEFAULT.VALUE, use_new_handle_api: bool | DEFAULT = DEFAULT.VALUE, _prefer_local_routing: bool | DEFAULT = DEFAULT.VALUE, _source: bool | DEFAULT = DEFAULT.VALUE) → DeploymentHandle

Set options for this handle and return an updated copy of it.

Example:

response: DeploymentResponse = handle.options(
    method_name="other_method",
    multiplexed_model_id="model:v1",
).remote()
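
Because options() returns an updated copy rather than changing the handle it is called on, the original handle keeps its previous settings. The sketch below imitates that behavior with a plain frozen dataclass. ToyHandle, its fields, and its default values are hypothetical stand-ins chosen for illustration; this is not Ray Serve's implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToyHandle:
    """Hypothetical stand-in for a handle; not a Ray Serve class."""

    method_name: str = "__call__"  # assumed default, for illustration only
    multiplexed_model_id: str = ""
    stream: bool = False

    def options(
        self,
        *,
        method_name: Optional[str] = None,
        multiplexed_model_id: Optional[str] = None,
        stream: Optional[bool] = None,
    ) -> "ToyHandle":
        # Collect only the options the caller actually passed.
        requested = {
            "method_name": method_name,
            "multiplexed_model_id": multiplexed_model_id,
            "stream": stream,
        }
        overrides = {k: v for k, v in requested.items() if v is not None}
        # replace() builds a new instance; self is left untouched.
        return replace(self, **overrides)


base = ToyHandle()
tuned = base.options(method_name="other_method", multiplexed_model_id="model:v1")
```

Under this model, base still targets its original method after the call, while tuned carries the overrides and inherits every option that was not passed.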

remote(*args, **kwargs) → DeploymentResponse | DeploymentResponseGenerator

Issue a remote call to a method of the deployment.

By default, the result is a DeploymentResponse that can be awaited to fetch the result of the call or passed to another .remote() call to compose multiple deployments.

If handle.options(stream=True) is set and a generator method is called, this returns a DeploymentResponseGenerator instead.

Example:

# Fetch the result directly.
response = handle.remote()
result = await response

# Pass the result to another handle call.
composed_response = handle2.remote(handle1.remote())
composed_result = await composed_response

Parameters:
  • *args – Positional arguments to be serialized and passed to the remote method call.

  • **kwargs – Keyword arguments to be serialized and passed to the remote method call.
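
The composition pattern described for remote(), where the response of one call is passed as an argument to another call, can be pictured with plain asyncio. This is a toy model under stated assumptions, not how Ray Serve routes or serializes requests: toy_remote and _resolve are hypothetical helpers, and everything runs in a single process.

```python
import asyncio
from typing import Any, Callable


async def _resolve(value: Any) -> Any:
    # A pending call is awaited; a plain value is passed through unchanged.
    if isinstance(value, asyncio.Future):
        return await value
    return value


def toy_remote(fn: Callable[..., Any], *args: Any) -> "asyncio.Task[Any]":
    """Hypothetical analogue of .remote(): schedule fn and return at once."""

    async def _run() -> Any:
        # Resolve any arguments that are themselves pending calls.
        resolved = [await _resolve(arg) for arg in args]
        return fn(*resolved)

    # Must be called while an event loop is running.
    return asyncio.create_task(_run())


async def main() -> str:
    # The first call is not awaited here; its pending result is handed
    # straight to the second call.
    greeting = toy_remote(lambda name: f"Hello {name}!", "world")
    shouted = toy_remote(str.upper, greeting)
    return await shouted


result = asyncio.run(main())
```

The caller awaits only the final response. The intermediate result is resolved on behalf of the second call, which is the shape of handle2.remote(handle1.remote()) in the example above.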