class ray.serve.handle.DeploymentHandle

A handle used to make requests to a deployment at runtime.

This is primarily used to compose multiple deployments within a single application. It can also be used to make calls to the ingress deployment of an application (e.g., for programmatic testing).


import ray
from ray import serve
from ray.serve.handle import DeploymentHandle, DeploymentResponse

@serve.deployment
class Downstream:
    def say_hi(self, message: str) -> str:
        return f"Hello {message}!"

@serve.deployment
class Ingress:
    def __init__(self, handle: DeploymentHandle):
        self._downstream_handle = handle

    async def __call__(self, name: str) -> str:
        # Call the downstream deployment and await its response.
        response = self._downstream_handle.say_hi.remote(name)
        return await response

app = Ingress.bind(Downstream.bind())
handle: DeploymentHandle = serve.run(app)
response = handle.remote("world")
assert response.result() == "Hello world!"

PublicAPI (beta): This API is in beta and may change before becoming stable.

options(*, method_name: Union[str, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE, multiplexed_model_id: Union[str, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE, stream: Union[bool, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE, use_new_handle_api: Union[bool, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE, _prefer_local_routing: Union[bool, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE, _router_cls: Union[str, ray.serve._private.utils.DEFAULT] = DEFAULT.VALUE) -> ray.serve.handle.DeploymentHandle

Set options for this handle and return an updated copy of it.


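# Illustrative values: "other_method" and "model:v1" are placeholder names.
response: DeploymentResponse = handle.options(
    method_name="other_method",
    multiplexed_model_id="model:v1",
).remote()

remote(*args, **kwargs) -> Union[ray.serve.handle.DeploymentResponse, ray.serve.handle.DeploymentResponseGenerator]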

Issue a remote call to a method of the deployment.

By default, the result is a DeploymentResponse that can be awaited to fetch the result of the call or passed to another remote() call to compose multiple deployments.

If handle.options(stream=True) is set and a generator method is called, this returns a DeploymentResponseGenerator instead.


# Fetch the result directly (inside an async function or deployment method).
response = handle.remote()
result = await response

# Pass the result to another handle call.
composed_response = handle2.remote(handle1.remote())
composed_result = await composed_response

Parameters:

  • *args – Positional arguments to be serialized and passed to the remote method call.

  • **kwargs – Keyword arguments to be serialized and passed to the remote method call.