class ray.serve.handle.DeploymentResponseGenerator

A future-like object wrapping the result of a streaming deployment handle call.

It is returned when a generator deployment method is called on a handle configured with handle.options(stream=True).

DeploymentResponseGenerator is both a synchronous and asynchronous iterator.

When iterating over results from inside a deployment, use async for to avoid blocking the asyncio event loop.

When iterating over results from outside a deployment, use a standard for loop.


from typing import AsyncIterator, Iterator

from ray import serve
from ray.serve.handle import DeploymentHandle, DeploymentResponseGenerator

@serve.deployment
class Streamer:
    def generate_numbers(self, limit: int) -> Iterator[int]:
        for i in range(limit):
            yield i

@serve.deployment
class Caller:
    def __init__(self, handle: DeploymentHandle):
        # Set `stream=True` on the handle to enable streaming calls.
        self._streaming_handle = handle.options(stream=True)

    async def __call__(self, limit: int) -> AsyncIterator[int]:
        # Calling a generator method on a streaming handle returns a
        # DeploymentResponseGenerator instead of a single response.
        gen: DeploymentResponseGenerator = (
            self._streaming_handle.generate_numbers.remote(limit)
        )

        # Inside a deployment: use `async for` to enable concurrency.
        async for i in gen:
            yield i

app = Caller.bind(Streamer.bind())
handle: DeploymentHandle = serve.run(app)

# Outside a deployment: use a standard `for` loop.
gen: DeploymentResponseGenerator = handle.options(stream=True).remote(10)
assert [i for i in gen] == list(range(10))
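
To make the phrase "both a synchronous and asynchronous iterator" concrete, the following is a minimal pure-Python sketch of an object that supports both protocols. DualIterator, consume_sync, and consume_async are hypothetical names used only for illustration; this is not how Ray implements DeploymentResponseGenerator, and it does not involve Ray at all.

```python
import asyncio
from typing import Iterable


class DualIterator:
    """Hypothetical stand-in (NOT Ray's implementation): wraps an iterable
    and exposes both the sync and the async iterator protocols."""

    def __init__(self, items: Iterable[int]):
        self._it = iter(items)

    # Synchronous protocol: used by a plain `for` loop.
    def __iter__(self):
        return self

    def __next__(self) -> int:
        # Propagates StopIteration when the underlying iterator is exhausted.
        return next(self._it)

    # Asynchronous protocol: used by `async for`.
    def __aiter__(self):
        return self

    async def __anext__(self) -> int:
        try:
            value = next(self._it)
        except StopIteration:
            # Async iteration signals exhaustion with StopAsyncIteration.
            raise StopAsyncIteration from None
        # Hand control back to the event loop between items so other
        # coroutines can run.
        await asyncio.sleep(0)
        return value


def consume_sync(limit: int) -> list:
    # Analogous to iterating from outside a deployment.
    return [i for i in DualIterator(range(limit))]


async def consume_async(limit: int) -> list:
    # Analogous to iterating from inside a deployment.
    return [i async for i in DualIterator(range(limit))]
```

In this sketch the async path awaits between items, which is what lets an event loop interleave other work; the plain for loop holds the calling thread until each item is available.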

A DeploymentResponseGenerator cannot currently be passed to another DeploymentHandle call.

PublicAPI (beta): This API is in beta and may change before becoming stable.