ray.serve.batch

ray.serve.batch(_sync_func: Callable[[List[T]], List[R]], /) → Callable[[T], R]
ray.serve.batch(_async_func: Callable[[List[T]], Coroutine[Any, Any, List[R]]], /) → Callable[[T], Coroutine[Any, Any, R]]
ray.serve.batch(_sync_meth: _SyncBatchingMethod[SelfType, T, R], /) → Callable[[SelfType, T], R]
ray.serve.batch(_async_meth: _AsyncBatchingMethod[SelfType, T, R], /) → Callable[[SelfType, T], Coroutine[Any, Any, R]]
ray.serve.batch(_: Literal[None] = None, /, max_batch_size: int = 10, batch_wait_timeout_s: float = 0.01, max_concurrent_batches: int = 1, batch_size_fn: Callable[[List], int] | None = None) → _BatchDecorator

Converts a function to asynchronously handle batches.

The function can be a standalone function or a class method. In both cases, it must be declared async def, take a list of objects as its sole argument, and return a list of the same length as a result.

When invoked, each caller passes a single object. Calls are buffered and executed asynchronously as a batch once max_batch_size objects have accumulated or batch_wait_timeout_s has elapsed, whichever occurs first.

max_batch_size and batch_wait_timeout_s can be updated at runtime using setter methods on the decorated method (set_max_batch_size and set_batch_wait_timeout_s), as shown in the example below.

Example:

from typing import List

from ray import serve
from starlette.requests import Request

@serve.deployment
class BatchedDeployment:
    @serve.batch(max_batch_size=10, batch_wait_timeout_s=0.1)
    async def batch_handler(self, requests: List[Request]) -> List[str]:
        response_batch = []
        for r in requests:
            name = (await r.json())["name"]
            response_batch.append(f"Hello {name}!")

        return response_batch

    def update_batch_params(self, max_batch_size, batch_wait_timeout_s):
        self.batch_handler.set_max_batch_size(max_batch_size)
        self.batch_handler.set_batch_wait_timeout_s(batch_wait_timeout_s)

    async def __call__(self, request: Request):
        return await self.batch_handler(request)

app = BatchedDeployment.bind()
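
The decorator also works on a standalone coroutine function. A minimal sketch; the doubling logic is purely illustrative:

from typing import List

from ray import serve

@serve.batch(max_batch_size=4, batch_wait_timeout_s=0.1)
async def double(values: List[int]) -> List[int]:
    # Receives up to max_batch_size values accumulated from individual calls
    # and must return one result per input, in order.
    return [v * 2 for v in values]

# Each caller passes and awaits a single value from async code:
#     result = await double(21)  # -> 42
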
Parameters:
  • max_batch_size – the maximum batch size that will be executed in one call to the underlying function.

  • batch_wait_timeout_s – the maximum duration to wait for max_batch_size elements before running the current batch.

  • max_concurrent_batches – the maximum number of batches that can be executed concurrently. If the number of concurrent batches exceeds this limit, the batch handler will wait for a batch to complete before sending the next batch to the underlying function. See the first sketch after this list.

  • batch_size_fn – optional function to compute the effective batch size. If provided, this function takes a list of items and returns an integer representing the batch size. This is useful for batching based on custom metrics such as total nodes in graphs, total tokens in sequences, or other domain-specific measures. If None, the batch size is computed as len(batch). See the second sketch after this list.
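
A sketch of max_concurrent_batches, which lets a second batch start while an earlier one is still in flight. The asyncio.sleep and the deployment name are illustrative stand-ins for real asynchronous work such as a model call:

import asyncio
from typing import List

from ray import serve
from starlette.requests import Request

@serve.deployment
class PipelinedDeployment:
    # Up to two batches run concurrently; a third batch waits until one
    # of the in-flight batches finishes.
    @serve.batch(max_batch_size=8, max_concurrent_batches=2)
    async def batch_handler(self, texts: List[str]) -> List[str]:
        await asyncio.sleep(0.05)  # placeholder for real async work
        return [t.upper() for t in texts]

    async def __call__(self, request: Request) -> str:
        return await self.batch_handler(request.query_params["text"])

app = PipelinedDeployment.bind()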
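
And a sketch of batch_size_fn that uses total whitespace-separated tokens as the effective batch size; the token counting is a stand-in for a real tokenizer, and the handler body is illustrative:

from typing import List

from ray import serve

def total_tokens(batch: List[str]) -> int:
    # The batcher compares this value against max_batch_size, so the
    # cap applies to total tokens in the batch rather than item count.
    return sum(len(s.split()) for s in batch)

@serve.deployment
class TokenBudgetedDeployment:
    @serve.batch(max_batch_size=512, batch_size_fn=total_tokens)
    async def batch_handler(self, prompts: List[str]) -> List[str]:
        # Placeholder for batched model inference.
        return [p.upper() for p in prompts]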