ray.serve.llm.LLMServer.collective_rpc

async LLMServer.collective_rpc(method: str, timeout: float | None = None, args: tuple = (), kwargs: dict | None = None) → list

Execute a collective RPC call on all workers.

This is used for RLHF workflows where a trainer needs to execute methods on all tensor-parallel (TP) and pipeline-parallel (PP) workers (e.g., for weight synchronization).

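Parameters:

method: Name of the worker method to execute.
timeout: Maximum time, in seconds, to wait for the call to complete. Defaults to None.
args: Positional arguments to pass to the worker method. Defaults to an empty tuple.
kwargs: Keyword arguments to pass to the worker method. Defaults to None.
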
Returns:

A list containing the results from each worker.
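
Example:

The calling convention and the shape of the return value can be illustrated with a minimal sketch. It does not use Ray or a real engine: FakeLLMServer is a hypothetical stand-in that only mirrors the documented signature, and the worker method name update_weights and its version argument are invented for illustration. How a trainer reaches the real LLMServer in a deployment is not shown here.

```python
import asyncio


class FakeLLMServer:
    """Hypothetical stand-in; mirrors only the documented signature."""

    def __init__(self, num_workers: int):
        # Each dict plays the role of one TP/PP worker's state.
        self._workers = [
            {"rank": rank, "weights_version": 0} for rank in range(num_workers)
        ]

    async def collective_rpc(self, method, timeout=None, args=(), kwargs=None):
        kwargs = kwargs or {}

        async def run_on_all_workers():
            # Look up the named method and run it once per worker,
            # collecting one result per worker in rank order.
            fn = getattr(self, "_" + method)
            return [fn(worker, *args, **kwargs) for worker in self._workers]

        # In this sketch the timeout bounds the whole fan-out.
        return await asyncio.wait_for(run_on_all_workers(), timeout=timeout)

    def _update_weights(self, worker, version):
        # Assumed worker-side method: record the new weights version.
        worker["weights_version"] = version
        return (worker["rank"], version)


async def main():
    server = FakeLLMServer(num_workers=4)
    # Same argument layout as the documented method: the worker method
    # name, an optional timeout, then args and kwargs for that method.
    return await server.collective_rpc("update_weights", timeout=5.0, args=(1,))


results = asyncio.run(main())
print(results)
```

The returned list has one entry per worker, which lets the caller confirm that every worker applied the update before continuing.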