ray.serve.llm.LLMServer.record_routing_stats

async LLMServer.record_routing_stats() → Dict[str, Any]

Serve request-router hook, polled by the controller.

Returns this replica’s routing stats, namely the engine’s KV-events endpoint used for KV-aware routing. The deployment’s KVRouterActor reads these stats from the LongPoll replica snapshot to register the worker.
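The flow described above can be illustrated with a minimal, self-contained sketch. It does not import Ray and is not the real LLMServer: the class `SketchLLMServer`, the helper `poll_once`, the dictionary key `kv_events_endpoint`, and the example addresses are all hypothetical names chosen for illustration. It only shows the general shape of an async stats hook that returns a `Dict[str, Any]` and a poller that collects one snapshot per replica.

```python
import asyncio
from typing import Any, Dict


class SketchLLMServer:
    """Hypothetical stand-in for a replica; not the real LLMServer."""

    def __init__(self, kv_events_endpoint: str) -> None:
        # Assumption: the engine exposes one KV-events endpoint as a string.
        self._kv_events_endpoint = kv_events_endpoint

    async def record_routing_stats(self) -> Dict[str, Any]:
        # Assumption: the key name "kv_events_endpoint" is illustrative only.
        return {"kv_events_endpoint": self._kv_events_endpoint}


async def poll_once(
    replicas: Dict[str, SketchLLMServer],
) -> Dict[str, Dict[str, Any]]:
    """Simulate a single poll: gather the stats of every replica by ID."""
    return {
        replica_id: await replica.record_routing_stats()
        for replica_id, replica in replicas.items()
    }


# Two hypothetical replicas with made-up endpoint addresses.
replicas = {
    "replica-0": SketchLLMServer("tcp://10.0.0.1:5557"),
    "replica-1": SketchLLMServer("tcp://10.0.0.2:5557"),
}

# The resulting snapshot maps each replica ID to its reported stats.
snapshot = asyncio.run(poll_once(replicas))
```

In this sketch a consumer such as a router would read `snapshot` to learn which endpoint belongs to which replica. How the real KVRouterActor consumes the LongPoll snapshot is not shown here.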