ray.serve.request_router.LocalityMixin.apply_locality_routing

LocalityMixin.apply_locality_routing(pending_request: PendingRequest | None = None) → Set[ReplicaID]

Apply locality routing to the pending request.

When the request is None, return all replicas. Otherwise, each call tries to route the request to replicas in priority order: first those on the same node, then those in the same availability zone, and finally all replicas.

Parameters:

pending_request – The pending request to be routed.

Returns:

A set of replica IDs that are candidates based on the locality policy.
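
The priority order described above can be illustrated with a minimal, standalone sketch. It does not use Ray and is not the library's implementation: `FakeRequest`, `pick_candidates`, and the `tried_same_node` / `tried_same_az` flags are hypothetical names, and the idea that each call advances to the next locality tier by recording what was already tried on the request is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a pending request (not Ray's PendingRequest)."""

    tried_same_node: bool = False  # assumed per-request bookkeeping
    tried_same_az: bool = False    # assumed per-request bookkeeping


def pick_candidates(
    request: Optional[FakeRequest],
    same_node: Set[str],
    same_az: Set[str],
    all_replicas: Set[str],
) -> Set[str]:
    """Return candidate replica IDs in locality priority order."""
    # No request: every replica is a candidate.
    if request is None:
        return set(all_replicas)

    # Tier 1: replicas on the same node, if any exist and were not yet tried.
    if not request.tried_same_node and same_node:
        request.tried_same_node = True
        return set(same_node)

    # Tier 2: replicas in the same availability zone.
    if not request.tried_same_az and same_az:
        request.tried_same_az = True
        return set(same_az)

    # Tier 3: fall back to all replicas.
    return set(all_replicas)


if __name__ == "__main__":
    node, az, everything = {"r1"}, {"r1", "r2"}, {"r1", "r2", "r3"}
    req = FakeRequest()
    for _ in range(3):
        # Each call widens the candidate set by one locality tier.
        print(sorted(pick_candidates(req, node, az, everything)))
```

In this sketch, repeated calls with the same request object widen the candidate set one tier at a time, and a tier with no replicas is skipped.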