ray.serve.request_router.RequestRouter.choose_replicas

abstract async RequestRouter.choose_replicas(candidate_replicas: List[RunningReplica], pending_request: PendingRequest | None = None) -> List[List[RunningReplica]]

Chooses a subset of candidate replicas from available replicas.

This is the main method each request router must implement to decide which replicas a request can be sent to. Each call performs one iteration of replica selection.

Parameters:
  • candidate_replicas – A list of candidate replicas to be considered in the policy.

  • pending_request – The request to be routed. This is used to determine which replicas are eligible for routing.

Returns:

A list of lists of replicas, where each inner list represents a rank of replicas. The first rank is the most preferred and the last rank is the least preferred.
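
The ranked return value can be illustrated with a minimal, self-contained sketch. It does not import Ray: `FakeReplica`, `FakeRequest`, `NodeAffinityRouter`, and the `preferred_node_id` field are hypothetical stand-ins invented for this example, and only the shape of the `choose_replicas` signature and its list-of-ranks return value follow the description above. In a real router the method would presumably be defined on a `RequestRouter` subclass and receive `RunningReplica` and `PendingRequest` objects instead.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FakeReplica:
    """Hypothetical stand-in for RunningReplica (not a Ray class)."""
    replica_id: str
    node_id: str


@dataclass
class FakeRequest:
    """Hypothetical stand-in for PendingRequest (not a Ray class)."""
    preferred_node_id: Optional[str] = None


class NodeAffinityRouter:
    """Illustrative only: mirrors the documented choose_replicas signature."""

    async def choose_replicas(
        self,
        candidate_replicas: List[FakeReplica],
        pending_request: Optional[FakeRequest] = None,
    ) -> List[List[FakeReplica]]:
        # Without a request (or a node preference) there is nothing to rank by,
        # so all candidates are returned as a single rank.
        if pending_request is None or pending_request.preferred_node_id is None:
            return [list(candidate_replicas)]

        preferred = pending_request.preferred_node_id
        same_node = [r for r in candidate_replicas if r.node_id == preferred]
        other_nodes = [r for r in candidate_replicas if r.node_id != preferred]

        # First rank is the most preferred, last rank the least preferred.
        # Dropping empty ranks is a choice made for this sketch.
        return [rank for rank in (same_node, other_nodes) if rank]
```

Calling `asyncio.run(NodeAffinityRouter().choose_replicas(replicas, FakeRequest(preferred_node_id="n1")))` returns the replicas on node `n1` as the first rank and all remaining replicas as the second rank.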