ray.serve.config.RequestRouterConfig
class ray.serve.config.RequestRouterConfig(*, request_router_class: str | Callable = 'ray.serve._private.request_router:PowerOfTwoChoicesRequestRouter', request_router_kwargs: Dict[str, Any] = None, request_routing_stats_period_s: PositiveFloat = 10, request_routing_stats_timeout_s: PositiveFloat = 30)
Bases: BaseModel
Config for the Serve request router.
This class configures how Ray Serve routes requests to deployment replicas. The router is responsible for selecting which replica should handle each incoming request based on the configured routing policy. You can customize the routing behavior by specifying a custom request router class and providing configuration parameters.
The router also manages periodic health checks and scheduling statistics collection from replicas to make informed routing decisions.
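The default `request_router_class` in the signature above is named `PowerOfTwoChoicesRequestRouter`. As a rough illustration of the general power-of-two-choices idea (not Ray Serve's actual implementation), the sketch below samples two replicas at random and picks the one with fewer in-flight requests. The function `pick_replica`, the `queue_lengths` mapping, and the replica IDs are hypothetical names introduced only for this example.

```python
import random
from typing import Dict, Optional


def pick_replica(
    queue_lengths: Dict[str, int],
    rng: Optional[random.Random] = None,
) -> str:
    """Sample two distinct replicas at random and return the less loaded one.

    Illustrative sketch only; `queue_lengths` is a hypothetical mapping of
    replica ID to the number of requests currently in flight on it.
    """
    if rng is None:
        rng = random.Random()
    replica_ids = list(queue_lengths)
    if not replica_ids:
        raise ValueError("no replicas available")
    if len(replica_ids) == 1:
        return replica_ids[0]
    first, second = rng.sample(replica_ids, 2)
    # Ties go to the first sampled candidate.
    return first if queue_lengths[first] <= queue_lengths[second] else second


# Hypothetical snapshot of in-flight requests per replica.
queue_lengths = {"replica-a": 4, "replica-b": 0, "replica-c": 9}
chosen = pick_replica(queue_lengths, random.Random(0))
```

With three replicas, the most loaded one can never win a comparison, so it is never selected, yet the router only has to inspect two replicas per request rather than all of them.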
Example
from ray.serve.config import RequestRouterConfig, DeploymentConfig
from ray import serve

# Use default router with custom stats collection interval
request_router_config = RequestRouterConfig(
    request_routing_stats_period_s=5.0,
    request_routing_stats_timeout_s=15.0
)

# Use custom router class
request_router_config = RequestRouterConfig(
    request_router_class="ray.llm._internal.serve.request_router.prefix_aware.prefix_aware_router.PrefixAwarePow2ReplicaRouter",
    request_router_kwargs={"imbalanced_threshold": 20}
)

deployment_config = DeploymentConfig(
    request_router_config=request_router_config
)

deployment = serve.deploy(
    "my_deployment",
    deployment_config=deployment_config
)
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
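To make the two timing parameters more concrete, here is a minimal, library-independent sketch of periodic statistics collection. It assumes that `request_routing_stats_period_s` is the interval between collection calls and that `request_routing_stats_timeout_s` bounds how long a single call may take before it counts as failed; the parameter names suggest this reading, but this page does not spell it out. The helpers `poll_stats_once`, `poll_stats`, `fast_replica`, and `slow_replica` are hypothetical and do not exist in Ray.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

Stats = Dict[str, float]


async def poll_stats_once(
    fetch_stats: Callable[[], Awaitable[Stats]],
    timeout_s: float,
) -> Optional[Stats]:
    """Fetch stats from one replica; return None if the call times out."""
    try:
        return await asyncio.wait_for(fetch_stats(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def poll_stats(
    fetch_stats: Callable[[], Awaitable[Stats]],
    period_s: float,
    timeout_s: float,
    rounds: int,
) -> List[Optional[Stats]]:
    """Poll a replica `rounds` times, sleeping `period_s` between polls."""
    results: List[Optional[Stats]] = []
    for _ in range(rounds):
        results.append(await poll_stats_once(fetch_stats, timeout_s))
        await asyncio.sleep(period_s)
    return results


async def fast_replica() -> Stats:
    # Hypothetical replica that answers immediately.
    return {"queue_len": 2.0}


async def slow_replica() -> Stats:
    # Hypothetical replica that takes longer than the timeout used below.
    await asyncio.sleep(1.0)
    return {"queue_len": 7.0}


# Short durations keep the demo quick; the defaults on this page are 10 s and 30 s.
fast_results = asyncio.run(
    poll_stats(fast_replica, period_s=0.01, timeout_s=0.05, rounds=3)
)
slow_results = asyncio.run(
    poll_stats(slow_replica, period_s=0.01, timeout_s=0.05, rounds=2)
)
```

In this sketch a timed-out poll yields `None`, so a router built on it would have to fall back to older statistics or treat the replica's load as unknown.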
Methods
Initialize RequestRouterConfig with the given parameters.
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data.
Duplicate a model, optionally choose which fields to include, exclude and change.
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
Deserialize the request router from cloudpickled bytes.
Generate a JSON representation of the model, include and exclude arguments as per dict().
Try to update ForwardRefs on fields based on this Model, globalns and localns.
Attributes