ray.serve.config.AutoscalingConfig

pydantic model ray.serve.config.AutoscalingConfig

Config for the Serve Autoscaler.
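
As a rough illustration of how the fields on this page fit together, the sketch below writes a configuration as a plain dictionary whose keys mirror the documented field names. The values are arbitrary examples, and the commented-out decorator usage assumes Ray Serve is installed; it is not executed here.

```python
# Illustrative values only; keys mirror the field names documented on this page.
autoscaling_config = {
    "min_replicas": 1,
    "max_replicas": 10,
    "target_ongoing_requests": 2,
    "upscale_delay_s": 30.0,
    "downscale_delay_s": 600.0,
    "look_back_period_s": 30.0,
}

# With Ray Serve installed, a dict like this would typically be handed to a
# deployment (left as a comment so the snippet runs without Ray):
#
#   from ray import serve
#
#   @serve.deployment(autoscaling_config=autoscaling_config)
#   class MyDeployment:
#       ...

for name in sorted(autoscaling_config):
    print(f"{name} = {autoscaling_config[name]}")
```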

field aggregation_function: str | AggregationFunction = AggregationFunction.MEAN

Function used to aggregate metrics across a time window.

field downscale_delay_s: float = 600.0

How long to wait before scaling down replicas to a value greater than 0.

Constraints:
  • ge = 0

field downscale_smoothing_factor: float | None = None

[DEPRECATED] Please use downscaling_factor instead.

field downscale_to_zero_delay_s: float | None = None

How long to wait before scaling down replicas from 1 to 0. If not set, the value of downscale_delay_s will be used.
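
The fallback described above can be sketched as a small helper. The function name `effective_downscale_to_zero_delay` is hypothetical and is not part of the Serve API; it only illustrates the "use downscale_delay_s when unset" rule.

```python
from typing import Optional


def effective_downscale_to_zero_delay(
    downscale_delay_s: float = 600.0,
    downscale_to_zero_delay_s: Optional[float] = None,
) -> float:
    """Hypothetical helper: delay applied to the 1 -> 0 replica transition."""
    # Compare against None explicitly so an explicit 0.0 is respected
    # rather than being treated as "not set".
    if downscale_to_zero_delay_s is None:
        return downscale_delay_s
    return downscale_to_zero_delay_s


print(effective_downscale_to_zero_delay())
print(effective_downscale_to_zero_delay(downscale_to_zero_delay_s=60.0))
```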

field downscaling_factor: float | None = None

Multiplicative “gain” factor to limit downscaling decisions.
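
One plausible reading of a multiplicative gain is that it damps how far a single decision can move away from the current replica count. The sketch below is an assumption for illustration, not the Serve implementation: `limited_replica_target` and `error_ratio` (observed load divided by target load) are hypothetical names.

```python
import math


def limited_replica_target(
    current_replicas: int,
    error_ratio: float,
    factor: float = 1.0,
) -> int:
    """Hypothetical sketch: damp a scaling decision with a gain factor.

    error_ratio is assumed to be observed load / target load, so a value
    below 1.0 suggests scaling down and a value above 1.0 scaling up.
    """
    # factor = 1.0 applies the full correction; smaller values move the
    # replica count only part of the way toward the undamped target.
    damped_ratio = 1.0 + (error_ratio - 1.0) * factor
    return math.ceil(current_replicas * damped_ratio)


# Load is half of the target with 10 replicas running.
print(limited_replica_target(10, 0.5, factor=1.0))
print(limited_replica_target(10, 0.5, factor=0.5))
```

Under this reading, a factor below 1.0 makes each downscaling step more conservative, which matches the "limit" wording in the field description.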

field initial_replicas: int | None = None

Number of replicas started initially for the deployment. If not set, min_replicas is used.

field look_back_period_s: float = 30.0

Time window to average over for metrics.

Constraints:
  • gt = 0
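
To make the look-back window concrete, the following sketch averages only the samples recorded within the last `look_back_period_s` seconds. The function `windowed_mean` and the `(timestamp, value)` sample layout are assumptions for illustration; how Serve stores its metrics is not described on this page.

```python
from typing import List, Tuple


def windowed_mean(
    samples: List[Tuple[float, float]],
    now_s: float,
    look_back_period_s: float = 30.0,
) -> float:
    """Hypothetical sketch: mean of sample values inside the look-back window.

    samples holds (timestamp_s, value) pairs.
    """
    if look_back_period_s <= 0:
        # Mirrors the documented constraint gt = 0.
        raise ValueError("look_back_period_s must be greater than 0")
    window_start = now_s - look_back_period_s
    recent = [value for ts, value in samples if window_start <= ts <= now_s]
    if not recent:
        return 0.0  # Assumed behaviour when the window holds no samples.
    return sum(recent) / len(recent)


samples = [(0.0, 10.0), (75.0, 2.0), (90.0, 4.0)]
print(windowed_mean(samples, now_s=100.0, look_back_period_s=30.0))
```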

field max_replicas: int = 1

Maximum number of replicas for the deployment.

Constraints:
  • gt = 0

field metrics_interval_s: float = 10.0

[DEPRECATED] How often to scrape for metrics. In a future release, this field will be replaced by the environment variables RAY_SERVE_REPLICA_AUTOSCALING_METRIC_PUSH_INTERVAL_S and RAY_SERVE_HANDLE_AUTOSCALING_METRIC_PUSH_INTERVAL_S.

Constraints:
  • gt = 0

field min_replicas: int = 1

Minimum number of replicas for the deployment.

Constraints:
  • ge = 0

field policy: AutoscalingPolicy [Optional]

The autoscaling policy for the deployment. This option is experimental.

field smoothing_factor: float = 1.0

[DEPRECATED] Smoothing factor for autoscaling decisions.

Constraints:
  • gt = 0

field target_ongoing_requests: float | None = 2

Target average number of ongoing requests per replica that the autoscaler tries to maintain.

field upscale_delay_s: float = 30.0

How long to wait before scaling up replicas.

Constraints:
  • ge = 0
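
One way to picture a scale-up delay is as a gate that only opens once the wish to scale up has persisted for the whole delay. The class below is a minimal sketch under that assumption; `DelayedUpscale` is a hypothetical name, and the way Serve actually tracks the delay is not described on this page.

```python
from typing import Optional


class DelayedUpscale:
    """Hypothetical gate: allow upscaling only after a sustained signal."""

    def __init__(self, upscale_delay_s: float = 30.0) -> None:
        if upscale_delay_s < 0:
            # Mirrors the documented constraint ge = 0.
            raise ValueError("upscale_delay_s must be >= 0")
        self.upscale_delay_s = upscale_delay_s
        self._wanted_since: Optional[float] = None

    def should_upscale(self, wants_upscale: bool, now_s: float) -> bool:
        if not wants_upscale:
            # The signal dropped, so the waiting period starts over.
            self._wanted_since = None
            return False
        if self._wanted_since is None:
            self._wanted_since = now_s
        return now_s - self._wanted_since >= self.upscale_delay_s


gate = DelayedUpscale(upscale_delay_s=30.0)
for t, wants in [(0.0, True), (20.0, True), (25.0, False), (30.0, True), (60.0, True)]:
    print(t, gate.should_upscale(wants, t))
```

In this sketch, a brief dip in load at t = 25 resets the timer, so scaling up is only allowed once the signal has held from t = 30 to t = 60.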

field upscale_smoothing_factor: float | None = None

[DEPRECATED] Please use upscaling_factor instead.

field upscaling_factor: float | None = None

Multiplicative “gain” factor to limit upscaling decisions.

validator aggregation_function_valid  »  aggregation_function

classmethod default()

get_downscaling_factor() → float

get_target_ongoing_requests() → float

get_upscaling_factor() → float

validator look_back_period_s_valid  »  look_back_period_s

validator metrics_interval_s_deprecation_warning  »  metrics_interval_s

validator replicas_settings_valid  »  all fields
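
This page does not spell out what the replica settings validator checks. The sketch below shows the kind of checks such a validator might run: the per-field bounds come from the constraints listed above, while the cross-field rules (max_replicas not below min_replicas, initial_replicas within that range) and the name `check_replica_settings` are assumptions for illustration.

```python
from typing import Optional


def check_replica_settings(
    min_replicas: int = 1,
    max_replicas: int = 1,
    initial_replicas: Optional[int] = None,
) -> None:
    """Hypothetical sketch of replica-setting validation; raises ValueError."""
    # Per-field bounds taken from the documented constraints.
    if min_replicas < 0:
        raise ValueError("min_replicas must be >= 0")
    if max_replicas <= 0:
        raise ValueError("max_replicas must be > 0")
    # Cross-field rules below are assumed, not quoted from this page.
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    if initial_replicas is not None and not (
        min_replicas <= initial_replicas <= max_replicas
    ):
        raise ValueError(
            "initial_replicas must lie between min_replicas and max_replicas"
        )


check_replica_settings(min_replicas=1, max_replicas=5, initial_replicas=3)
print("settings accepted")
```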