ray.serve.config.AutoscalingConfig

pydantic model ray.serve.config.AutoscalingConfig

Config for the Serve Autoscaler.

JSON schema:
{
   "title": "AutoscalingConfig",
   "description": "Config for the Serve Autoscaler.",
   "type": "object",
   "properties": {
      "min_replicas": {
         "title": "Min Replicas",
         "default": 1,
         "minimum": 0,
         "type": "integer"
      },
      "initial_replicas": {
         "title": "Initial Replicas",
         "minimum": 0,
         "type": "integer"
      },
      "max_replicas": {
         "title": "Max Replicas",
         "default": 1,
         "exclusiveMinimum": 0,
         "type": "integer"
      },
      "target_num_ongoing_requests_per_replica": {
         "title": "Target Num Ongoing Requests Per Replica",
         "default": 1.0,
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "metrics_interval_s": {
         "title": "Metrics Interval S",
         "default": 10.0,
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "look_back_period_s": {
         "title": "Look Back Period S",
         "default": 30.0,
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "smoothing_factor": {
         "title": "Smoothing Factor",
         "default": 1.0,
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "upscale_smoothing_factor": {
         "title": "Upscale Smoothing Factor",
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "downscale_smoothing_factor": {
         "title": "Downscale Smoothing Factor",
         "exclusiveMinimum": 0,
         "type": "number"
      },
      "downscale_delay_s": {
         "title": "Downscale Delay S",
         "default": 600.0,
         "minimum": 0,
         "type": "number"
      },
      "upscale_delay_s": {
         "title": "Upscale Delay S",
         "default": 30.0,
         "minimum": 0,
         "type": "number"
      }
   }
}
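
The schema uses two kinds of lower bound: `minimum` (the value may equal 0) and `exclusiveMinimum` (the value must be strictly greater than 0). The following is a minimal sketch, in plain Python, of checking a candidate settings dictionary against those bounds. `BOUNDS` and `check_bounds` are hypothetical names introduced for illustration; they are not part of Ray Serve, which performs this validation through pydantic.

```python
# Lower bounds copied from the JSON schema above.
# Each entry is (bound, exclusive); exclusive=True means value must be > bound.
BOUNDS = {
    "min_replicas": (0, False),
    "initial_replicas": (0, False),
    "max_replicas": (0, True),
    "target_num_ongoing_requests_per_replica": (0, True),
    "metrics_interval_s": (0, True),
    "look_back_period_s": (0, True),
    "smoothing_factor": (0, True),
    "upscale_smoothing_factor": (0, True),
    "downscale_smoothing_factor": (0, True),
    "downscale_delay_s": (0, False),
    "upscale_delay_s": (0, False),
}


def check_bounds(settings):
    """Return a list of bound violations for the given settings (empty if none)."""
    errors = []
    for name, (bound, exclusive) in BOUNDS.items():
        value = settings.get(name)
        if value is None:
            # Field not supplied (or explicitly None): nothing to check.
            continue
        if exclusive and not value > bound:
            errors.append(f"{name} must be > {bound}, got {value}")
        elif not exclusive and value < bound:
            errors.append(f"{name} must be >= {bound}, got {value}")
    return errors
```

For example, `check_bounds({"min_replicas": 0})` reports no violation because `min_replicas` uses `minimum`, whereas `check_bounds({"max_replicas": 0})` reports one because `max_replicas` uses `exclusiveMinimum`.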

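Fields:
  • downscale_delay_s
  • downscale_smoothing_factor
  • initial_replicas
  • look_back_period_s
  • max_replicas
  • metrics_interval_s
  • min_replicas
  • smoothing_factor
  • target_num_ongoing_requests_per_replica
  • upscale_delay_s
  • upscale_smoothing_factor

Validators:
  • replicas_settings_valid » max_replicas

field downscale_delay_s: NonNegativeFloat = 600.0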
Constraints:
  • minimum = 0

field downscale_smoothing_factor: PositiveFloat | None = None
Constraints:
  • exclusiveMinimum = 0

field initial_replicas: NonNegativeInt | None = None
Constraints:
  • minimum = 0

field look_back_period_s: PositiveFloat = 30.0
Constraints:
  • exclusiveMinimum = 0

field max_replicas: PositiveInt = 1
Constraints:
  • exclusiveMinimum = 0

Validated by:
  • replicas_settings_valid

field metrics_interval_s: PositiveFloat = 10.0
Constraints:
  • exclusiveMinimum = 0

field min_replicas: NonNegativeInt = 1
Constraints:
  • minimum = 0

field smoothing_factor: PositiveFloat = 1.0
Constraints:
  • exclusiveMinimum = 0

field target_num_ongoing_requests_per_replica: PositiveFloat = 1.0
Constraints:
  • exclusiveMinimum = 0

field upscale_delay_s: NonNegativeFloat = 30.0
Constraints:
  • minimum = 0

field upscale_smoothing_factor: PositiveFloat | None = None
Constraints:
  • exclusiveMinimum = 0
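
`upscale_smoothing_factor` and `downscale_smoothing_factor` default to `None`, while `get_upscale_smoothing_factor()` and `get_downscale_smoothing_factor()` are typed to return a `PositiveFloat`. One plausible reading, stated here as an assumption rather than as Ray's confirmed implementation, is that each getter falls back to the general `smoothing_factor` when its direction-specific value is unset. A minimal sketch of that fallback, using a hypothetical `SmoothingSettings` dataclass in place of the real model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmoothingSettings:
    """Hypothetical stand-in for the three smoothing fields of the config."""

    smoothing_factor: float = 1.0
    upscale_smoothing_factor: Optional[float] = None
    downscale_smoothing_factor: Optional[float] = None

    def get_upscale_smoothing_factor(self) -> float:
        # Assumed behavior: prefer the upscale-specific value when it is set.
        if self.upscale_smoothing_factor is not None:
            return self.upscale_smoothing_factor
        return self.smoothing_factor

    def get_downscale_smoothing_factor(self) -> float:
        # Assumed behavior: prefer the downscale-specific value when it is set.
        if self.downscale_smoothing_factor is not None:
            return self.downscale_smoothing_factor
        return self.smoothing_factor
```

Under this assumption, setting only `smoothing_factor` affects both directions, and a direction-specific field overrides it for that direction alone.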

classmethod default()

get_downscale_smoothing_factor() → PositiveFloat

get_policy() → Callable

Deserialize policy from cloudpickled bytes.

get_upscale_smoothing_factor() → PositiveFloat

validator replicas_settings_valid  »  max_replicas
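
The `replicas_settings_valid` validator is attached to `max_replicas`, but this page does not spell out the rule it enforces. The sketch below assumes the natural ordering constraint, namely `min_replicas <= initial_replicas <= max_replicas` with `initial_replicas` optional. `check_replica_settings` is a hypothetical plain-Python function for illustration, not the pydantic validator itself.

```python
from typing import Optional


def check_replica_settings(
    min_replicas: int,
    max_replicas: int,
    initial_replicas: Optional[int] = None,
) -> int:
    """Return max_replicas if the replica bounds are mutually consistent.

    Assumed rule: min_replicas <= max_replicas, and initial_replicas
    (when given) lies between the two, inclusive.
    """
    if max_replicas < min_replicas:
        raise ValueError(
            f"max_replicas ({max_replicas}) must be >= min_replicas ({min_replicas})"
        )
    if initial_replicas is not None:
        if initial_replicas < min_replicas:
            raise ValueError(
                f"initial_replicas ({initial_replicas}) must be >= "
                f"min_replicas ({min_replicas})"
            )
        if initial_replicas > max_replicas:
            raise ValueError(
                f"initial_replicas ({initial_replicas}) must be <= "
                f"max_replicas ({max_replicas})"
            )
    return max_replicas
```

Note that with the defaults `min_replicas = 1` and `max_replicas = 1`, raising only `min_replicas` would violate this assumed rule, so both bounds are typically set together.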
serialize_policy() → None

Serialize policy with cloudpickle.

Import the policy if it’s passed in as a string import path. Then cloudpickle the policy and set serialized_policy_def if it’s empty.
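
The flow described above (resolve a string import path, pickle the callable, store the bytes only if none are stored yet, and later unpickle them) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Ray's implementation: `pickle` stands in for cloudpickle, the `"module:attribute"` path format is assumed, and `PolicyHolder` and `import_from_path` are hypothetical names.

```python
import importlib
import pickle
from typing import Callable, Optional, Union


def import_from_path(path: str) -> Callable:
    """Resolve a 'module:attribute' string to the object it names (assumed format)."""
    module_name, _, attr_name = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


class PolicyHolder:
    """Hypothetical container mirroring serialize_policy() and get_policy()."""

    def __init__(self, policy: Union[str, Callable]):
        self.policy = policy
        self.serialized_policy_def: Optional[bytes] = None

    def serialize_policy(self) -> None:
        policy = self.policy
        # Import the policy if it was passed in as a string import path.
        if isinstance(policy, str):
            policy = import_from_path(policy)
        # Only set the serialized bytes if they are still empty.
        if not self.serialized_policy_def:
            # Stand-in for cloudpickle; stdlib pickle handles importable functions.
            self.serialized_policy_def = pickle.dumps(policy)

    def get_policy(self) -> Callable:
        # Deserialize the policy from the stored bytes.
        return pickle.loads(self.serialized_policy_def)
```

A round trip then looks like `holder = PolicyHolder("math:ceil"); holder.serialize_policy(); holder.get_policy()`, which yields the `math.ceil` function again. Unlike cloudpickle, the standard `pickle` module cannot serialize lambdas or locally defined functions, which is one reason a library would choose cloudpickle for this step.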