ray.serve.Deployment
- class ray.serve.Deployment
Class (or function) decorated with the @serve.deployment decorator.
A deployment runs on a number of replica actors. Requests routed to those replicas call this class (or function).
One or more deployments can be composed together into an Application, which is then run via serve.run or a config file.
Example:
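    from ray import serve

    @serve.deployment
    class MyDeployment:
        def __init__(self, name: str):
            self._name = name

        def __call__(self, request):
            return "Hello world!"

    # __init__ requires `name`, so pass it when binding ("example" is an arbitrary value).
    app = MyDeployment.bind("example")
    # Run via `serve.run` or the `serve run` CLI command.
    serve.run(app)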
- property max_queued_requests: int
Max number of requests that can be queued in each deployment handle.
- bind(*args, **kwargs) → Application
Bind the arguments to the deployment and return an Application.
The returned Application can be deployed using serve.run (or via config file) or bound to another deployment for composition.
- options(func_or_class: Callable | None = None, name: DEFAULT | str = DEFAULT.VALUE, version: DEFAULT | str = DEFAULT.VALUE, num_replicas: DEFAULT | int | str | None = DEFAULT.VALUE, route_prefix: DEFAULT | str | None = DEFAULT.VALUE, ray_actor_options: DEFAULT | Dict | None = DEFAULT.VALUE, placement_group_bundles: DEFAULT | List[Dict[str, float]] = DEFAULT.VALUE, placement_group_strategy: DEFAULT | str = DEFAULT.VALUE, max_replicas_per_node: DEFAULT | int = DEFAULT.VALUE, user_config: DEFAULT | Any | None = DEFAULT.VALUE, max_ongoing_requests: DEFAULT | int = DEFAULT.VALUE, max_queued_requests: DEFAULT | int = DEFAULT.VALUE, autoscaling_config: DEFAULT | Dict | AutoscalingConfig | None = DEFAULT.VALUE, graceful_shutdown_wait_loop_s: DEFAULT | float = DEFAULT.VALUE, graceful_shutdown_timeout_s: DEFAULT | float = DEFAULT.VALUE, health_check_period_s: DEFAULT | float = DEFAULT.VALUE, health_check_timeout_s: DEFAULT | float = DEFAULT.VALUE, logging_config: DEFAULT | Dict | LoggingConfig | None = DEFAULT.VALUE, _init_args: DEFAULT | Tuple[Any] = DEFAULT.VALUE, _init_kwargs: DEFAULT | Dict[Any, Any] = DEFAULT.VALUE, _internal: bool = False) → Deployment
Return a copy of this deployment with updated options.
Only the options passed in are updated; all others remain unchanged from the existing deployment.
Refer to the @serve.deployment decorator docs for available arguments.