ray.serve.Deployment

class ray.serve.Deployment

Class (or function) decorated with the @serve.deployment decorator.

A deployment runs on one or more replica actors. Each request routed to a replica invokes the wrapped class (or function).

One or more deployments can be composed into an Application, which is then run via serve.run or a config file.

Example:

from ray import serve

@serve.deployment
class MyDeployment:
    def __init__(self, name: str):
        self._name = name

    def __call__(self, request):
        return "Hello world!"

# Bind the constructor argument; the value here is only a placeholder.
app = MyDeployment.bind("my_deployment")
# Run via `serve.run` or the `serve run` CLI command.
serve.run(app)

property name: str

Unique name of this deployment.

property func_or_class: Union[Callable, str]

Underlying class or function that this deployment wraps.

property num_replicas: int

Current target number of replicas.

property user_config: Any

Current dynamic user-provided config options.

property max_concurrent_queries: int

Current maximum number of outstanding queries from each handle.

property route_prefix: Optional[str]

HTTP route prefix that this deployment is exposed under.

property ray_actor_options: Optional[Dict]

Actor options such as resources required for each replica.

bind(*args, **kwargs) -> ray.serve.deployment.Application

Bind the arguments to the deployment and return an Application.

The returned Application can be deployed using serve.run (or via config file) or bound to another deployment for composition.

options(
    func_or_class: Optional[Callable] = None,
    name: Union[ray.serve._private.utils.DEFAULT, str] = DEFAULT.VALUE,
    version: Union[ray.serve._private.utils.DEFAULT, str] = DEFAULT.VALUE,
    num_replicas: Optional[Union[ray.serve._private.utils.DEFAULT, int]] = DEFAULT.VALUE,
    init_args: Union[ray.serve._private.utils.DEFAULT, Tuple[Any]] = DEFAULT.VALUE,
    init_kwargs: Union[ray.serve._private.utils.DEFAULT, Dict[Any, Any]] = DEFAULT.VALUE,
    route_prefix: Optional[Union[ray.serve._private.utils.DEFAULT, str]] = DEFAULT.VALUE,
    ray_actor_options: Optional[Union[ray.serve._private.utils.DEFAULT, Dict]] = DEFAULT.VALUE,
    placement_group_bundles: Optional[List[Dict[str, float]]] = DEFAULT.VALUE,
    placement_group_strategy: Optional[str] = DEFAULT.VALUE,
    max_replicas_per_node: Optional[int] = DEFAULT.VALUE,
    user_config: Optional[Union[ray.serve._private.utils.DEFAULT, Any]] = DEFAULT.VALUE,
    max_concurrent_queries: Union[ray.serve._private.utils.DEFAULT, int] = DEFAULT.VALUE,
    autoscaling_config: Optional[Union[ray.serve._private.utils.DEFAULT, Dict, ray.serve.config.AutoscalingConfig]] = DEFAULT.VALUE,
    graceful_shutdown_wait_loop_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    graceful_shutdown_timeout_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    health_check_period_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    health_check_timeout_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    _internal: bool = False,
) -> ray.serve.deployment.Deployment

Return a copy of this deployment with updated options.

Only the options passed in are updated; all others remain unchanged from the existing deployment.

Refer to the @serve.deployment decorator docs for available arguments.

set_options(
    func_or_class: Optional[Callable] = None,
    name: Union[ray.serve._private.utils.DEFAULT, str] = DEFAULT.VALUE,
    version: Union[ray.serve._private.utils.DEFAULT, str] = DEFAULT.VALUE,
    num_replicas: Optional[Union[ray.serve._private.utils.DEFAULT, int]] = DEFAULT.VALUE,
    init_args: Union[ray.serve._private.utils.DEFAULT, Tuple[Any]] = DEFAULT.VALUE,
    init_kwargs: Union[ray.serve._private.utils.DEFAULT, Dict[Any, Any]] = DEFAULT.VALUE,
    route_prefix: Optional[Union[ray.serve._private.utils.DEFAULT, str]] = DEFAULT.VALUE,
    ray_actor_options: Optional[Union[ray.serve._private.utils.DEFAULT, Dict]] = DEFAULT.VALUE,
    user_config: Optional[Union[ray.serve._private.utils.DEFAULT, Any]] = DEFAULT.VALUE,
    max_concurrent_queries: Union[ray.serve._private.utils.DEFAULT, int] = DEFAULT.VALUE,
    autoscaling_config: Optional[Union[ray.serve._private.utils.DEFAULT, Dict, ray.serve.config.AutoscalingConfig]] = DEFAULT.VALUE,
    graceful_shutdown_wait_loop_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    graceful_shutdown_timeout_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    health_check_period_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    health_check_timeout_s: Union[ray.serve._private.utils.DEFAULT, float] = DEFAULT.VALUE,
    _internal: bool = False,
) -> None

Overwrite this deployment’s options in-place.

Only the options passed in are updated; all others remain unchanged.

Refer to the @serve.deployment decorator docstring for all non-private arguments.

Warning

DEPRECATED: This API is deprecated and may be removed in future Ray releases. It was intended for use with the serve.build Python API, which has also been deprecated. Use options() instead.