ray.serve.Deployment

class ray.serve.Deployment

Class (or function) decorated with the @serve.deployment decorator.

Serve runs the decorated class or function on a number of replica actors. Requests routed to those replicas invoke it.

One or more deployments can be composed into an Application, which is then run via serve.run or a config file.

Example:

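from ray import serve

@serve.deployment
class MyDeployment:
    def __init__(self, name: str):
        # Constructor arguments are supplied through .bind().
        self._name = name

    def __call__(self, request):
        return "Hello world!"

# __init__ requires `name`, so it must be passed to bind().
app = MyDeployment.bind("my_deployment")
# Run via `serve.run` or the `serve run` CLI command.
serve.run(app)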

property name: str

Unique name of this deployment.

property func_or_class: Callable | str

Underlying class or function that this deployment wraps.

property num_replicas: int

Target number of replicas.

property user_config: Any

Dynamic user-provided config options.

property max_ongoing_requests: int

Max number of requests a replica can handle at once.

property max_queued_requests: int

Max number of requests that can be queued in each deployment handle.

property ray_actor_options: Dict | None

Actor options such as resources required for each replica.

bind(*args, **kwargs) → Application

Bind the arguments to the deployment and return an Application.

The returned Application can be deployed using serve.run (or via config file) or bound to another deployment for composition.

options(...) → Deployment

Return a copy of this deployment with updated options.

Only the options passed in are updated; all others remain unchanged from the existing deployment.

Refer to the @serve.deployment decorator docs for available arguments.