Serve CLI#

serve#

CLI for managing Serve instances on a Ray cluster.

serve [OPTIONS] COMMAND [ARGS]...

build#

Imports the ClassNode(s) or FunctionNode(s) at IMPORT_PATH(S) and generates a structured config for them. If the --multi-app flag is set, the command accepts multiple ClassNodes/FunctionNodes and generates a multi-application config. The config produced by this command can be used by serve deploy or the REST API.

serve build [OPTIONS] IMPORT_PATHS...

Options

-d, --app-dir <app_dir>#

Local directory to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-k, --kubernetes_format#

Print Serve config in Kubernetes format.

-o, --output-path <output_path>#

Local path where the output config will be written in YAML format. If not provided, the config will be printed to STDOUT.

-m, --multi-app#

Generate a multi-application config from multiple targets.

Arguments

IMPORT_PATHS#

Required argument(s)
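
When scripting serve build, the options above map onto an argument list. The following is a minimal sketch of a hypothetical helper (the name build_command and its parameters are illustrative and not part of Ray) that assembles such a list for use with subprocess.run. It assumes, based on the description above, that more than one import path is only accepted together with --multi-app. It builds the arguments and does not invoke Serve.

```python
from typing import List, Optional, Sequence


def build_command(
    import_paths: Sequence[str],
    app_dir: Optional[str] = None,
    output_path: Optional[str] = None,
    multi_app: bool = False,
) -> List[str]:
    """Assemble an argument list for `serve build` (hypothetical helper)."""
    if not import_paths:
        raise ValueError("serve build requires at least one import path")
    # Assumption taken from the command description: several targets
    # are only accepted when --multi-app is set.
    if len(import_paths) > 1 and not multi_app:
        raise ValueError("multiple import paths require multi_app=True")

    cmd = ["serve", "build"]
    if multi_app:
        cmd.append("--multi-app")
    if app_dir is not None:
        cmd += ["--app-dir", app_dir]
    if output_path is not None:
        cmd += ["--output-path", output_path]
    cmd += list(import_paths)
    return cmd
```

For example, build_command(["my_script:my_bound_deployment"], output_path="config.yaml") yields a list that can be passed to subprocess.run(..., check=True) on a machine where the serve CLI is installed.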

config#

Gets the current config(s) of Serve application(s) on the cluster.

serve config [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-n, --name <name>#

Name of an application. Only applies to multi-application mode. If set, this will only fetch the config for the specified application.
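
The address lookup described for --address can be mirrored in scripts that wrap the CLI. The sketch below is a hypothetical helper (resolve_agent_address is not a Ray function). It assumes that an explicitly passed address takes precedence over RAY_AGENT_ADDRESS, which is the usual convention for command-line tools but is not stated above.

```python
import os
from typing import Mapping, Optional

# Default taken from the option description above.
DEFAULT_AGENT_ADDRESS = "http://localhost:52365"


def resolve_agent_address(
    address: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the dashboard agent address (hypothetical helper).

    Assumed precedence: explicit argument, then the RAY_AGENT_ADDRESS
    environment variable, then the documented default.
    """
    env = os.environ if environ is None else environ
    if address:
        return address
    return env.get("RAY_AGENT_ADDRESS") or DEFAULT_AGENT_ADDRESS
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.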

deploy#

This command accepts configs in either the ServeApplicationSchema format, which deploys a single application, or the ServeDeploySchema format, which deploys multiple applications.

This call is asynchronous; a successful response only indicates that the request was sent to the Ray cluster successfully. It does not mean that the deployments have been deployed or updated.

Existing deployments with no code changes will not be redeployed.

Use serve config to fetch the current config(s) and serve status to check the status of the application(s) and deployments after deploying.

serve deploy [OPTIONS] CONFIG_FILE_NAME

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

Arguments

CONFIG_FILE_NAME#

Required argument
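
Because serve deploy returns before the deployments are ready, a script usually has to poll for the resulting application status. The following is a minimal sketch of such a loop. The function name wait_until_settled and the get_status callable are hypothetical: get_status stands in for whatever mechanism you use to obtain the application status string (for example, by running serve status and extracting the status), which is not shown here because the output format is not described in this reference. The status names are those listed under the status command.

```python
import time
from typing import Callable

# Statuses after which polling can stop, per the status descriptions
# in this reference.
TERMINAL_STATUSES = {"RUNNING", "DEPLOY_FAILED"}


def wait_until_settled(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the app is RUNNING or DEPLOY_FAILED.

    Raises TimeoutError if neither status is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"application still {status} after {timeout_s}s"
            )
        sleep(poll_interval_s)
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.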

run#

Runs the Serve application from the specified import path (e.g. my_script:my_bound_deployment) or application(s) from a YAML config.

If using a YAML config, existing deployments with no code changes in an application will not be redeployed.

Any import path must lead to a FunctionNode or ClassNode object. By default, this will block and periodically log status. If you Ctrl-C the command, it will tear down the app.

serve run [OPTIONS] CONFIG_OR_IMPORT_PATH

Options

--runtime-env <runtime_env>#

Path to a local YAML file containing a runtime_env definition. This will be passed to ray.init() as the default for deployments.

--runtime-env-json <runtime_env_json>#

JSON-serialized runtime_env dictionary. This will be passed to ray.init() as the default for deployments.

--working-dir <working_dir>#

Directory containing files that your application(s) will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). This overrides the working_dir in --runtime-env if both are specified. This will be passed to ray.init() as the default for deployments.

-d, --app-dir <app_dir>#

Local directory to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-a, --address <address>#

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

-h, --host <host>#

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

-p, --port <port>#

Port for HTTP servers to listen on. Defaults to 8000.

--blocking, --non-blocking#

Whether or not this command should be blocking. If blocking, it will loop and log status until Ctrl-C’d, then clean up the app.

--gradio#

Whether to enable Gradio visualization of the deployment graph. The visualization can only be used with deployment graphs that have DAGDriver as the ingress deployment.

Arguments

CONFIG_OR_IMPORT_PATH#

Required argument
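
The interaction between --working-dir and the runtime_env options can be illustrated with a small sketch. The helper below is hypothetical (effective_runtime_env is not part of Ray); it only models the override rule stated above, where a separately supplied working directory replaces the working_dir entry of the runtime_env, and then serializes the result into a string of the kind --runtime-env-json expects.

```python
import json
from typing import Any, Dict, Optional


def effective_runtime_env(
    runtime_env: Optional[Dict[str, Any]] = None,
    working_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Model the documented rule: working_dir overrides runtime_env's value."""
    merged = dict(runtime_env or {})  # copy so the caller's dict is untouched
    if working_dir is not None:
        merged["working_dir"] = working_dir
    return merged


# Illustrative values only.
env = effective_runtime_env(
    {"working_dir": "./old", "pip": ["requests"]},
    working_dir="./new",
)
runtime_env_json = json.dumps(env, sort_keys=True)
```

The resulting string could then be passed as the value of --runtime-env-json when invoking serve run from a script.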

shutdown#

Deletes the Serve app.

serve shutdown [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-y, --yes#

Bypass confirmation prompt.

start#

Start a detached Serve instance on the Ray cluster.

serve start [OPTIONS]

Options

-a, --address <address>#

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

--http-host <http_host>#

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

--http-port <http_port>#

Port for HTTP servers to listen on. Defaults to 8000.

--http-location <http_location>#

Location of the HTTP servers. Defaults to HeadOnly.

Valid values: DeploymentMode.NoServer | DeploymentMode.HeadOnly | DeploymentMode.EveryNode | DeploymentMode.FixedNumber

status#

Prints status information about all applications on the cluster.

An application may be:

  • NOT_STARTED: the application does not exist.

  • DEPLOYING: the deployments in the application are still deploying and haven’t reached the target number of replicas.

  • RUNNING: all deployments are healthy.

  • DEPLOY_FAILED: the application failed to deploy or reach a running state.

  • DELETING: the application is being deleted, and the deployments in the application are being torn down.

The deployments within each application may be:

  • HEALTHY: all replicas are acting normally and passing their health checks.

  • UNHEALTHY: at least one replica is not acting normally and may not be passing its health check.

  • UPDATING: the deployment is updating.

serve status [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-n, --name <name>#

Name of an application. If set, this will display only the status of the specified application.
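
The application and deployment statuses listed for this command can be modeled in code when a script needs to decide whether an application is serving normally. The sketch below defines local enums that mirror the names listed above; they are illustrative stand-ins, not Ray's own classes, and the is_serving_normally function is a hypothetical helper. It treats an application as serving normally only if it is RUNNING and every deployment is HEALTHY, which is one reasonable reading of the descriptions above.

```python
from enum import Enum
from typing import Iterable


class ApplicationStatus(str, Enum):
    # Local mirror of the application statuses listed above.
    NOT_STARTED = "NOT_STARTED"
    DEPLOYING = "DEPLOYING"
    RUNNING = "RUNNING"
    DEPLOY_FAILED = "DEPLOY_FAILED"
    DELETING = "DELETING"


class DeploymentStatus(str, Enum):
    # Local mirror of the deployment statuses listed above.
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"
    UPDATING = "UPDATING"


def is_serving_normally(app_status: str, deployment_statuses: Iterable[str]) -> bool:
    """True if the app is RUNNING and every deployment is HEALTHY.

    Unknown status strings raise ValueError rather than being ignored.
    """
    app = ApplicationStatus(app_status)
    deployments = [DeploymentStatus(s) for s in deployment_statuses]
    return app is ApplicationStatus.RUNNING and all(
        d is DeploymentStatus.HEALTHY for d in deployments
    )
```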