Serve CLI

serve

CLI for managing Serve instances on a Ray cluster.

serve [OPTIONS] COMMAND [ARGS]...

build

Imports the ClassNode or FunctionNode at IMPORT_PATH and generates a structured config for it that can be used by serve deploy or the REST API.

serve build [OPTIONS] IMPORT_PATH

Options

-d, --app-dir <app_dir>

Local directory in which to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-k, --kubernetes_format

Print Serve config in Kubernetes format.

-o, --output-path <output_path>

Local path where the output config will be written in YAML format. If not provided, the config will be printed to STDOUT.

Arguments

IMPORT_PATH

Required argument
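The options above can be combined on one command line. The following is a minimal sketch that only assembles the argument list for serve build and does not execute it. The helper name build_command, the import path my_script:my_bound_deployment, and the file names are hypothetical, chosen for illustration.

```python
import shlex


def build_command(import_path, app_dir=None, output_path=None, kubernetes_format=False):
    """Assemble an argv list for `serve build` using only the documented options."""
    argv = ["serve", "build"]
    if app_dir is not None:
        argv += ["--app-dir", app_dir]
    if kubernetes_format:
        argv.append("--kubernetes_format")
    if output_path is not None:
        argv += ["--output-path", output_path]
    # IMPORT_PATH is the single required positional argument.
    argv.append(import_path)
    return argv


# Hypothetical import path and file names.
cmd = build_command(
    "my_script:my_bound_deployment",
    app_dir="./src",
    output_path="serve_config.yaml",
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run on a machine where the serve CLI is installed. Omitting output_path leaves the config on STDOUT, as described above.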

config

Get the current config of the running Serve app.

serve config [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
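The address lookup described above can be modeled as a small fallback chain. This sketch assumes that an explicitly passed --address takes precedence over RAY_AGENT_ADDRESS, which is the usual convention for CLI tools but is not stated in this reference. The function name resolve_agent_address is hypothetical.

```python
import os

# Default documented above.
DEFAULT_AGENT_ADDRESS = "http://localhost:52365"


def resolve_agent_address(cli_value=None, environ=None):
    """Pick the dashboard agent address: flag, then environment, then default.

    The flag-over-environment precedence is an assumption, not taken from
    the reference text.
    """
    environ = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    return environ.get("RAY_AGENT_ADDRESS") or DEFAULT_AGENT_ADDRESS


print(resolve_agent_address(environ={}))
print(resolve_agent_address(environ={"RAY_AGENT_ADDRESS": "http://10.0.0.5:52365"}))
```

The same lookup applies to every command here that documents the RAY_AGENT_ADDRESS variable (config, deploy, shutdown, and status).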

deploy

Deploys deployment(s) from a YAML config file.

This call is async; a successful response only indicates that the request was sent to the Ray cluster successfully. It does not mean that the deployments have been deployed or updated.

Existing deployments with no code changes will not be redeployed.

Use serve config to fetch the current config and serve status to check the status of the deployments after deploying.

serve deploy [OPTIONS] CONFIG_FILE_NAME

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

Arguments

CONFIG_FILE_NAME

Required argument
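Because serve deploy returns before the deployments are ready, a script typically has to poll serve status afterwards. The sketch below shows one way to structure that loop. The helper names, the timeout values, and the predicate are hypothetical. The output format of serve status is not specified in this reference, so deciding what counts as "ready" is left to a caller-supplied predicate.

```python
import subprocess
import time


def serve_status_text(address=None):
    """Run `serve status` and return its raw stdout (requires the serve CLI)."""
    argv = ["serve", "status"]
    if address is not None:
        argv += ["--address", address]
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


def wait_until(predicate, fetch=serve_status_text, timeout_s=60.0, interval_s=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until predicate(text) is true or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if predicate(fetch()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demonstration with canned status text instead of a live cluster.
canned = iter(["UPDATING", "UPDATING", "HEALTHY"])
done = wait_until(
    lambda text: text.split() == ["HEALTHY"],  # hypothetical readiness check
    fetch=lambda: next(canned),
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(done)
```

When writing a real predicate, match whole status tokens rather than substrings, since "HEALTHY" is contained in "UNHEALTHY".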

run

Runs the Serve app from the specified import path (e.g. my_script:my_bound_deployment) or YAML config.

If using a YAML config, existing deployments with no code changes will not be redeployed.

Any import path must lead to a FunctionNode or ClassNode object. By default, this command blocks and periodically logs status. Pressing Ctrl-C tears down the app.

serve run [OPTIONS] CONFIG_OR_IMPORT_PATH

Options

--runtime-env <runtime_env>

Path to a local YAML file containing a runtime_env definition. This will be passed to ray.init() as the default for deployments.

--runtime-env-json <runtime_env_json>

JSON-serialized runtime_env dictionary. This will be passed to ray.init() as the default for deployments.

--working-dir <working_dir>

Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). This overrides the working_dir in --runtime-env if both are specified. This will be passed to ray.init() as the default for deployments.

-d, --app-dir <app_dir>

Local directory in which to look for the import path (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-a, --address <address>

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

-h, --host <host>

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

-p, --port <port>

Port for HTTP servers to listen on. Defaults to 8000.

--blocking, --non-blocking

Whether or not this command should be blocking. If blocking, it loops and logs status until interrupted with Ctrl-C, then cleans up the app.

--gradio

Whether to enable Gradio visualization of the deployment graph.

Arguments

CONFIG_OR_IMPORT_PATH

Required argument
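The --runtime-env-json option expects a JSON-serialized dictionary, and --working-dir overrides any working_dir inside the runtime environment. The sketch below assembles such a command line without executing it and models the override rule in plain Python. It is an illustration of the documented behavior, not Ray's own implementation. The helper names, the paths, and the MODE variable are hypothetical.

```python
import json
import shlex


def effective_runtime_env(runtime_env, working_dir=None):
    """Model the documented rule: --working-dir wins over runtime_env's working_dir."""
    merged = dict(runtime_env)  # copy so the caller's dict is left untouched
    if working_dir is not None:
        merged["working_dir"] = working_dir
    return merged


def run_command(target, runtime_env=None, working_dir=None):
    """Assemble an argv list for `serve run` using only the documented options."""
    argv = ["serve", "run"]
    if runtime_env is not None:
        argv += ["--runtime-env-json", json.dumps(runtime_env)]
    if working_dir is not None:
        argv += ["--working-dir", working_dir]
    argv.append(target)
    return argv


# Hypothetical runtime environment and import path.
env = {"working_dir": "./app", "env_vars": {"MODE": "dev"}}
argv = run_command("my_script:my_bound_deployment", runtime_env=env, working_dir="./override")
print(shlex.join(argv))
print(effective_runtime_env(env, "./override"))
```

Building the command as a list and quoting it with shlex.join avoids hand-escaping the JSON when pasting it into a shell.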

shutdown

Deletes the Serve app.

serve shutdown [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-y, --yes

Bypass confirmation prompt.

start

Start a detached Serve instance on the Ray cluster.

serve start [OPTIONS]

Options

-a, --address <address>

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

--http-host <http_host>

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

--http-port <http_port>

Port for HTTP servers to listen on. Defaults to 8000.

--http-location <http_location>

Location of the HTTP servers. Defaults to HeadOnly.

Options: DeploymentMode.NoServer | DeploymentMode.HeadOnly | DeploymentMode.EveryNode | DeploymentMode.FixedNumber

status

Prints status information about all deployments in the Serve app.

Deployments may be:

  • HEALTHY: all replicas are acting normally and passing their health checks.

  • UNHEALTHY: at least one replica is not acting normally and may not be passing its health check.

  • UPDATING: the deployment is updating.

serve status [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
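The three deployment states listed above can be represented as an enumeration when post-processing status information in a script. This sketch assumes the caller has already extracted a mapping from deployment name to status string. That mapping, the deployment names, and the "worst state wins" aggregation rule are all hypothetical and are not part of the CLI.

```python
from enum import Enum


class DeploymentStatus(Enum):
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"
    UPDATING = "UPDATING"


def summarize(statuses):
    """Collapse per-deployment statuses into one overall status.

    Assumed rule: any UNHEALTHY deployment makes the app UNHEALTHY, otherwise
    any UPDATING deployment makes it UPDATING, otherwise it is HEALTHY.
    An empty mapping is reported as HEALTHY.
    """
    parsed = [DeploymentStatus(value) for value in statuses.values()]
    if DeploymentStatus.UNHEALTHY in parsed:
        return DeploymentStatus.UNHEALTHY
    if DeploymentStatus.UPDATING in parsed:
        return DeploymentStatus.UPDATING
    return DeploymentStatus.HEALTHY


# Hypothetical deployment names.
overall = summarize({"ingress": "HEALTHY", "model": "UPDATING"})
print(overall.name)
```

Parsing through the enum constructor raises ValueError on any unrecognized status string, which surfaces format changes early instead of silently treating them as healthy.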