Serve CLI

serve

CLI for managing Serve instances on a Ray cluster.

serve [OPTIONS] COMMAND [ARGS]...

build

Imports the ClassNode or FunctionNode at IMPORT_PATH and generates a structured config for it that can be used by serve deploy or the REST API.

serve build [OPTIONS] IMPORT_PATH

Options

-d, --app-dir <app_dir>

Local directory to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-o, --output-path <output_path>

Local path where the output config will be written in YAML format. If not provided, the config will be printed to STDOUT.

Arguments

IMPORT_PATH

Required argument
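As an illustration, a script that drives serve build could assemble its argument list as in the sketch below. The helper name serve_build_argv and the output file name serve_config.yaml are hypothetical; only the flags and the ‘main.object’ import path come from the reference above.

```python
from typing import List, Optional


def serve_build_argv(
    import_path: str,
    app_dir: Optional[str] = None,
    output_path: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `serve build` (hypothetical helper)."""
    argv = ["serve", "build"]
    if app_dir is not None:
        # --app-dir is where IMPORT_PATH is looked up; the CLI defaults to '.'.
        argv += ["--app-dir", app_dir]
    if output_path is not None:
        # Without --output-path the config is printed to STDOUT.
        argv += ["--output-path", output_path]
    argv.append(import_path)
    return argv


# Placeholder values; "serve_config.yaml" is an assumed file name.
print(serve_build_argv("main.object", app_dir=".", output_path="serve_config.yaml"))
```

On a machine with Ray Serve installed, the resulting list could be passed to subprocess.run(argv, check=True).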

config

Get the current config of the running Serve app.

serve config [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
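The address lookup described above can be sketched as follows. This is not Serve's implementation; it assumes that an explicit --address value takes precedence over RAY_AGENT_ADDRESS, which in turn takes precedence over the documented default. The function name resolve_agent_address is hypothetical.

```python
import os
from typing import Mapping, Optional

# Default documented for the Ray dashboard agent.
DEFAULT_AGENT_ADDRESS = "http://localhost:52365"


def resolve_agent_address(
    cli_address: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the agent address (assumed precedence: flag, env var, default)."""
    env = os.environ if environ is None else environ
    if cli_address:
        return cli_address
    # Fall back to the environment variable, then to the documented default.
    return env.get("RAY_AGENT_ADDRESS") or DEFAULT_AGENT_ADDRESS


print(resolve_agent_address())
```

Passing the environment as a parameter keeps the function easy to check without modifying the real process environment.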

deploy

Deploys deployment(s) from a YAML config file.

This call is asynchronous: a successful response only indicates that the request was sent to the Ray cluster successfully. It does not mean that the deployments have been deployed or updated.

Use serve config to fetch the current config and serve status to check the status of the deployments after deploying.

serve deploy [OPTIONS] CONFIG_FILE_NAME

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

Arguments

CONFIG_FILE_NAME

Required argument
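Because serve deploy returns before the deployments are ready, a caller typically has to poll for status afterwards. The sketch below shows one way to structure such a loop. It assumes a caller-supplied fetch_statuses function that returns one status string per deployment (HEALTHY, UNHEALTHY, or UPDATING); how those strings are obtained from serve status is left out, and all names here are hypothetical.

```python
import time
from typing import Callable, Iterable


def wait_for_healthy(
    fetch_statuses: Callable[[], Iterable[str]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every deployment reports HEALTHY or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        statuses = list(fetch_statuses())
        # An empty result is treated as "not ready yet" (an assumption).
        if statuses and all(s == "HEALTHY" for s in statuses):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated status sequence standing in for repeated status queries.
_fake = iter([["UPDATING"], ["UPDATING", "HEALTHY"], ["HEALTHY", "HEALTHY"]])
print(wait_for_healthy(lambda: next(_fake), timeout_s=5.0, interval_s=0.0))
```

The sleep and clock parameters are injected so the loop can be exercised without real delays.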

run

Runs the Serve app from the specified import path or YAML config. Any import path must lead to a FunctionNode or ClassNode object. By default, the command blocks and periodically logs status. Pressing Ctrl-C tears down the app.

serve run [OPTIONS] CONFIG_OR_IMPORT_PATH

Options

--runtime-env <runtime_env>

Path to a local YAML file containing a runtime_env definition. This will be passed to ray.init() as the default for deployments.

--runtime-env-json <runtime_env_json>

JSON-serialized runtime_env dictionary. This will be passed to ray.init() as the default for deployments.

--working-dir <working_dir>

Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). This overrides the working_dir in --runtime-env if both are specified. This will be passed to ray.init() as the default for deployments.

-d, --app-dir <app_dir>

Local directory to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-a, --address <address>

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

-h, --host <host>

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

-p, --port <port>

Port for HTTP servers to listen on. Defaults to 8000.

--blocking, --non-blocking

Whether this command should block. If blocking, it loops and logs status until interrupted with Ctrl-C, then cleans up the app.

Arguments

CONFIG_OR_IMPORT_PATH

Required argument
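The interaction between --working-dir and a runtime_env that already sets working_dir can be illustrated with a small merge function. This is a sketch of the documented override rule, not Serve's own code; the function name effective_runtime_env and the example values are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def effective_runtime_env(
    runtime_env: Optional[Dict[str, Any]] = None,
    working_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of runtime_env with working_dir overridden if given."""
    merged = dict(runtime_env or {})  # shallow copy; the input is not mutated
    if working_dir is not None:
        # --working-dir takes precedence over working_dir in the runtime_env.
        merged["working_dir"] = working_dir
    return merged


# Placeholder paths and packages, chosen only for illustration.
env = effective_runtime_env({"working_dir": "./old", "pip": ["requests"]}, working_dir="./new")
print(json.dumps(env))
```

The JSON string printed at the end has the shape expected by --runtime-env-json, a JSON-serialized runtime_env dictionary.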

shutdown

Deletes the Serve app.

serve shutdown [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-y, --yes

Bypass confirmation prompt.

start

Start a detached Serve instance on the Ray cluster.

serve start [OPTIONS]

Options

-a, --address <address>

Address to use for ray.init(). Can also be specified using the RAY_ADDRESS environment variable.

--http-host <http_host>

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

--http-port <http_port>

Port for HTTP servers to listen on. Defaults to 8000.

--http-location <http_location>

Location of the HTTP servers. Defaults to HeadOnly.

Allowed values for --http-location

DeploymentMode.NoServer | DeploymentMode.HeadOnly | DeploymentMode.EveryNode | DeploymentMode.FixedNumber

status

Prints status information about all deployments in the Serve app.

Deployments may be:

  • HEALTHY: all replicas are acting normally and passing their health checks.

  • UNHEALTHY: at least one replica is not acting normally and may not be passing its health check.

  • UPDATING: the deployment is updating.

serve status [OPTIONS]

Options

-a, --address <address>

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
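When a script needs a single verdict for the whole app, the per-deployment statuses listed above can be collapsed into one value. The sketch below uses an assumed rule that is not part of Serve: any UNHEALTHY deployment makes the app UNHEALTHY, otherwise any UPDATING deployment makes it UPDATING, otherwise it is HEALTHY. The function name summarize_statuses is hypothetical.

```python
from collections import Counter
from typing import Iterable

# The three deployment statuses documented for `serve status`.
KNOWN_STATUSES = {"HEALTHY", "UNHEALTHY", "UPDATING"}


def summarize_statuses(statuses: Iterable[str]) -> str:
    """Collapse per-deployment statuses into one overall status."""
    counts = Counter(statuses)
    if not counts:
        raise ValueError("no deployment statuses to summarize")
    unknown = set(counts) - KNOWN_STATUSES
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    # Assumed severity order: UNHEALTHY, then UPDATING, then HEALTHY.
    if counts["UNHEALTHY"]:
        return "UNHEALTHY"
    if counts["UPDATING"]:
        return "UPDATING"
    return "HEALTHY"


print(summarize_statuses(["HEALTHY", "UPDATING", "HEALTHY"]))
```

Rejecting unknown or empty input keeps a malformed status listing from being reported as healthy by accident.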