Serve CLI#

serve#

CLI for managing Serve instances on a Ray cluster.

serve [OPTIONS] COMMAND [ARGS]...

build#

Imports the ClassNode or FunctionNode at IMPORT_PATH and generates a structured config for it that can be used by serve deploy or the REST API.

serve build [OPTIONS] IMPORT_PATH

Options

-d, --app-dir <app_dir>#

Local directory in which to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.

-k, --kubernetes_format#

Print Serve config in Kubernetes format.

-o, --output-path <output_path>#

Local path where the output config will be written in YAML format. If not provided, the config will be printed to STDOUT.

Arguments

IMPORT_PATH#

Required argument
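The effect of --app-dir on import resolution can be illustrated with a short Python sketch. It is not Serve's implementation; it only mimics the documented behaviour of making a directory importable and then resolving a dotted path of the form ‘module.object’. The helper name import_attr, the module demo_main and the attribute greeting are hypothetical and exist only for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_attr(import_path: str, app_dir: str = "."):
    """Resolve a dotted path such as 'main.object' relative to app_dir."""
    module_name, _, attr_name = import_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {import_path!r}")
    # Assumption: putting app_dir at the front of sys.path approximates
    # "will be inserted into PYTHONPATH" from the option description.
    entry = str(Path(app_dir).resolve())
    sys.path.insert(0, entry)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
    finally:
        sys.path.remove(entry)
    return getattr(module, attr_name)


# A throwaway directory stands in for the project root.
with tempfile.TemporaryDirectory() as app_dir:
    Path(app_dir, "demo_main.py").write_text("greeting = 'hello'\n")
    value = import_attr("demo_main.greeting", app_dir=app_dir)
```

With the default app directory of ‘.’, the same lookup would search the current working directory, which is why an object in ./main.py resolves as ‘main.object’.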

config#

Get the current config of the running Serve app.

serve config [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
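As a rough sketch of how the address might be resolved, the function below picks between an explicit value, the RAY_AGENT_ADDRESS environment variable, and the documented default. The precedence order (flag, then environment variable, then default) is an assumption based on common CLI conventions, and resolve_agent_address is a hypothetical helper, not part of Serve.

```python
import os

# Default quoted in the option description above.
DEFAULT_AGENT_ADDRESS = "http://localhost:52365"


def resolve_agent_address(cli_value=None, environ=None):
    """Return the dashboard agent address to use.

    Assumed precedence: explicit --address value, then the
    RAY_AGENT_ADDRESS environment variable, then the default.
    """
    environ = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    return environ.get("RAY_AGENT_ADDRESS") or DEFAULT_AGENT_ADDRESS
```

Passing environ explicitly keeps the function easy to exercise without touching the real process environment.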

deploy#

Deploys deployment(s) from a YAML config file.

This call is asynchronous; a successful response only indicates that the request was sent to the Ray cluster successfully. It does not mean that the deployments have been deployed or updated.

Existing deployments with no code changes will not be redeployed.

Use serve config to fetch the current config and serve status to check the status of the deployments after deploying.

serve deploy [OPTIONS] CONFIG_FILE_NAME

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

Arguments

CONFIG_FILE_NAME#

Required argument
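The deploy-then-inspect workflow described above could be scripted along the following lines. This is a minimal sketch that shells out to the three subcommands in order; the helper name deploy_then_check and the file name serve_config.yaml are hypothetical, and the run parameter exists only so the command runner can be swapped out.

```python
import subprocess


def deploy_then_check(config_file, address=None, run=subprocess.run):
    """Submit a config, then fetch the config and status the cluster reports.

    Returns a dict mapping each subcommand name to its captured stdout.
    """
    address_args = ["--address", address] if address else []
    outputs = {}
    for name, extra in (("deploy", [config_file]), ("config", []), ("status", [])):
        cmd = ["serve", name, *address_args, *extra]
        # check=True raises CalledProcessError if a command exits non-zero.
        result = run(cmd, capture_output=True, text=True, check=True)
        outputs[name] = result.stdout
    return outputs


if __name__ == "__main__":
    # Requires a running Ray cluster and the serve CLI on PATH.
    print(deploy_then_check("serve_config.yaml")["status"])
```

Because the deploy call is asynchronous, the status captured immediately afterwards may still show deployments as UPDATING, so a single check is not proof that the rollout has finished.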

run#

Runs a Serve app (specified in config_or_import_path) on a cluster as a Ray Job. config_or_import_path is either a filepath to a YAML config file on the Ray Cluster, or an import path on the Ray Cluster for a deployment node of the pattern containing_module:deployment_node.

If using a YAML config, existing deployments with no code changes will not be redeployed.

Any import path, whether directly specified as the command argument or inside a config file, must lead to a FunctionNode or ClassNode object.

By default, this command will block and periodically log status. If you Ctrl-C the command, it will tear down the app.

serve run [OPTIONS] CONFIG_OR_IMPORT_PATH

Options

--runtime-env <runtime_env>#

Path to a local YAML file containing a runtime_env definition. This will be passed to Ray Jobs as the default for deployments.

--runtime-env-json <runtime_env_json>#

JSON-serialized runtime_env dictionary. This will be passed to Ray Jobs as the default for deployments.

--working-dir <working_dir>#

Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). This overrides the working_dir in --runtime-env if both are specified. This will be passed to Ray Jobs as the default for deployments.

-d, --app-dir <app_dir>#

Directory on the Ray Cluster in which to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, i.e. a deployment node app_node in working_directory/main.py on the Ray Cluster can be run using main:app_node. Not relevant if you’re importing from an installed module.

-a, --address <address>#

Address of the Ray Cluster to run the Serve app on. If no address is specified, a local Ray Cluster will be started. Can also be specified using the RAY_ADDRESS environment variable.

-h, --host <host>#

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

-p, --port <port>#

Port for HTTP servers to listen on. Defaults to 8000.

--blocking, --non-blocking#

Whether or not this command should be blocking. If blocking, it will loop and log status until Ctrl-C’d, then clean up the app.

--gradio#

Whether to enable Gradio visualization of the deployment graph. The visualization can only be used with deployment graphs that have DAGDriver as the ingress deployment.

Arguments

CONFIG_OR_IMPORT_PATH#

Required argument
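To show how the options above fit together, here is a hedged sketch that assembles a serve run command line in Python. It uses only flags listed in this section. The helper serve_run_args is hypothetical, and the check that a config path ends in .yaml or .yml is an assumption made for this example; the CLI itself may accept other file names.

```python
import json
import re

# Matches the documented pattern containing_module:deployment_node.
_IMPORT_PATH = re.compile(r"^[A-Za-z_][\w.]*:[A-Za-z_]\w*$")


def serve_run_args(target, runtime_env=None, working_dir=None, blocking=True):
    """Build an argument list for `serve run` without executing it."""
    looks_like_config = target.endswith((".yaml", ".yml"))  # assumption
    if not (looks_like_config or _IMPORT_PATH.match(target)):
        raise ValueError(f"expected a YAML file or module:node, got {target!r}")
    args = ["serve", "run"]
    if runtime_env is not None:
        # --runtime-env-json takes a JSON-serialized runtime_env dictionary.
        args += ["--runtime-env-json", json.dumps(runtime_env, sort_keys=True)]
    if working_dir is not None:
        args += ["--working-dir", working_dir]
    args.append("--blocking" if blocking else "--non-blocking")
    args.append(target)
    return args


args = serve_run_args(
    "main:app_node",
    runtime_env={"pip": ["requests"]},
    working_dir=".",
    blocking=False,
)
```

The resulting list can be handed to subprocess.run as is. Building it as a list rather than a single string avoids shell quoting problems with the JSON payload.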

shutdown#

Deletes the Serve app.

serve shutdown [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.

-y, --yes#

Bypass confirmation prompt.

start#

Start a detached Serve instance on the Ray cluster.

serve start [OPTIONS]

Options

-a, --address <address>#

Address of the Ray Cluster to run the Serve app on. If no address is specified, a local Ray Cluster will be started. Can also be specified using the RAY_ADDRESS environment variable.

--http-host <http_host>#

Host for HTTP servers to listen on. Defaults to 127.0.0.1.

--http-port <http_port>#

Port for HTTP servers to listen on. Defaults to 8000.

--http-location <http_location>#

Location of the HTTP servers. Defaults to HeadOnly.

Options: DeploymentMode.NoServer | DeploymentMode.HeadOnly | DeploymentMode.EveryNode | DeploymentMode.FixedNumber

status#

Prints status information about all deployments in the Serve app.

Deployments may be:

  • HEALTHY: all replicas are acting normally and passing their health checks.

  • UNHEALTHY: at least one replica is not acting normally and may not be passing its health check.

  • UPDATING: the deployment is updating.

serve status [OPTIONS]

Options

-a, --address <address>#

Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
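The three deployment states listed above lend themselves to a simple polling loop. The sketch below is one possible way to wait for a rollout to settle. It deliberately does not parse serve status output, whose exact format is not described here; instead it takes a caller-supplied get_states function that returns the current state names. The function name wait_until_settled and the "TIMEOUT" return value are hypothetical.

```python
import time


def wait_until_settled(get_states, timeout_s=60.0, interval_s=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll deployment states until they settle or the timeout expires.

    get_states is a callable returning an iterable of state names
    (HEALTHY, UNHEALTHY or UPDATING), one per deployment.
    """
    deadline = clock() + timeout_s
    while True:
        states = list(get_states())
        if "UNHEALTHY" in states:
            return "UNHEALTHY"  # fail fast on the first unhealthy deployment
        if states and all(state == "HEALTHY" for state in states):
            return "HEALTHY"
        if clock() >= deadline:
            return "TIMEOUT"  # still UPDATING (or no deployments reported)
        sleep(interval_s)
```

In practice get_states would wrap a call to serve status or the REST API; keeping that part separate means the loop itself makes no assumptions about output format.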