Serve CLI
serve#
CLI for managing Serve instances on a Ray cluster.
serve [OPTIONS] COMMAND [ARGS]...
build#
Imports the ClassNode or FunctionNode at IMPORT_PATH and generates a structured config for it that can be used by serve deploy or the REST API.
serve build [OPTIONS] IMPORT_PATH
Options
- -d, --app-dir <app_dir>#
Local directory to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, meaning that an object in ./main.py can be imported as ‘main.object’. Not relevant if you’re importing from an installed module.
- -k, --kubernetes_format#
Print Serve config in Kubernetes format.
- -o, --output-path <output_path>#
Local path where the output config will be written in YAML format. If not provided, the config will be printed to STDOUT.
Arguments
- IMPORT_PATH#
Required argument
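As a minimal sketch, the Python helper below assembles a serve build invocation as an argument list, using only the options documented above. The helper name build_serve_build_argv and the sample file name serve_config.yaml are hypothetical, and actually running the resulting command assumes Ray Serve is installed.

```python
import shlex
from typing import List, Optional


def build_serve_build_argv(
    import_path: str,
    app_dir: Optional[str] = None,
    output_path: Optional[str] = None,
    kubernetes_format: bool = False,
) -> List[str]:
    """Assemble argv for `serve build` (hypothetical helper, not part of Ray)."""
    argv = ["serve", "build"]
    if app_dir is not None:
        argv += ["--app-dir", app_dir]
    if kubernetes_format:
        argv.append("--kubernetes_format")
    if output_path is not None:
        argv += ["--output-path", output_path]
    # The required IMPORT_PATH argument goes last.
    argv.append(import_path)
    return argv


# 'main.object' is the example import path used in the --app-dir description.
argv = build_serve_build_argv("main.object", output_path="serve_config.yaml")
print(shlex.join(argv))
```

The list can be passed to subprocess.run as-is; building a list rather than a single string avoids shell quoting problems when paths contain spaces.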
config#
Get the current config of the running Serve app.
serve config [OPTIONS]
Options
- -a, --address <address>#
Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
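The address lookup described above can be sketched as follows. The precedence order (explicit flag first, then RAY_AGENT_ADDRESS, then the default) is an assumption that this page does not confirm, and resolve_agent_address is a hypothetical helper rather than a Ray function.

```python
import os
from typing import Mapping, Optional

DEFAULT_AGENT_ADDRESS = "http://localhost:52365"  # default stated in the option text


def resolve_agent_address(
    cli_value: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Pick an address: flag value, then RAY_AGENT_ADDRESS, then the default.

    The precedence order is an assumption, not documented behavior.
    """
    if cli_value:
        return cli_value
    return environ.get("RAY_AGENT_ADDRESS") or DEFAULT_AGENT_ADDRESS


# With no flag and an empty environment, the documented default is used.
print(resolve_agent_address(environ={}))
```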
deploy#
Deploys deployment(s) from a YAML config file.
This call is asynchronous; a successful response only indicates that the request was sent to the Ray cluster successfully. It does not mean that the deployments have been deployed or updated.
Existing deployments with no code changes will not be redeployed.
Use serve config to fetch the current config and serve status to check the status of the deployments after deploying.
serve deploy [OPTIONS] CONFIG_FILE_NAME
Options
- -a, --address <address>#
Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
Arguments
- CONFIG_FILE_NAME#
Required argument
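Because serve deploy returns before the deployments are ready, a script usually needs to poll afterwards. The sketch below polls a status source until the text no longer mentions UPDATING. It assumes the status names appear verbatim in the output; the exact output format of serve status is not shown on this page, so the substring check is only an illustration. wait_until_not_updating is a hypothetical helper, and a fake status source stands in for the real command.

```python
import time
from typing import Callable


def wait_until_not_updating(
    fetch_status: Callable[[], str],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
) -> str:
    """Poll until the status text no longer mentions UPDATING, or time out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status_text = fetch_status()
        if "UPDATING" not in status_text:
            return status_text
        if time.monotonic() >= deadline:
            raise TimeoutError("deployments were still UPDATING at the deadline")
        time.sleep(poll_interval_s)


# Stand-in for running `serve status`; real output will look different.
fake_outputs = iter(["UPDATING", "UPDATING", "HEALTHY"])
final = wait_until_not_updating(lambda: next(fake_outputs), poll_interval_s=0.0)
print(final)
```

In practice, fetch_status could wrap subprocess.run(["serve", "status"], capture_output=True, text=True) and return its stdout.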
run#
Runs a Serve app (specified in config_or_import_path) on a cluster as a Ray Job. config_or_import_path is either a filepath to a YAML config file on the Ray Cluster, or an import path on the Ray Cluster for a deployment node of the pattern containing_module:deployment_node.
If using a YAML config, existing deployments with no code changes will not be redeployed.
Any import path, whether directly specified as the command argument or inside a config file, must lead to a FunctionNode or ClassNode object.
By default, this command will block and periodically log status. If you Ctrl-C the command, it will tear down the app.
serve run [OPTIONS] CONFIG_OR_IMPORT_PATH
Options
- --runtime-env <runtime_env>#
Path to a local YAML file containing a runtime_env definition. This will be passed to Ray Jobs as the default for deployments.
- --runtime-env-json <runtime_env_json>#
JSON-serialized runtime_env dictionary. This will be passed to Ray Jobs as the default for deployments.
- --working-dir <working_dir>#
Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). This overrides the working_dir in --runtime-env if both are specified. This will be passed to Ray Jobs as the default for deployments.
- -d, --app-dir <app_dir>#
Directory on the Ray Cluster in which to look for the IMPORT_PATH (will be inserted into PYTHONPATH). Defaults to ‘.’, i.e. a deployment node app_node in working_directory/main.py on the Ray Cluster can be run using main:app_node. Not relevant if you’re importing from an installed module.
- -a, --address <address>#
Address of the Ray Cluster to run the Serve app on. If no address is specified, a local Ray Cluster will be started. Can also be specified using the RAY_ADDRESS environment variable.
- -h, --host <host>#
Host for HTTP servers to listen on. Defaults to 127.0.0.1.
- -p, --port <port>#
Port for HTTP servers to listen on. Defaults to 8000.
- --blocking, --non-blocking#
Whether or not this command should be blocking. If blocking, it will loop and log status until Ctrl-C’d, then clean up the app.
- --gradio#
Whether to enable Gradio visualization of the deployment graph. The visualization can only be used with deployment graphs that have DAGDriver as the ingress deployment.
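The interaction between --working-dir and a runtime_env definition can be sketched in Python as a dictionary merge. This models only the documented override of working_dir; effective_runtime_env is a hypothetical helper, the sample paths and the pip entry are illustrative values, and how Serve performs the merge internally is not described on this page.

```python
import json
from typing import Any, Dict, Optional


def effective_runtime_env(
    runtime_env: Dict[str, Any],
    working_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of runtime_env with working_dir overridden when given."""
    merged = dict(runtime_env)  # shallow copy so the input is left untouched
    if working_dir is not None:
        merged["working_dir"] = working_dir
    return merged


# Illustrative values only.
env = {"working_dir": "./old_project", "pip": ["requests"]}
merged = effective_runtime_env(env, working_dir="./new_project")
print(json.dumps(merged))
```

The json.dumps call produces a JSON-serialized dictionary, which is the form that --runtime-env-json expects.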
Arguments
- CONFIG_OR_IMPORT_PATH#
Required argument
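To illustrate the containing_module:deployment_node pattern, the sketch below splits such a path and resolves it with importlib. This is not Serve's own resolution code; split_import_path and load_object are hypothetical helpers, and a standard-library function stands in for a deployment node so the example runs without Ray.

```python
import importlib
from typing import Any, Tuple


def split_import_path(import_path: str) -> Tuple[str, str]:
    """Split 'containing_module:deployment_node' into its two parts."""
    module_name, sep, attr_name = import_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {import_path!r}")
    return module_name, attr_name


def load_object(import_path: str) -> Any:
    """Import the module and return the named attribute."""
    module_name, attr_name = split_import_path(import_path)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


print(split_import_path("main:app_node"))
# json.dumps stands in for a deployment node here.
print(load_object("json:dumps")([1, 2]))
```

For a real Serve app, the resolved object must be a FunctionNode or ClassNode, as noted above.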
shutdown#
Deletes the Serve app.
serve shutdown [OPTIONS]
Options
- -a, --address <address>#
Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
- -y, --yes#
Bypass confirmation prompt.
start#
Start a detached Serve instance on the Ray cluster.
serve start [OPTIONS]
Options
- -a, --address <address>#
Address of the Ray Cluster to run the Serve app on. If no address is specified, a local Ray Cluster will be started. Can also be specified using the RAY_ADDRESS environment variable.
- --http-host <http_host>#
Host for HTTP servers to listen on. Defaults to 127.0.0.1.
- --http-port <http_port>#
Port for HTTP servers to listen on. Defaults to 8000.
- --http-location <http_location>#
Location of the HTTP servers. Defaults to HeadOnly.
Options: DeploymentMode.NoServer | DeploymentMode.HeadOnly | DeploymentMode.EveryNode | DeploymentMode.FixedNumber
status#
Prints status information about all deployments in the Serve app.
Deployments may be:
HEALTHY: all replicas are acting normally and passing their health checks.
UNHEALTHY: at least one replica is not acting normally and may not be passing its health check.
UPDATING: the deployment is updating.
serve status [OPTIONS]
Options
- -a, --address <address>#
Address to use to query the Ray dashboard agent (defaults to http://localhost:52365). Can also be specified using the RAY_AGENT_ADDRESS environment variable.
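When a script needs a single verdict for the whole app, the three deployment states listed above can be collapsed into one. The policy below (UNHEALTHY takes priority, then UPDATING, otherwise HEALTHY) is a hypothetical choice rather than Serve behavior, and the deployment names are made up for illustration.

```python
from typing import Dict


def overall_status(deployment_statuses: Dict[str, str]) -> str:
    """Collapse per-deployment statuses into one (hypothetical policy)."""
    statuses = set(deployment_statuses.values())
    if "UNHEALTHY" in statuses:
        return "UNHEALTHY"
    if "UPDATING" in statuses:
        return "UPDATING"
    return "HEALTHY"


# Made-up deployment names mapped to the status names documented above.
print(overall_status({"Model": "HEALTHY", "Router": "UPDATING"}))
```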