Ray State CLI

State

This section contains commands to access the live state of Ray resources (actor, task, object, etc.).

Note

APIs are alpha. This feature requires a full installation of Ray using pip install "ray[default]".

State CLI allows users to access the state of various resources (e.g., actor, task, object).

ray summary tasks

Summarize the task state of the cluster.

By default, the output contains the information grouped by task function names.

The output schema is ray.experimental.state.common.TaskSummaries.

Raises:
RayStateApiException

if the CLI fails to query the data.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray summary tasks [OPTIONS]

Options

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30.

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

ray summary actors

Summarize the actor state of the cluster.

By default, the output contains the information grouped by actor class names.

The output schema is ray.experimental.state.common.ActorSummaries.

Raises:
RayStateApiException

if the CLI fails to query the data.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray summary actors [OPTIONS]

Options

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30.

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

ray summary objects

Summarize the object state of the cluster.

This command is recommended when debugging memory leaks. See Debugging with Ray Memory for more details. It is almost equivalent to ray memory, but its output is easier to understand.

By default, the output is grouped by object callsite. Callsites are not collected unless the environment variable RAY_record_ref_creation_sites is set; without it, all data is aggregated under the "disable" callsite. To enable callsite collection, set the variable when starting Ray.

Example:

` RAY_record_ref_creation_sites=1 ray start --head `

` RAY_record_ref_creation_sites=1 python ray_script.py `
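When a driver script is launched from Python rather than from a shell, the same variable can be passed through the child process environment. The following is a minimal sketch, not part of Ray: the helper names env_with_callsites and run_script_with_callsites are hypothetical, and the script path is whatever driver script you would otherwise run by hand.

```python
import os
import subprocess
import sys


def env_with_callsites(base_env=None):
    """Return a copy of the environment with callsite recording enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Same variable as in the shell examples above.
    env["RAY_record_ref_creation_sites"] = "1"
    return env


def run_script_with_callsites(script_path):
    """Run a driver script with callsite recording enabled (hypothetical helper)."""
    return subprocess.run(
        [sys.executable, script_path],
        env=env_with_callsites(),
        check=True,
    )
```

Copying the environment instead of mutating os.environ keeps the setting scoped to the launched process.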

The output schema is ray.experimental.state.common.ObjectSummaries.

Raises:
RayStateApiException

if the CLI fails to query the data.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray summary objects [OPTIONS]

Options

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30.

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

ray list

List all states of a given resource.

Normally, summary APIs are recommended before listing all resources.

The output schema is defined in the State API Schema section.

For example, the output schema of ray list tasks is ray.experimental.state.common.TaskState.

Usage:

List all actor information from the cluster.

` ray list actors `

List 50 actors from the cluster. The sorting order cannot be controlled.

` ray list actors --limit 50 `

List 10 actors with state PENDING.

` ray list actors --limit 10 --filter "state=PENDING" `

List actors with yaml format.

` ray list actors --format yaml `

List actors with details. When --detail is specified, the command might query more data sources to obtain detailed data.

` ray list actors --detail `

The API queries one or more components from the cluster to obtain the data. The returned state snapshot could be stale, and it is not guaranteed to return the live data.

The API can return partial or missing output in the following scenarios.

  • When the API queries more than 1 component, if some of them fail, the API will return the partial result (with a suppressible warning).

  • When the API returns too many entries, the API will truncate the output. Currently, truncated data cannot be selected by users.

Args:

resource: The type of the resource to query.

Raises:
RayStateApiException

if the CLI fails to query the data.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray list [OPTIONS] [actors|jobs|placement-groups|nodes|workers|tasks|objects|runtime-envs]

Options

--format <format>

Options: default | json | yaml | table

-f, --filter <filter>

A key, predicate, and value to filter the result. E.g., --filter 'key=value' or --filter 'key!=value'. You can specify multiple --filter options, in which case all predicates are combined with AND. For example, --filter key=value --filter key2=value2 means (key==value) AND (key2==value2).
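The AND semantics of repeated --filter options can be illustrated with a small pure-Python sketch. It is not Ray's implementation: parse_predicate and matches are hypothetical helpers, the sample entries are made up, and values are compared as strings for simplicity.

```python
def parse_predicate(text):
    """Split 'key=value' or 'key!=value' into (key, op, value)."""
    idx = text.find("=")
    if idx <= 0:
        raise ValueError(f"invalid filter: {text!r}")
    if text[idx - 1] == "!":
        key, op = text[: idx - 1], "!="
    else:
        key, op = text[:idx], "="
    if not key:
        raise ValueError(f"invalid filter: {text!r}")
    return key, op, text[idx + 1:]


def matches(entry, predicates):
    """Return True only if the entry satisfies every predicate (logical AND)."""
    for text in predicates:
        key, op, value = parse_predicate(text)
        is_equal = str(entry.get(key)) == value
        if op == "=" and not is_equal:
            return False
        if op == "!=" and is_equal:
            return False
    return True


# Made-up sample entries, for illustration only.
actors = [
    {"state": "PENDING", "class_name": "A"},
    {"state": "ALIVE", "class_name": "A"},
    {"state": "PENDING", "class_name": "B"},
]
selected = [a for a in actors if matches(a, ["state=PENDING", "class_name!=B"])]
```

Here selected contains only the first entry, because it is the only one that satisfies both predicates at once.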

--limit <limit>

Maximum number of entries to return. 100 by default.

--detail

If the flag is set, the output contains more detailed data. Note that the API could query more sources to obtain the additional detail.

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30.

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

Arguments

RESOURCE

Required argument
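For scripts that call ray list repeatedly, the options above can be assembled into an argument list before being handed to a process runner. The following is a minimal sketch under stated assumptions: build_ray_list_command is a hypothetical helper, not part of Ray, and it only uses the resource names and flags documented in this section.

```python
import shlex

# Resource names accepted by `ray list`, taken from the usage line in this section.
LIST_RESOURCES = (
    "actors", "jobs", "placement-groups", "nodes",
    "workers", "tasks", "objects", "runtime-envs",
)


def build_ray_list_command(resource, limit=None, filters=(), fmt=None, detail=False):
    """Build the argv for `ray list` (hypothetical helper, not part of Ray)."""
    if resource not in LIST_RESOURCES:
        raise ValueError(f"unsupported resource: {resource!r}")
    argv = ["ray", "list", resource]
    if limit is not None:
        argv += ["--limit", str(limit)]
    # Each predicate becomes its own --filter option; they are combined with AND.
    for predicate in filters:
        argv += ["--filter", predicate]
    if fmt is not None:
        argv += ["--format", fmt]
    if detail:
        argv.append("--detail")
    return argv


cmd = build_ray_list_command("actors", limit=10, filters=["state=PENDING"])
# shlex.join renders the list as a shell-safe string for logging or copy-paste.
printable = shlex.join(cmd)
```

Passing the list form to subprocess.run avoids shell quoting issues with filter values; the joined string is only for display.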

ray get

Get a state of a given resource by ID.

Getting by ID is currently NOT supported for jobs and runtime-envs.

The output schema is defined in the State API Schema section.

For example, the output schema of ray get tasks <task-id> is ray.experimental.state.common.TaskState.

Usage:

Get an actor with actor id <actor-id>

` ray get actors <actor-id> `

Get placement group information with <placement-group-id>

` ray get placement-groups <placement-group-id> `

The API queries one or more components from the cluster to obtain the data. The returned state snapshot could be stale, and it is not guaranteed to return the live data.

Args:

resource: The type of the resource to query.

id: The id of the resource.

Raises:
RayStateApiException

if the CLI fails to query the data.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray get [OPTIONS] [actors|placement-groups|nodes|workers|tasks|objects] ID

Options

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30.

Arguments

RESOURCE

Required argument

ID

Required argument
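Because ray get accepts fewer resource types than ray list, a wrapper script may want to reject jobs and runtime-envs before invoking the CLI. The following is a minimal sketch: build_ray_get_command is a hypothetical helper, not part of Ray, and the accepted names are copied from the usage line above.

```python
# Resource names accepted by `ray get`, taken from the usage line in this section.
GET_RESOURCES = ("actors", "placement-groups", "nodes", "workers", "tasks", "objects")


def build_ray_get_command(resource, resource_id, timeout=None, address=None):
    """Build the argv for `ray get` (hypothetical helper, not part of Ray)."""
    if resource not in GET_RESOURCES:
        # This also covers jobs and runtime-envs, which cannot be fetched by ID.
        raise ValueError(f"`ray get` does not support resource {resource!r}")
    argv = ["ray", "get"]
    # Options come first, mirroring `ray get [OPTIONS] RESOURCE ID`.
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    if address is not None:
        argv += ["--address", address]
    argv += [resource, resource_id]
    return argv
```

Failing early in the wrapper gives a clearer error than waiting for the CLI to reject the argument.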

Log

This section contains commands to access logs from Ray clusters.

Note

APIs are alpha. This feature requires a full installation of Ray using pip install "ray[default]".

Log CLI allows users to access the log from the cluster. Note that only the logs from alive nodes are available through this API.

ray logs

Print the log file that matches the GLOB_FILTER.

By default, it prints a list of log files that match the filter. If there’s only 1 match, it will print the log file. By default, it prints the head node logs.

Usage:

Print the last 500 lines of raylet.out on a head node.

` ray logs raylet.out --tail 500 `

Print the last 500 lines of raylet.out on the worker node with ID A.

` ray logs raylet.out --tail 500 --node-id A `

Follow the log file with an actor id ABC.

` ray logs --actor-id ABC --follow `

Get the actor log from pid 123, ip ABC. Note that this goes well with the driver log of Ray which prints (ip=ABC, pid=123, class_name) logs.

` ray logs --node-ip ABC --pid 123 `

Download gcs_server.out to the local machine as gcs_server.txt.

` ray logs gcs_server.out --tail -1 > gcs_server.txt `
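The ip and pid shown in a driver log prefix can be turned into a ray logs invocation mechanically. The following is a minimal sketch: logs_command_from_driver_line is a hypothetical helper, not part of Ray, and the sample line only imitates the (ip=ABC, pid=123, class_name) shape described above; the exact prefix format in your Ray version may differ.

```python
import re


def logs_command_from_driver_line(line):
    """Build a `ray logs` argv from the ip= and pid= tokens in a driver log line.

    Returns None if either token is missing. Hypothetical helper, not part of Ray.
    """
    ip = re.search(r"\bip=([^,\s)]+)", line)
    pid = re.search(r"\bpid=(\d+)", line)
    if ip is None or pid is None:
        return None
    return ["ray", "logs", "--node-ip", ip.group(1), "--pid", pid.group(1)]


# Made-up driver log line, for illustration only.
sample = "(ip=10.0.0.5, pid=123, MyActor) starting up"
cmd = logs_command_from_driver_line(sample)
```

The two tokens are searched for independently, so the sketch does not depend on the order in which ip and pid appear in the prefix.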

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

ray logs [OPTIONS] [GLOB_FILTER]

Options

-ip, --node-ip <node_ip>

Filters the logs by this ip address.

-id, --node-id <node_id>

Filters the logs by this NodeID.

-pid, --pid <pid>

Retrieves the logs from the process with this pid.

-a, --actor-id <actor_id>

Retrieves the logs corresponding to this ActorID.

-t, --task-id <task_id>

Retrieves the logs corresponding to this TaskID.

-f, --follow

Streams the log file as it is updated instead of just tailing.

--tail <tail>

Number of lines to tail from log. -1 indicates fetching the whole file.

--timeout <timeout>

Timeout in seconds for the API requests. Default is 30. If --follow is specified, this option is ignored.

--address <address>

The address of the Ray API server. If not provided, it is configured automatically by querying the GCS server.

Arguments

GLOB_FILTER

Optional argument