ray.util.state.common.WorkerState#

class ray.util.state.common.WorkerState(worker_id: str, is_alive: bool, worker_type: Literal['WORKER', 'DRIVER', 'SPILL_WORKER', 'RESTORE_WORKER'], exit_type: Literal['SYSTEM_ERROR', 'INTENDED_SYSTEM_EXIT', 'USER_ERROR', 'INTENDED_USER_EXIT', 'NODE_OUT_OF_MEMORY'] | None, node_id: str, ip: str, pid: int, exit_detail: str | None = None, worker_launch_time_ms: int | None = None, worker_launched_time_ms: int | None = None, start_time_ms: int | None = None, end_time_ms: int | None = None, debugger_port: int | None = None, num_paused_threads: int | None = None)#

Bases: StateSchema

Worker State

The columns below can be used with the --filter option.

pid

is_alive

worker_type

exit_type

debugger_port

ip

node_id

num_paused_threads

worker_id
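As a rough sketch of how these filter columns might be used from Python, the snippet below builds filter tuples in the (column, predicate, value) form that ray.util.state.list_workers accepts. The helper names build_worker_filters and list_live_workers are hypothetical and exist only for illustration; the second helper assumes Ray is installed and a cluster is already running.

```python
from typing import List, Optional, Tuple, Union

FilterValue = Union[str, bool, int, float]
WorkerFilter = Tuple[str, str, FilterValue]


def build_worker_filters(
    worker_type: str = "WORKER",
    node_id: Optional[str] = None,
    alive_only: bool = True,
) -> List[WorkerFilter]:
    """Build (column, predicate, value) tuples from filterable columns only.

    Hypothetical helper: it only assembles the tuples, it does not call Ray.
    """
    filters: List[WorkerFilter] = [("worker_type", "=", worker_type)]
    if alive_only:
        filters.append(("is_alive", "=", True))
    if node_id is not None:
        filters.append(("node_id", "=", node_id))
    return filters


def list_live_workers(node_id: Optional[str] = None):
    """Query live workers, optionally restricted to one node.

    Assumes a running Ray cluster. Ray is imported lazily so that
    build_worker_filters stays usable without Ray installed.
    """
    from ray.util.state import list_workers

    return list_workers(filters=build_worker_filters(node_id=node_id))
```

On the command line, a comparable query would be `ray list workers --filter "worker_type=WORKER" --filter "is_alive=True"`.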

The columns below are available only when the get API is used, --detail is specified through the CLI, or detail=True is passed to the Python APIs.

pid

is_alive

worker_type

exit_type

debugger_port

ip

end_time_ms

node_id

exit_detail

worker_launched_time_ms

num_paused_threads

worker_id

start_time_ms

worker_launch_time_ms
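The two column lists above can be compared to decide whether a query needs detail=True. The sketch below assumes that the columns appearing only in the second list (end_time_ms, exit_detail, worker_launched_time_ms, start_time_ms, worker_launch_time_ms) are the ones that require detail; the helper needs_detail is hypothetical and not part of Ray.

```python
from typing import Iterable

# Columns listed on this page as usable with --filter.
FILTERABLE_COLUMNS = {
    "pid", "is_alive", "worker_type", "exit_type", "debugger_port",
    "ip", "node_id", "num_paused_threads", "worker_id",
}

# Assumption: columns that appear only in the second list require detail.
DETAIL_ONLY_COLUMNS = {
    "end_time_ms", "exit_detail", "worker_launched_time_ms",
    "start_time_ms", "worker_launch_time_ms",
}


def needs_detail(requested: Iterable[str]) -> bool:
    """Return True if any requested column is only populated with detail."""
    requested = set(requested)
    unknown = requested - FILTERABLE_COLUMNS - DETAIL_ONLY_COLUMNS
    if unknown:
        raise ValueError(f"Unknown WorkerState columns: {sorted(unknown)}")
    return bool(requested & DETAIL_ONLY_COLUMNS)
```

With a running cluster, the result could be passed along as `list_workers(detail=needs_detail(columns))`, where list_workers comes from ray.util.state.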

worker_id: str#: The id of the worker.

is_alive: bool#: Whether the worker is alive.

worker_type: Literal['WORKER', 'DRIVER', 'SPILL_WORKER', 'RESTORE_WORKER']#

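The type of the worker.

WORKER: The regular Ray worker process that executes tasks or instantiates an actor.
DRIVER: The driver (Python script that calls ray.init).
SPILL_WORKER: The worker that spills objects.
RESTORE_WORKER: The worker that restores objects.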

exit_type: Literal['SYSTEM_ERROR', 'INTENDED_SYSTEM_EXIT', 'USER_ERROR', 'INTENDED_USER_EXIT', 'NODE_OUT_OF_MEMORY'] | None#

The exit type of the worker if the worker is dead.

SYSTEM_ERROR: The worker exited due to a system-level failure (e.g., a worker crash).
INTENDED_SYSTEM_EXIT: A system-level exit that is intended (e.g., workers are killed because they have been idle for a long time).
USER_ERROR: The worker exited because of a user error (e.g., exceptions raised during actor initialization).
INTENDED_USER_EXIT: An exit intended by the user (e.g., the user exits the worker with exit code 0, or the exit is initiated by a Ray API such as ray.kill).
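To illustrate how exit_type might be used when triaging dead workers, the sketch below counts exits by type and separates failures from intended exits. The grouping is an assumption: SYSTEM_ERROR and USER_ERROR are treated as failures per the descriptions above, and NODE_OUT_OF_MEMORY is also treated as a failure even though this page does not describe it. summarize_exits is a hypothetical helper, and the sample objects are stand-ins for WorkerState instances.

```python
from collections import Counter
from types import SimpleNamespace

# Assumption: these exit types indicate an unexpected failure.
FAILURE_EXIT_TYPES = {"SYSTEM_ERROR", "USER_ERROR", "NODE_OUT_OF_MEMORY"}


def summarize_exits(workers):
    """Return (counts per exit_type, number of failed exits) for dead workers."""
    dead = [w for w in workers if not w.is_alive]
    # exit_type may be None, so skip entries without a recorded type.
    counts = Counter(w.exit_type for w in dead if w.exit_type is not None)
    failures = sum(n for t, n in counts.items() if t in FAILURE_EXIT_TYPES)
    return counts, failures


# Stand-in objects with the same attribute names as WorkerState.
sample_workers = [
    SimpleNamespace(worker_id="a", is_alive=True, exit_type=None),
    SimpleNamespace(worker_id="b", is_alive=False, exit_type="SYSTEM_ERROR"),
    SimpleNamespace(worker_id="c", is_alive=False, exit_type="INTENDED_USER_EXIT"),
    SimpleNamespace(worker_id="d", is_alive=False, exit_type="USER_ERROR"),
    SimpleNamespace(worker_id="e", is_alive=False, exit_type="INTENDED_SYSTEM_EXIT"),
]

exit_counts, failed = summarize_exits(sample_workers)
```

Against a live cluster, the same function could be fed the result of `ray.util.state.list_workers()` in place of the sample list.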

node_id: str#: The node id of the worker.

ip: str#: The ip address of the worker.

pid: int#: The pid of the worker.

exit_detail: str | None = None#: The exit detail of the worker if the worker is dead.

worker_launch_time_ms: int | None = None#: The time when the worker is first launched. -1 if the value doesn’t exist. The lifecycle of a worker is as follows: worker_launch_time_ms (process startup requested) -> worker_launched_time_ms (process started) -> start_time_ms (worker is ready to be used) -> end_time_ms (worker is destroyed).

worker_launched_time_ms: int | None = None#: The time when the worker is successfully launched. -1 if the value doesn’t exist.

start_time_ms: int | None = None#: The time when the worker is started and initialized. 0 if the value doesn’t exist.

end_time_ms: int | None = None#: The time when the worker exits. The timestamp could be delayed if the worker dies unexpectedly. 0 if the value doesn’t exist.
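The four timestamps describe consecutive lifecycle stages, so the gaps between them can be turned into durations. The following is a minimal sketch, assuming that None, -1, and 0 all mean "value doesn’t exist" as described above; lifecycle_durations and its result keys are hypothetical names, and the sample object is a stand-in for a WorkerState fetched with detail enabled.

```python
from types import SimpleNamespace
from typing import Dict, Optional


def _ts(value: Optional[int]) -> Optional[int]:
    """Return the timestamp, or None for the missing sentinels (None, -1, 0)."""
    return value if value is not None and value > 0 else None


def _delta(later: Optional[int], earlier: Optional[int]) -> Optional[int]:
    """Milliseconds between two timestamps, or None if either is missing."""
    later, earlier = _ts(later), _ts(earlier)
    if later is None or earlier is None:
        return None
    return later - earlier


def lifecycle_durations(worker) -> Dict[str, Optional[int]]:
    """Compute per-stage durations in milliseconds for one worker."""
    return {
        # Startup requested -> process started.
        "process_startup_ms": _delta(
            worker.worker_launched_time_ms, worker.worker_launch_time_ms
        ),
        # Process started -> worker ready to be used.
        "initialization_ms": _delta(
            worker.start_time_ms, worker.worker_launched_time_ms
        ),
        # Worker ready -> worker destroyed (None while still alive).
        "lifetime_ms": _delta(worker.end_time_ms, worker.start_time_ms),
    }


# Stand-in for a live worker: end_time_ms is 0 because it has not exited.
sample = SimpleNamespace(
    worker_launch_time_ms=1_000,
    worker_launched_time_ms=1_250,
    start_time_ms=1_900,
    end_time_ms=0,
)
durations = lifecycle_durations(sample)
```

Returning None for a missing stage, rather than a negative or zero duration, keeps the sentinel values from leaking into any averages computed later.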