Ray Job Submission API Reference

For an overview with examples see Ray Jobs.

Job Submission CLI

ray job submit

Submits a job to be run on the cluster.

Example:

ray job submit -- python my_script.py --arg=val

PublicAPI: This API is stable across Ray releases.

ray job submit [OPTIONS] ENTRYPOINT...

Options

--address <address>

Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.

--job-id <job_id>

DEPRECATED: Use --submission-id instead.

--submission-id <submission_id>

Submission ID to specify for the job. If not provided, one will be generated.

--runtime-env <runtime_env>

Path to a local YAML file containing a runtime_env definition.

--runtime-env-json <runtime_env_json>

JSON-serialized runtime_env dictionary.

--working-dir <working_dir>

Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). If specified, this overrides the option in --runtime-env.

--no-wait

If set, does not stream logs or wait for the job to exit.

--log-style <log_style>

If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.

Options

auto | record | pretty

--log-color <log_color>

Use color logging. Auto enables color logging if stdout is a TTY.

Options

auto | false | true

-v, --verbose
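The options above can also be assembled from Python. The following is a minimal sketch, not part of Ray, that builds an argument list for ray job submit and serializes a runtime_env dictionary with json.dumps for --runtime-env-json. The helper name build_submit_argv and the example values are hypothetical; only flags documented above are used.

```python
import json
from typing import Any, Dict, List, Optional


def build_submit_argv(
    entrypoint: List[str],
    runtime_env: Optional[Dict[str, Any]] = None,
    address: Optional[str] = None,
    submission_id: Optional[str] = None,
    no_wait: bool = False,
) -> List[str]:
    """Build an argv list for `ray job submit` (hypothetical helper)."""
    argv = ["ray", "job", "submit"]
    if address is not None:
        argv += ["--address", address]
    if submission_id is not None:
        argv += ["--submission-id", submission_id]
    if runtime_env is not None:
        # --runtime-env-json expects the dictionary serialized as JSON.
        argv += ["--runtime-env-json", json.dumps(runtime_env)]
    if no_wait:
        argv.append("--no-wait")
    # Everything after "--" is the entrypoint, one token per list element.
    return argv + ["--"] + entrypoint


argv = build_submit_argv(
    ["python", "my_script.py", "--arg=val"],
    runtime_env={"working_dir": "./", "pip": ["requests==2.26.0"]},
    address="http://127.0.0.1:8265",
    no_wait=True,
)
print(argv)
```

Passing the resulting list to subprocess.run(argv) avoids having to shell-quote the JSON string by hand.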

Arguments

ENTRYPOINT

Required argument(s)

Warning

When using the CLI, do not wrap the entrypoint command in quotes. For example, use ray job submit --working-dir="." -- python script.py instead of ray job submit --working-dir="." -- "python script.py". Otherwise you may encounter the error /bin/sh: 1: python script.py: not found.
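The effect of the quotes can be illustrated with Python's shlex module, which splits a string using POSIX shell-like rules. This is only a sketch of the quoting behavior, not a description of how Ray processes the entrypoint internally.

```python
import shlex

# Unquoted: two separate tokens, a program name and its argument.
unquoted = shlex.split("python script.py")

# Quoted: the whole string is a single token, so it is looked up as
# one program literally named "python script.py".
quoted = shlex.split('"python script.py"')

print(unquoted)  # ['python', 'script.py']
print(quoted)    # ['python script.py']
```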

ray job status

Queries for the current status of a job.

Example:

ray job status <my_job_id>

PublicAPI (beta): This API is in beta and may change before becoming stable.

ray job status [OPTIONS] JOB_ID

Options

--address <address>

Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.

--log-style <log_style>

If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.

Options

auto | record | pretty

--log-color <log_color>

Use color logging. Auto enables color logging if stdout is a TTY.

Options

auto | false | true

-v, --verbose

Arguments

JOB_ID

Required argument

ray job stop

Attempts to stop a job.

Example:

ray job stop <my_job_id>

PublicAPI (beta): This API is in beta and may change before becoming stable.

ray job stop [OPTIONS] JOB_ID

Options

--address <address>

Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.

--no-wait

If set, will not wait for the job to exit.

--log-style <log_style>

If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.

Options

auto | record | pretty

--log-color <log_color>

Use color logging. Auto enables color logging if stdout is a TTY.

Options

auto | false | true

-v, --verbose

Arguments

JOB_ID

Required argument

ray job logs

Gets the logs of a job.

Example:

ray job logs <my_job_id>

PublicAPI (beta): This API is in beta and may change before becoming stable.

ray job logs [OPTIONS] JOB_ID

Options

--address <address>

Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.

-f, --follow

If set, follow the logs (like tail -f).

--log-style <log_style>

If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.

Options

auto | record | pretty

--log-color <log_color>

Use color logging. Auto enables color logging if stdout is a TTY.

Options

auto | false | true

-v, --verbose

Arguments

JOB_ID

Required argument

ray job list

Lists all running jobs and their information.

Example:

ray job list

PublicAPI (beta): This API is in beta and may change before becoming stable.

ray job list [OPTIONS]

Options

--address <address>

Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.

--log-style <log_style>

If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.

Options

auto | record | pretty

--log-color <log_color>

Use color logging. Auto enables color logging if stdout is a TTY.

Options

auto | false | true

-v, --verbose

Job Submission SDK

JobSubmissionClient

class ray.job_submission.JobSubmissionClient(address: Optional[str] = None, create_cluster_if_needed: bool = False, cookies: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, Any]] = None, headers: Optional[Dict[str, Any]] = None)

A local client for submitting and interacting with jobs on a remote cluster.

Submits requests over HTTP to the job server on the cluster using the REST API.

submit_job(*, entrypoint: str, job_id: Optional[str] = None, runtime_env: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, str]] = None, submission_id: Optional[str] = None) -> str

Submit and execute a job asynchronously.

When a job is submitted, it runs once to completion or failure. Retries or different runs with different parameters should be handled by the submitter. Jobs are bound to the lifetime of a Ray cluster, so if the cluster goes down, all running jobs on that cluster will be terminated.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> client.submit_job( 
...     entrypoint="python script.py",
...     runtime_env={
...         "working_dir": "./",
...         "pip": ["requests==2.26.0"]
...     }
... )  
'raysubmit_4LamXRuQpYdSMg7J'
Args:

entrypoint: The shell command to run for this job.

submission_id: A unique ID for this job.

runtime_env: The runtime environment to install and run this job in.

metadata: Arbitrary data to store along with this job.

job_id: DEPRECATED. This has been renamed to submission_id.

Returns:

The submission ID of the submitted job. If not specified, this is a randomly generated unique ID.

Raises:

RuntimeError: If the request to the job server fails, or if the specified submission_id has already been used by a job on this cluster.

PublicAPI (beta): This API is in beta and may change before becoming stable.
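Because runs with different parameters are left to the submitter, a small helper can submit one job per parameter set. The following is a minimal sketch under stated assumptions: submit_sweep and the script name train.py are hypothetical, and client is assumed to be any object exposing submit_job(entrypoint=..., metadata=...) as documented above.

```python
import shlex
from typing import Any, Dict, List


def submit_sweep(client: Any, script: str, param_sets: List[Dict[str, Any]]) -> List[str]:
    """Submit one job per parameter set and return the submission IDs."""
    submission_ids = []
    for params in param_sets:
        # Turn {"lr": 0.1} into ["--lr=0.1"] and quote each token safely.
        args = [f"--{name}={value}" for name, value in params.items()]
        entrypoint = shlex.join(["python", script, *args])
        # metadata values are strings, matching Dict[str, str] above.
        metadata = {name: str(value) for name, value in params.items()}
        submission_ids.append(
            client.submit_job(entrypoint=entrypoint, metadata=metadata)
        )
    return submission_ids


# Usage, assuming a cluster is reachable at this address and a
# hypothetical train.py exists in the working directory:
#   from ray.job_submission import JobSubmissionClient
#   client = JobSubmissionClient("http://127.0.0.1:8265")
#   ids = submit_sweep(client, "train.py", [{"lr": 0.1}, {"lr": 0.01}])
```

Storing the parameters in metadata keeps each run identifiable later when the jobs are listed.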

stop_job(job_id: str) -> bool

Request a job to exit asynchronously.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> sub_id = client.submit_job(entrypoint="sleep 10") 
>>> client.stop_job(sub_id) 
True
Args:

job_id: The job ID or submission ID for the job to be stopped.

Returns:

True if the job was running, otherwise False.

Raises:

RuntimeError: If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

get_job_info(job_id: str) -> ray.dashboard.modules.job.pydantic_models.JobDetails

Get the latest status and other information associated with a job.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> submission_id = client.submit_job(entrypoint="sleep 1") 
>>> client.get_job_info(submission_id)
JobInfo(status='SUCCEEDED', message='Job finished successfully.',
error_type=None, start_time=1647388711, end_time=1647388712,
metadata={}, runtime_env={})
Args:

job_id: The job ID or submission ID of the job whose information is being requested.

Returns:

The JobInfo for the job.

Raises:

RuntimeError: If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

list_jobs() -> List[ray.dashboard.modules.job.pydantic_models.JobDetails]

List all jobs along with their status and other information.

Lists all jobs that have ever run on the cluster, including jobs that are currently running and jobs that are no longer running.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> client.submit_job(entrypoint="echo hello") 
>>> client.submit_job(entrypoint="sleep 2") 
>>> client.list_jobs() 
[JobDetails(status='SUCCEEDED',
job_id='03000000', type='submission',
submission_id='raysubmit_4LamXRuQpYdSMg7J',
message='Job finished successfully.', error_type=None,
start_time=1647388711, end_time=1647388712, metadata={}, runtime_env={}),
JobDetails(status='RUNNING',
job_id='04000000', type='submission',
submission_id='raysubmit_1dxCeNvG1fCMVNHG',
message='Job is currently running.', error_type=None,
start_time=1647454832, end_time=None, metadata={}, runtime_env={})]
Returns:

A list of JobDetails objects, one for each job.

Raises:

RuntimeError: If the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.
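Since list_jobs() returns every job the cluster has seen, callers usually filter or group the result. The following is a minimal sketch; group_by_status is a hypothetical helper, and each item is assumed to have the status and submission_id attributes shown in the example output above.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_status(jobs: Iterable[Any]) -> Dict[str, List[str]]:
    """Group submission IDs by job status."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for job in jobs:
        # Accept either a plain string or an enum member with a .value.
        status = str(getattr(job.status, "value", job.status))
        groups[status].append(job.submission_id)
    return dict(groups)


# Usage, assuming `client` was created as in the example above:
#   running_ids = group_by_status(client.list_jobs()).get("RUNNING", [])
```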

get_job_status(job_id: str) -> ray.dashboard.modules.job.common.JobStatus

Get the most recent status of a job.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> sub_id = client.submit_job(entrypoint="echo hello")
>>> client.get_job_status(sub_id)
'SUCCEEDED'
Args:

job_id: The job ID or submission ID of the job whose status is being requested.

Returns:

The JobStatus of the job.

Raises:

RuntimeError: If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

get_job_logs(job_id: str) -> str

Get all logs produced by a job.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> sub_id = client.submit_job(entrypoint="echo hello") 
>>> client.get_job_logs(sub_id) 
'hello\n'
Args:

job_id: The job ID or submission ID of the job whose logs are being requested.

Returns:

A string containing the full logs of the job.

Raises:

RuntimeError: If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

async tail_job_logs(job_id: str) -> Iterator[str]

Get an iterator that follows the logs of a job.

Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> submission_id = client.submit_job( 
...     entrypoint="echo hi && sleep 5 && echo hi2")
>>> async for lines in client.tail_job_logs(submission_id):
...     print(lines, end="") 
hi
hi2
Args:

job_id: The job ID or submission ID of the job whose logs are being requested.

Returns:

The iterator.

Raises:

RuntimeError: If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.
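The async for loop in the example above only works inside a coroutine. The following is a minimal sketch of driving it from a regular script with asyncio.run; collect_logs is a hypothetical helper, and client is assumed to expose tail_job_logs(job_id) returning an asynchronous iterator of log chunks, as documented above.

```python
import asyncio
from typing import Any


async def collect_logs(client: Any, submission_id: str) -> str:
    """Follow a job's logs until the stream ends and return them as one string."""
    chunks = []
    async for lines in client.tail_job_logs(submission_id):
        print(lines, end="")  # echo each chunk as it arrives
        chunks.append(lines)
    return "".join(chunks)


# Usage from a non-async script, assuming `client` and `submission_id`
# were created as in the example above:
#   logs = asyncio.run(collect_logs(client, submission_id))
```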

JobStatus

class ray.job_submission.JobStatus(value)

An enumeration for describing the status of a job.

PENDING = 'PENDING'

The job has not started yet, likely waiting for the runtime_env to be set up.

RUNNING = 'RUNNING'

The job is currently running.

STOPPED = 'STOPPED'

The job was intentionally stopped by the user.

SUCCEEDED = 'SUCCEEDED'

The job finished successfully.

FAILED = 'FAILED'

The job failed.

is_terminal() -> bool

Return whether or not this status is terminal.

A terminal status is one that cannot transition to any other status. The terminal statuses are “STOPPED”, “SUCCEEDED”, and “FAILED”.

Returns:

True if this status is terminal, otherwise False.
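Terminal statuses are what a caller typically waits for. The following is a minimal polling sketch; wait_until_terminal, timeout_s, and poll_s are hypothetical names, and client is assumed to expose get_job_status(job_id) as documented above. It compares against the three terminal status names listed above rather than relying on a particular Ray version.

```python
import time
from typing import Any

# The terminal statuses listed above.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED"}


def wait_until_terminal(
    client: Any, submission_id: str, timeout_s: float = 60.0, poll_s: float = 1.0
) -> str:
    """Poll get_job_status() until the job reaches a terminal status.

    Returns the terminal status name, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_job_status(submission_id)
        # Accept either a plain string or an enum member with a .value.
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATUSES:
            return name
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{submission_id} still {name} after {timeout_s}s")
        time.sleep(poll_s)
```

When the returned status is a JobStatus member, calling its is_terminal() method is an alternative to the explicit set.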

JobInfo

class ray.job_submission.JobInfo(status: ray.dashboard.modules.job.common.JobStatus, entrypoint: str, message: Optional[str] = None, error_type: Optional[str] = None, start_time: Optional[int] = None, end_time: Optional[int] = None, metadata: Optional[Dict[str, str]] = None, runtime_env: Optional[Dict[str, Any]] = None)

A class for recording information associated with a job and its execution.

status: ray.dashboard.modules.job.common.JobStatus

The status of the job.

entrypoint: str

The entrypoint command for this job.

message: Optional[str] = None

A message describing the status in more detail.

start_time: Optional[int] = None

The time when the job was started. A Unix timestamp in ms.

end_time: Optional[int] = None

The time when the job moved into a terminal state. A Unix timestamp in ms.

metadata: Optional[Dict[str, str]] = None

Arbitrary user-provided metadata for the job.

runtime_env: Optional[Dict[str, Any]] = None

The runtime environment for the job.
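Because start_time and end_time are described above as Unix timestamps in milliseconds, they need to be divided by 1000 before use with Python's datetime. The following is a minimal sketch; the helper names ms_to_datetime and duration_seconds are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def ms_to_datetime(timestamp_ms: int) -> datetime:
    """Convert a Unix timestamp in milliseconds to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def duration_seconds(start_time: Optional[int], end_time: Optional[int]) -> Optional[float]:
    """Return the run time in seconds, or None if either timestamp is missing."""
    if start_time is None or end_time is None:
        return None
    return (end_time - start_time) / 1000


print(ms_to_datetime(0).isoformat())   # 1970-01-01T00:00:00+00:00
print(duration_seconds(1_000, 3_500))  # 2.5
print(duration_seconds(1_000, None))   # None
```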