Ray Job Submission API Reference

For an overview with examples see Ray Jobs.

For the CLI reference see Ray Job Submission CLI Reference.

Job Submission SDK

JobSubmissionClient

class ray.job_submission.JobSubmissionClient(address: Optional[str] = None, create_cluster_if_needed: bool = False, cookies: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, Any]] = None, headers: Optional[Dict[str, Any]] = None)

A local client for submitting and interacting with jobs on a remote cluster.

Submits requests over HTTP to the job server on the cluster using the REST API.

submit_job(*, entrypoint: str, job_id: Optional[str] = None, runtime_env: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, str]] = None, submission_id: Optional[str] = None) -> str

Submit and execute a job asynchronously.

When a job is submitted, it runs once to completion or failure. Retries or different runs with different parameters should be handled by the submitter. Jobs are bound to the lifetime of a Ray cluster, so if the cluster goes down, all running jobs on that cluster will be terminated.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> client.submit_job( 
...     entrypoint="python script.py",
...     runtime_env={
...         "working_dir": "./",
...         "pip": ["requests==2.26.0"]
...     }
... )  
'raysubmit_4LamXRuQpYdSMg7J'
Parameters
  • entrypoint – The shell command to run for this job.

  • submission_id – A unique ID for this job.

  • runtime_env – The runtime environment to install and run this job in.

  • metadata – Arbitrary data to store along with this job.

  • job_id – DEPRECATED. This parameter has been renamed to submission_id.

Returns

The submission ID of the submitted job. If submission_id is not specified, a random unique ID is generated.

Raises
  • RuntimeError – If the request to the job server fails, or if the specified submission_id has already been used by a job on this cluster.

PublicAPI (beta): This API is in beta and may change before becoming stable.
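Because each submission runs exactly once, running the same script with different parameters is left to the submitter. The following is a minimal sketch of that pattern, not part of the Ray API: `submit_sweep`, the `--lr` flag, and the metadata keys are hypothetical names chosen for illustration, and `client` is assumed to be a `JobSubmissionClient` (or any object with a compatible `submit_job` method).

```python
import shlex
from typing import Any, Dict, Iterable


def submit_sweep(
    client: Any, script: str, learning_rates: Iterable[float]
) -> Dict[float, str]:
    """Submit one job per learning rate; return {learning_rate: submission_id}.

    Hypothetical helper: the --lr flag is an assumption about the script.
    """
    submissions: Dict[float, str] = {}
    for lr in learning_rates:
        # The entrypoint is a shell command, so quote every argument.
        entrypoint = f"python {shlex.quote(script)} --lr {shlex.quote(str(lr))}"
        submissions[lr] = client.submit_job(
            entrypoint=entrypoint,
            # Metadata values are strings (Dict[str, str] in the signature).
            metadata={"sweep_param": "lr", "sweep_value": str(lr)},
        )
    return submissions
```

With a real client this might be called as `submit_sweep(client, "train.py", [0.1, 0.01])`, where `train.py` is a placeholder script name.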

stop_job(job_id: str) -> bool

Request a job to exit asynchronously.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> sub_id = client.submit_job(entrypoint="sleep 10") 
>>> client.stop_job(sub_id) 
True
Parameters

job_id – The job ID or submission ID for the job to be stopped.

Returns

True if the job was running, otherwise False.

Raises
  • RuntimeError – If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

get_job_info(job_id: str) -> ray.dashboard.modules.job.pydantic_models.JobDetails

Get the latest status and other information associated with a job.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> submission_id = client.submit_job(entrypoint="sleep 1") 
>>> client.get_job_info(submission_id)
JobInfo(status='SUCCEEDED', message='Job finished successfully.',
error_type=None, start_time=1647388711, end_time=1647388712,
metadata={}, runtime_env={})
Parameters
  • job_id – The job ID or submission ID of the job whose information is being requested.

Returns

The JobDetails for the job.

Raises
  • RuntimeError – If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

list_jobs() -> List[ray.dashboard.modules.job.pydantic_models.JobDetails]

List all jobs along with their status and other information.

Lists all jobs that have ever run on the cluster, including jobs that are currently running and jobs that are no longer running.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> client.submit_job(entrypoint="echo hello") 
>>> client.submit_job(entrypoint="sleep 2") 
>>> client.list_jobs() 
[JobDetails(status='SUCCEEDED',
job_id='03000000', type='submission',
submission_id='raysubmit_4LamXRuQpYdSMg7J',
message='Job finished successfully.', error_type=None,
start_time=1647388711, end_time=1647388712, metadata={}, runtime_env={}),
JobDetails(status='RUNNING',
job_id='04000000', type='submission',
submission_id='raysubmit_1dxCeNvG1fCMVNHG',
message='Job is currently running.', error_type=None,
start_time=1647454832, end_time=None, metadata={}, runtime_env={})]
Returns

A list of JobDetails objects, one for each job, containing its status and other information.

Raises

RuntimeError – If the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

get_job_status(job_id: str) -> ray.dashboard.modules.job.common.JobStatus

Get the most recent status of a job.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> submission_id = client.submit_job(entrypoint="echo hello")
>>> client.get_job_status(submission_id)
'SUCCEEDED'
Parameters
  • job_id – The job ID or submission ID of the job whose status is being requested.

Returns

The JobStatus of the job.

Raises
  • RuntimeError – If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.
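Since `submit_job` returns immediately, callers that need the final result usually poll `get_job_status` until the job reaches one of the terminal statuses listed under JobStatus (STOPPED, SUCCEEDED, FAILED). The following is one possible sketch of such a loop. `wait_until_terminal`, `timeout_s`, and `poll_interval_s` are hypothetical names, not part of the SDK.

```python
import time
from typing import Any

# Terminal statuses as documented for JobStatus.is_terminal().
TERMINAL_STATUSES = frozenset({"STOPPED", "SUCCEEDED", "FAILED"})


def wait_until_terminal(
    client: Any,
    submission_id: str,
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
) -> str:
    """Poll get_job_status until the job is terminal; return the status name.

    Raises TimeoutError if the job is still active when the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_job_status(submission_id)
        # Accept plain strings as well as enum members.
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATUSES:
            return name
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {submission_id} still {name} after {timeout_s} seconds"
            )
        time.sleep(poll_interval_s)
```

A monotonic clock is used for the deadline so that system clock adjustments do not shorten or extend the wait.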

get_job_logs(job_id: str) -> str

Get all logs produced by a job.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> sub_id = client.submit_job(entrypoint="echo hello") 
>>> client.get_job_logs(sub_id) 
'hello\n'
Parameters
  • job_id – The job ID or submission ID of the job whose logs are being requested.

Returns

A string containing the full logs of the job.

Raises
  • RuntimeError – If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.

async tail_job_logs(job_id: str) -> Iterator[str]

Get an iterator that follows the logs of a job.

Example

>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265") 
>>> submission_id = client.submit_job(
...     entrypoint="echo hi && sleep 5 && echo hi2")
>>> async for lines in client.tail_job_logs(submission_id):
...     print(lines, end="")
hi
hi2
Parameters
  • job_id – The job ID or submission ID of the job whose logs are being requested.

Returns

An asynchronous iterator over the job's log output.

Raises
  • RuntimeError – If the job does not exist or if the request to the job server fails.

PublicAPI (beta): This API is in beta and may change before becoming stable.
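The `async for` loop in the example above only works inside a coroutine, so a plain script needs an event loop to drive it. The sketch below shows one way to do that with `asyncio.run`. `collect_job_logs` and `fetch_job_logs` are hypothetical helper names, and `client` is assumed to be any object whose `tail_job_logs` can be iterated with `async for`.

```python
import asyncio
from typing import Any, List


async def collect_job_logs(client: Any, submission_id: str) -> str:
    """Stream a job's logs to stdout and return them as one string."""
    chunks: List[str] = []
    async for lines in client.tail_job_logs(submission_id):
        print(lines, end="")  # echo each chunk as soon as it arrives
        chunks.append(lines)
    return "".join(chunks)


def fetch_job_logs(client: Any, submission_id: str) -> str:
    """Synchronous wrapper for scripts that have no running event loop."""
    return asyncio.run(collect_job_logs(client, submission_id))
```

Note that `asyncio.run` cannot be called from inside an already running event loop (for example in a notebook); in that setting, `await collect_job_logs(client, submission_id)` directly instead.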

JobStatus

class ray.job_submission.JobStatus(value)

An enumeration for describing the status of a job.

PENDING = 'PENDING'

The job has not started yet, likely waiting for the runtime_env to be set up.

RUNNING = 'RUNNING'

The job is currently running.

STOPPED = 'STOPPED'

The job was intentionally stopped by the user.

SUCCEEDED = 'SUCCEEDED'

The job finished successfully.

FAILED = 'FAILED'

The job failed.

is_terminal() -> bool

Return whether or not this status is terminal.

A terminal status is one that cannot transition to any other status. The terminal statuses are "STOPPED", "SUCCEEDED", and "FAILED".

Returns

True if this status is terminal, otherwise False.

JobInfo

class ray.job_submission.JobInfo(status: ray.dashboard.modules.job.common.JobStatus, entrypoint: str, message: Optional[str] = None, error_type: Optional[str] = None, start_time: Optional[int] = None, end_time: Optional[int] = None, metadata: Optional[Dict[str, str]] = None, runtime_env: Optional[Dict[str, Any]] = None)

A class for recording information associated with a job and its execution.

status: ray.dashboard.modules.job.common.JobStatus

The status of the job.

entrypoint: str

The entrypoint command for this job.

message: Optional[str] = None

A message describing the status in more detail.

start_time: Optional[int] = None

The time when the job was started. A Unix timestamp in ms.

end_time: Optional[int] = None

The time when the job moved into a terminal state. A Unix timestamp in ms.

metadata: Optional[Dict[str, str]] = None

Arbitrary user-provided metadata for the job.

runtime_env: Optional[Dict[str, Any]] = None

The runtime environment for the job.
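Because `start_time` and `end_time` are documented as Unix timestamps in milliseconds and may be None, computing a duration or a readable time needs a small amount of care. The helpers below are an illustrative sketch: `job_duration_seconds` and `to_utc` are hypothetical names, and `info` is assumed to be any object with `start_time` and `end_time` attributes.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def job_duration_seconds(info: Any) -> Optional[float]:
    """Return the job's run time in seconds, or None if it is not yet known."""
    if info.start_time is None or info.end_time is None:
        # The job has not started, or has not reached a terminal state.
        return None
    # Both fields are in milliseconds, so convert the difference to seconds.
    return (info.end_time - info.start_time) / 1000.0


def to_utc(timestamp_ms: Optional[int]) -> Optional[datetime]:
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    if timestamp_ms is None:
        return None
    return datetime.fromtimestamp(timestamp_ms / 1000.0, tz=timezone.utc)
```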