Ray Job Submission API¶
For an overview with examples see Ray Job Submission.
Job Submission CLI¶
ray job submit¶
Submits a job to be run on the cluster.
- Example:
ray job submit -- python my_script.py --arg=val
PublicAPI: This API is stable across Ray releases.
ray job submit [OPTIONS] ENTRYPOINT...
Options
- --address <address>¶
Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.
- --job-id <job_id>¶
Job ID to specify for the job. If not provided, one will be generated.
- --runtime-env <runtime_env>¶
Path to a local YAML file containing a runtime_env definition.
- --runtime-env-json <runtime_env_json>¶
JSON-serialized runtime_env dictionary.
- --working-dir <working_dir>¶
Directory containing files that your job will run in. Can be a local directory or a remote URI to a .zip file (S3, GS, HTTP). If specified, this overrides the option in --runtime-env.
- --no-wait¶
If set, does not stream logs or wait for the job to exit.
- --log-style <log_style>¶
If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.
- Options
auto | record | pretty
- --log-color <log_color>¶
Use color logging. Auto enables color logging if stdout is a TTY.
- Options
auto | false | true
- -v, --verbose¶
Arguments
- ENTRYPOINT¶
Required argument(s)
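The options above can also be assembled programmatically. The following is a minimal sketch, not part of Ray: the helper name build_submit_command is hypothetical, and it only builds the argument list from the flags documented above without contacting a cluster.

```python
import json
import shlex
from typing import Any, Dict, List, Optional


def build_submit_command(
    entrypoint: List[str],
    address: Optional[str] = None,
    job_id: Optional[str] = None,
    runtime_env: Optional[Dict[str, Any]] = None,
    working_dir: Optional[str] = None,
    no_wait: bool = False,
) -> List[str]:
    """Build an argv list for `ray job submit` (hypothetical helper)."""
    cmd = ["ray", "job", "submit"]
    if address is not None:
        cmd += ["--address", address]
    if job_id is not None:
        cmd += ["--job-id", job_id]
    if runtime_env is not None:
        # --runtime-env-json expects a JSON-serialized runtime_env dictionary.
        cmd += ["--runtime-env-json", json.dumps(runtime_env)]
    if working_dir is not None:
        cmd += ["--working-dir", working_dir]
    if no_wait:
        cmd.append("--no-wait")
    # "--" separates the options from the entrypoint, as in the example above.
    return cmd + ["--"] + list(entrypoint)


cmd = build_submit_command(
    ["python", "my_script.py", "--arg=val"],
    address="http://127.0.0.1:8265",
    runtime_env={"pip": ["requests==2.26.0"]},
    no_wait=True,
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
```

On a machine with the Ray CLI installed, the resulting list could be passed to subprocess.run(cmd, check=True).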
ray job status¶
Queries for the current status of a job.
- Example:
ray job status <my_job_id>
PublicAPI (beta): This API is in beta and may change before becoming stable.
ray job status [OPTIONS] JOB_ID
Options
- --address <address>¶
Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.
- --log-style <log_style>¶
If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.
- Options
auto | record | pretty
- --log-color <log_color>¶
Use color logging. Auto enables color logging if stdout is a TTY.
- Options
auto | false | true
- -v, --verbose¶
Arguments
- JOB_ID¶
Required argument
ray job stop¶
Attempts to stop a job.
- Example:
ray job stop <my_job_id>
PublicAPI (beta): This API is in beta and may change before becoming stable.
ray job stop [OPTIONS] JOB_ID
Options
- --address <address>¶
Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.
- --no-wait¶
If set, will not wait for the job to exit.
- --log-style <log_style>¶
If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.
- Options
auto | record | pretty
- --log-color <log_color>¶
Use color logging. Auto enables color logging if stdout is a TTY.
- Options
auto | false | true
- -v, --verbose¶
Arguments
- JOB_ID¶
Required argument
ray job logs¶
Gets the logs of a job.
- Example:
ray job logs <my_job_id>
PublicAPI (beta): This API is in beta and may change before becoming stable.
ray job logs [OPTIONS] JOB_ID
Options
- --address <address>¶
Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.
- -f, --follow¶
If set, follow the logs (like tail -f).
- --log-style <log_style>¶
If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.
- Options
auto | record | pretty
- --log-color <log_color>¶
Use color logging. Auto enables color logging if stdout is a TTY.
- Options
auto | false | true
- -v, --verbose¶
Arguments
- JOB_ID¶
Required argument
ray job list¶
Lists all running jobs and their information.
- Example:
ray job list
PublicAPI (beta): This API is in beta and may change before becoming stable.
ray job list [OPTIONS]
Options
- --address <address>¶
Address of the Ray cluster to connect to. Can also be specified using the RAY_ADDRESS environment variable.
- --log-style <log_style>¶
If ‘pretty’, outputs with formatting and color. If ‘record’, outputs record-style without formatting. ‘auto’ defaults to ‘pretty’, and disables pretty logging if stdin is not a TTY.
- Options
auto | record | pretty
- --log-color <log_color>¶
Use color logging. Auto enables color logging if stdout is a TTY.
- Options
auto | false | true
- -v, --verbose¶
Job Submission SDK¶
JobSubmissionClient¶
- class ray.job_submission.JobSubmissionClient(address: Optional[str] = None, create_cluster_if_needed: bool = False, cookies: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, Any]] = None, headers: Optional[Dict[str, Any]] = None)[source]¶
A local client for submitting and interacting with jobs on a remote cluster.
Submits requests over HTTP to the job server on the cluster using the REST API.
- submit_job(*, entrypoint: str, job_id: Optional[str] = None, runtime_env: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, str]] = None) → str [source]¶
Submit and execute a job asynchronously.
When a job is submitted, it runs once to completion or failure. Retries or different runs with different parameters should be handled by the submitter. Jobs are bound to the lifetime of a Ray cluster, so if the cluster goes down, all running jobs on that cluster will be terminated.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> client.submit_job(
...     entrypoint="python script.py",
...     runtime_env={
...         "working_dir": "./",
...         "pip": ["requests==2.26.0"]
...     }
... )
'raysubmit_4LamXRuQpYdSMg7J'
- Args:
entrypoint: The shell command to run for this job.
job_id: A unique ID for this job.
runtime_env: The runtime environment to install and run this job in.
metadata: Arbitrary data to store along with this job.
- Returns:
The job ID of the submitted job. If no job_id was specified, this is a randomly generated unique ID.
- Raises:
RuntimeError: If the request to the job server fails, or if the specified job_id has already been used by a job on this cluster.
PublicAPI (beta): This API is in beta and may change before becoming stable.
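Because runs with different parameters are left to the submitter, a small driver loop is one way to handle them. The following is a sketch under stated assumptions: the helper name submit_sweep, the script name, and the --lr flag are hypothetical, and client is any object exposing submit_job with the signature shown above.

```python
import shlex
from typing import Dict, Iterable


def submit_sweep(client, script: str, learning_rates: Iterable[float]) -> Dict[float, str]:
    """Submit one job per parameter value and return {value: job_id}."""
    job_ids = {}
    for lr in learning_rates:
        # The entrypoint is a shell command, so quote the script path.
        entrypoint = f"python {shlex.quote(script)} --lr={lr}"
        job_ids[lr] = client.submit_job(
            entrypoint=entrypoint,
            # metadata maps str to str, so the value is stringified.
            metadata={"lr": str(lr)},
        )
    return job_ids
```

With a real cluster, client would be a JobSubmissionClient instance. Each call returns immediately with a job ID, since submission is asynchronous.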
- stop_job(job_id: str) → bool [source]¶
Request a job to exit asynchronously.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> job_id = client.submit_job(entrypoint="sleep 10")
>>> client.stop_job(job_id)
True
- Args:
job_id: The job ID for the job to be stopped.
- Returns:
True if the job was running, otherwise False.
- Raises:
RuntimeError: If the job does not exist or if the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
- get_job_info(job_id: str) → ray.dashboard.modules.job.common.JobInfo [source]¶
Get the latest status and other information associated with a job.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> job_id = client.submit_job(entrypoint="sleep 1")
>>> client.get_job_info(job_id)
JobInfo(status='SUCCEEDED', message='Job finished successfully.', error_type=None, start_time=1647388711, end_time=1647388712, metadata={}, runtime_env={})
- Args:
job_id: The ID of the job whose information is being requested.
- Returns:
The JobInfo for the job.
- Raises:
RuntimeError: If the job does not exist or if the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
- list_jobs() → Dict[str, ray.dashboard.modules.job.common.JobInfo] [source]¶
List all jobs along with their status and other information.
Lists all jobs that have ever run on the cluster, including jobs that are currently running and jobs that are no longer running.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> client.submit_job(entrypoint="echo hello")
>>> client.submit_job(entrypoint="sleep 2")
>>> client.list_jobs()
{'raysubmit_4LamXRuQpYdSMg7J': JobInfo(status='SUCCEEDED', message='Job finished successfully.', error_type=None, start_time=1647388711, end_time=1647388712, metadata={}, runtime_env={}),
 'raysubmit_1dxCeNvG1fCMVNHG': JobInfo(status='RUNNING', message='Job is currently running.', error_type=None, start_time=1647454832, end_time=None, metadata={}, runtime_env={})}
- Returns:
A dictionary mapping job_ids to their information.
- Raises:
RuntimeError: If the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
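Since list_jobs returns every job that has ever run on the cluster, a summary by status is often easier to read than the full dictionary. A minimal sketch, assuming only that each value has a status attribute as JobInfo does; the helper name count_by_status is hypothetical.

```python
from collections import Counter
from typing import Any, Dict


def count_by_status(jobs: Dict[str, Any]) -> Dict[str, int]:
    """Count jobs per status name from a list_jobs()-style mapping."""
    counts: Counter = Counter()
    for info in jobs.values():
        status = info.status
        # Accept either an enum member (use its .value) or a plain string.
        counts[str(getattr(status, "value", status))] += 1
    return dict(counts)
```

It would be called as count_by_status(client.list_jobs()).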
- get_job_status(job_id: str) → ray.dashboard.modules.job.common.JobStatus [source]¶
Get the most recent status of a job.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> job_id = client.submit_job(entrypoint="echo hello")
>>> client.get_job_status(job_id)
'SUCCEEDED'
- Args:
job_id: The ID of the job whose status is being requested.
- Returns:
The JobStatus of the job.
- Raises:
RuntimeError: If the job does not exist or if the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
- get_job_logs(job_id: str) → str [source]¶
Get all logs produced by a job.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> job_id = client.submit_job(entrypoint="echo hello")
>>> client.get_job_logs(job_id)
'hello\n'
- Args:
job_id: The ID of the job whose logs are being requested.
- Returns:
A string containing the full logs of the job.
- Raises:
RuntimeError: If the job does not exist or if the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
- async tail_job_logs(job_id: str) → Iterator[str] [source]¶
Get an iterator that follows the logs of a job.
- Example:
>>> from ray.job_submission import JobSubmissionClient
>>> client = JobSubmissionClient("http://127.0.0.1:8265")
>>> job_id = client.submit_job(
...     entrypoint="echo hi && sleep 5 && echo hi2")
>>> async for lines in client.tail_job_logs(job_id):
...     print(lines, end="")
hi
hi2
- Args:
job_id: The ID of the job whose logs are being requested.
- Returns:
The iterator.
- Raises:
RuntimeError: If the job does not exist or if the request to the job server fails.
PublicAPI (beta): This API is in beta and may change before becoming stable.
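The async for loop in the example above has to run inside a coroutine. The following sketch wraps it in one; the helper name collect_logs is hypothetical, and client is assumed to be any object whose tail_job_logs yields strings asynchronously.

```python
async def collect_logs(client, job_id: str) -> str:
    """Follow a job's logs until the stream ends and return them as one string."""
    chunks = []
    async for lines in client.tail_job_logs(job_id):
        chunks.append(lines)
    return "".join(chunks)
```

From synchronous code, this could be driven with asyncio.run(collect_logs(client, job_id)).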
JobStatus¶
- class ray.job_submission.JobStatus(value)[source]¶
An enumeration for describing the status of a job.
- PENDING = 'PENDING'¶
The job has not started yet, likely waiting for the runtime_env to be set up.
- RUNNING = 'RUNNING'¶
The job is currently running.
- STOPPED = 'STOPPED'¶
The job was intentionally stopped by the user.
- SUCCEEDED = 'SUCCEEDED'¶
The job finished successfully.
- FAILED = 'FAILED'¶
The job failed.
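A common use of these statuses is polling until a job finishes. The sketch below is not a Ray API: the helper name wait_until_terminal is hypothetical, and treating STOPPED, SUCCEEDED, and FAILED as the terminal states is an assumption based on the descriptions above.

```python
import time

# Assumption: these are the states a job does not leave once reached.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED"}


def wait_until_terminal(client, job_id: str, timeout_s: float = 60.0, poll_s: float = 1.0) -> str:
    """Poll get_job_status until the job reaches a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_job_status(job_id)
        # Accept either an enum member (use its .value) or a plain string.
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATUSES:
            return name
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {name} after {timeout_s}s")
        time.sleep(poll_s)
```

The status is always checked at least once before the timeout applies, so a job that has already finished is reported even with timeout_s=0.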
JobInfo¶
- class ray.job_submission.JobInfo(status: ray.dashboard.modules.job.common.JobStatus, entrypoint: str, message: Optional[str] = None, error_type: Optional[str] = None, start_time: Optional[int] = None, end_time: Optional[int] = None, metadata: Optional[Dict[str, str]] = None, runtime_env: Optional[Dict[str, Any]] = None)[source]¶
A class for recording information associated with a job and its execution.
- status: ray.dashboard.modules.job.common.JobStatus¶
The status of the job.
- entrypoint: str¶
The entrypoint command for this job.
- message: Optional[str] = None¶
A message describing the status in more detail.
- start_time: Optional[int] = None¶
The time when the job was started. A Unix timestamp in ms.
- end_time: Optional[int] = None¶
The time when the job moved into a terminal state. A Unix timestamp in ms.
- metadata: Optional[Dict[str, str]] = None¶
Arbitrary user-provided metadata for the job.
- runtime_env: Optional[Dict[str, Any]] = None¶
The runtime environment for the job.
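Because start_time and end_time are Unix timestamps in milliseconds and may be None, computing a job's duration needs a small guard. A minimal sketch, with the hypothetical helper name job_duration_seconds, assuming only the two attributes documented above:

```python
from typing import Any, Optional


def job_duration_seconds(info: Any) -> Optional[float]:
    """Return the job's run time in seconds, or None if it has not started or finished."""
    if info.start_time is None or info.end_time is None:
        return None
    # Both fields are documented as Unix timestamps in milliseconds.
    return (info.end_time - info.start_time) / 1000.0
```

It would be called as job_duration_seconds(client.get_job_info(job_id)).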