Ray is an open-source unified framework for scaling AI and Python applications like machine learning. It provides the compute layer for parallel processing so that you don’t need to be a distributed systems expert. Ray minimizes the complexity of running your distributed individual and end-to-end machine learning workflows with these components:

  • Scalable libraries for common machine learning tasks such as data preprocessing, distributed training, hyperparameter tuning, reinforcement learning, and model serving.

  • Pythonic distributed computing primitives for parallelizing and scaling Python applications.

  • Integrations and utilities for integrating and deploying a Ray cluster with existing tools and infrastructure such as Kubernetes, AWS, GCP, and Azure.

For data scientists and machine learning practitioners, Ray lets you scale jobs without needing infrastructure expertise:

  • Easily parallelize and distribute workloads across multiple nodes and GPUs.

  • Quickly configure and access cloud compute resources.

  • Leverage the ML ecosystem with native and extensible integrations.

For distributed systems engineers, Ray automatically handles key processes:

  • Orchestration–Managing the various components of a distributed system.

  • Scheduling–Coordinating when and where tasks are executed.

  • Fault tolerance–Ensuring tasks complete regardless of inevitable points of failure.

  • Auto-scaling–Adjusting the number of resources allocated to dynamic demand.

What you can do with Ray#

These are some common ML workloads that individuals, organizations, and companies leverage Ray to build their AI applications:

Ray framework#


Stack of Ray libraries - unified toolkit for ML workloads.

Ray’s unified compute framework comprises of three layers:

  1. Ray AI Runtime–An open-source, Python, domain-specific set of libraries that equip ML engineers, data scientists, and researchers with a scalable and unified toolkit for ML applications.

  2. Ray Core–An open-source, Python, general purpose, distributed computing library that enables ML engineers and Python developers to scale Python applications and accelerate machine learning workloads.

  3. Ray cluster–A set of worker nodes connected to a common Ray head node. Ray clusters can be fixed-size, or they can autoscale up and down according to the resources requested by applications running on the cluster.

Scale machine learning workloads

Build ML applications with a toolkit of libraries for distributed data processing, model training, tuning, reinforcement learning, model serving, and more.

Build distributed applications

Build and run distributed applications with a simple and flexible API. Parallelize single machine code with little to zero code changes.

Deploy large-scale workloads

Deploy workloads on AWS, GCP, Azure or on premise. Use Ray cluster managers to run Ray on existing Kubernetes, YARN, or Slurm clusters.

Each of Ray AIR’s five native libraries distributes a specific ML task:

  • Data: Scalable, framework-agnostic data loading and transformation across training, tuning, and prediction.

  • Train: Distributed multi-node and multi-core model training with fault tolerance that integrates with popular training libraries.

  • Tune: Scalable hyperparameter tuning to optimize model performance.

  • Serve: Scalable and programmable serving to deploy models for online inference, with optional microbatching to improve performance.

  • RLlib: Scalable distributed reinforcement learning workloads that integrate with the other Ray AIR libraries.

For custom applications, the Ray Core library enables Python developers to easily build scalable, distributed systems that can run on a laptop, cluster, cloud, or Kubernetes. It’s the foundation that Ray AIR and third-party integrations (Ray ecosystem) are built on.

Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing ecosystem of community integrations.