Task Design Patterns

This section is a collection of common design patterns (and antipatterns) for Ray tasks. It is meant as a handbook for both:

  • New users trying to understand how to get started with Ray, and

  • Advanced users trying to optimize their use of Ray tasks.

You may also be interested in the companion design patterns section for actors. A short code sketch illustrating two of the patterns below follows the table of contents.

  • Pattern: Tree of tasks
    • Example use case
    • Code example
  • Pattern: Map and reduce
    • Example use case
    • Code examples
  • Pattern: Using ray.wait to limit the number of in-flight tasks
    • Example use case
    • Code example
  • Antipattern: Closure capture of large / unserializable object
    • Code example
  • Antipattern: Too fine-grained tasks
    • Code example
  • Antipattern: Unnecessary call of ray.get in a task
    • Notes
    • Code example
  • Antipattern: Calling ray.get in a loop
    • Code example
  • Antipattern: Processing results in submission order using ray.get
    • Code example
  • Antipattern: Fetching too many results at once with ray.get
    • Code example
  • Antipattern: Redefining task or actor in loop
    • Code example
  • Antipattern: Accessing Global Variable in Tasks/Actors
    • Code example
    • Notes
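
As a taste of what these pages cover, here is a minimal sketch, not the official example from the linked pages, contrasting the "Calling ray.get in a loop" antipattern with the "Using ray.wait to limit the number of in-flight tasks" pattern. The process task, the item values, and the MAX_IN_FLIGHT cap are hypothetical placeholders.

import ray

ray.init()

@ray.remote
def process(item):
    # Placeholder for real per-item work.
    return item * 2

items = list(range(100))

# Antipattern: calling ray.get inside the loop serializes execution,
# since each task must finish before the next one is even submitted.
serial_results = [ray.get(process.remote(item)) for item in items]

# Better: submit all tasks first, then fetch the results together.
parallel_results = ray.get([process.remote(item) for item in items])

# Pattern: when submitting very many tasks, cap the number in flight
# with ray.wait so pending work does not pile up in the object store.
MAX_IN_FLIGHT = 8  # hypothetical limit; tune it for your workload
in_flight, results = [], []
for item in items:
    if len(in_flight) >= MAX_IN_FLIGHT:
        # Block until at least one running task finishes.
        done, in_flight = ray.wait(in_flight, num_returns=1)
        results.extend(ray.get(done))
    in_flight.append(process.remote(item))
results.extend(ray.get(in_flight))

Bounding in-flight tasks trades a little parallelism for predictable memory use; the right cap depends on task duration and result size. The pages linked above walk through each pattern and antipattern in detail.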
