User Guides
This section explains how to use Ray’s key concepts to build distributed applications.
If you’re brand new to Ray, we recommend starting with the walkthrough.
- Tasks
- Actors
- Specifying required resources
- Calling the actor
- Passing Around Actor Handles
- Generators
- Cancelling Actor Tasks
- Scheduling
- Fault Tolerance
- FAQ: Actors, Workers and Resources
- Task Events
- More about Ray Actors
- Objects
- Environment Dependencies
- Concepts
- Preparing an environment using the Ray Cluster launcher
- Runtime environments
- Specifying a Runtime Environment Per-Job
- Specifying a Runtime Environment Per-Task or Per-Actor
- Common Workflows
- API Reference
- Frequently Asked Questions
- Are environments installed on every node?
- When is the environment installed?
- Where are the environments cached?
- How long does it take to install or to load from cache?
- What is the relationship between runtime environments and Docker?
- My runtime_env was installed, but when I log into the node I can’t import the packages.
- Remote URIs
- Hosting a Dependency on a Remote Git Provider: Step-by-Step Guide
- Debugging
- Scheduling
- Resources
- Scheduling Strategies
- Locality-Aware Scheduling
- More about Ray Scheduling
- Resources
- Accelerator Support
- Placement Groups
- Key Concepts
- Create a Placement Group (Reserve Resources)
- Schedule Tasks and Actors to Placement Groups (Use Reserved Resources)
- Placement Strategy
- Remove Placement Groups (Free Reserved Resources)
- Observe and Debug Placement Groups
- [Advanced] Child Tasks and Actors
- [Advanced] Named Placement Group
- [Advanced] Detached Placement Group
- [Advanced] Fault Tolerance
- API Reference
- Memory Management
- Out-Of-Memory Prevention
- Fault Tolerance
- Design Patterns & Anti-patterns
- Pattern: Using nested tasks to achieve nested parallelism
- Pattern: Using generators to reduce heap memory usage
- Pattern: Using ray.wait to limit the number of pending tasks
- Pattern: Using resources to limit the number of concurrently running tasks
- Pattern: Using asyncio to run actor methods concurrently
- Pattern: Using an actor to synchronize other tasks and actors
- Pattern: Using a supervisor actor to manage a tree of actors
- Pattern: Using pipelining to increase throughput
- Anti-pattern: Returning ray.put() ObjectRefs from a task harms performance and fault tolerance
- Anti-pattern: Calling ray.get in a loop harms parallelism
- Anti-pattern: Calling ray.get unnecessarily harms performance
- Anti-pattern: Processing results in submission order using ray.get increases runtime
- Anti-pattern: Fetching too many objects at once with ray.get causes failure
- Anti-pattern: Over-parallelizing with too fine-grained tasks harms speedup
- Anti-pattern: Redefining the same remote function or class harms performance
- Anti-pattern: Passing the same large argument by value repeatedly harms performance
- Anti-pattern: Closure capturing large objects harms performance
- Anti-pattern: Using global variables to share state between tasks and actors
- Advanced Topics