User Guides

Tip

We’d love to hear your feedback on using Tune - get in touch!

In this section, you can find material on how to use Tune and its various features. You can follow our Tune Feature Guides, look into our Practical Examples, or go through some Exercises to get started.

Tune Feature Guides

  • How does Tune work?
  • A Guide To Stopping and Resuming Tune Experiments
  • Using Callbacks and Metrics in Tune
  • A Guide To Distributed Hyperparameter Tuning
  • How To Log Tune Runs
  • Using Resources (GPUs, Parallel & Distributed Runs)
  • Using Checkpoints For Your Experiments
  • A Guide To Working with Advanced Search Spaces
  • A simple guide to Population-based Training
  • Tune Scalability and Overhead Benchmarks
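
All of these guides build on the same basic workflow: define a trainable, pass it to tune.run with a search space, and inspect the results. As a point of reference before diving in, here is a minimal sketch of that workflow using the functional training API (tune.run, tune.report, and tune.grid_search are the documented APIs in this release; the objective function and its "x" parameter are made up for illustration):

    from ray import tune

    def objective(config):
        # Each trial evaluates one sampled configuration and
        # reports its score back to Tune. The quadratic here is
        # just a stand-in for a real training/evaluation step.
        score = config["x"] ** 2
        tune.report(score=score)

    # Run one trial per grid-search value; tune.run blocks until
    # all trials finish and returns an ExperimentAnalysis object.
    analysis = tune.run(
        objective,
        config={"x": tune.grid_search([1, 2, 3])},
    )

    print("Best config:", analysis.get_best_config(metric="score", mode="min"))

The feature guides layer additional capabilities onto this pattern, such as stopping and resuming experiments, attaching callbacks, checkpointing trials, requesting resources, and defining richer search spaces.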

© Copyright 2022, The Ray Team.