Hyperparameter Tuning with Ray Tune

Ray Train natively supports hyperparameter tuning with Ray Tune.

Figure: The Tuner takes in a Trainer and executes multiple training runs, each with a different hyperparameter configuration.

Key Concepts

Hyperparameter optimization with a Tuner involves a few key concepts:

  • A search space: the set of hyperparameters you want to tune and the values each one can take.

  • A search algorithm that chooses which configurations to try, optionally paired with a scheduler that stops unpromising trials early to speed up your experiments.

  • The search space, search algorithm, scheduler, and Trainer are passed to a Tuner, which runs the hyperparameter tuning workload by evaluating multiple hyperparameter configurations in parallel.

  • Each individual hyperparameter evaluation run is called a trial.

  • The Tuner returns its results as a ResultGrid.
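To make these terms concrete, here is a minimal pure-Python sketch of the loop a tuner conceptually runs. It is not Ray Tune's implementation: `run_trial`, `random_search`, the `eta` parameter, and the toy objective are hypothetical names made up for illustration, and trials run sequentially here, whereas a Tuner evaluates them in parallel.

```python
import random

# Search space: each hyperparameter maps to a function that samples one value.
search_space = {
    "max_depth": lambda rng: rng.choice([4, 5, 6]),
    "eta": lambda rng: rng.uniform(0.01, 0.3),
}


def run_trial(config):
    """One trial: evaluate a single configuration and return its metrics.

    A toy stand-in for a real training run; the "loss" is smallest
    at max_depth=5 and eta=0.1.
    """
    loss = (config["max_depth"] - 5) ** 2 + (config["eta"] - 0.1) ** 2
    return {"loss": loss}


def random_search(space, num_samples, seed=0):
    """Sample `num_samples` configurations and evaluate each one."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_samples):
        config = {name: sample(rng) for name, sample in space.items()}
        results.append({"config": config, "metrics": run_trial(config)})
    return results


# The list of per-trial results plays the role of a result grid.
results = random_search(search_space, num_samples=20)
best = min(results, key=lambda r: r["metrics"]["loss"])  # mode="min"
print(best["config"], best["metrics"])
```

Random sampling is only the simplest possible search algorithm; a real search algorithm would use earlier results to decide which configuration to try next.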

Note

Tuners can also be used to launch hyperparameter tuning without using Ray Train. See the Ray Tune documentation for more guides and examples.

Basic usage

To tune an existing Trainer, pass it to a Tuner.

import ray
from ray import tune
from ray.tune import Tuner
from ray.train.xgboost import XGBoostTrainer

dataset = ray.data.read_csv("s3://anonymous@air-example-data/breast_cancer.csv")

trainer = XGBoostTrainer(
    label_column="target",
    params={
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "max_depth": 4,
    },
    datasets={"train": dataset},
)

# Create Tuner
tuner = Tuner(
    trainer,
    # Add some parameters to tune
    param_space={"params": {"max_depth": tune.choice([4, 5, 6])}},
    # Specify tuning behavior
    tune_config=tune.TuneConfig(metric="train-logloss", mode="min", num_samples=2),
)
# Run the tuning job; fit() returns a ResultGrid
result_grid = tuner.fit()

How to configure a Tuner?

There are two main configuration objects that can be passed into a Tuner: the TuneConfig and the RunConfig.

The TuneConfig contains tuning-specific settings, including:

  • the search algorithm to use

  • the metric and mode to rank results

  • the amount of parallelism to use

Here are some common configurations for TuneConfig:

from ray.tune import TuneConfig
from ray.tune.search.bayesopt import BayesOptSearch

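tune_config = TuneConfig(
    metric="loss",  # metric used to rank trials
    mode="min",  # lower values of the metric are better
    max_concurrent_trials=10,  # run at most 10 trials at a time
    num_samples=100,  # number of configurations to sample
    search_alg=BayesOptSearch(),  # requires the bayesian-optimization package
)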

See the TuneConfig API reference for more details.

The RunConfig contains settings that are not specific to tuning, including:

  • failure/retry configurations

  • verbosity levels

  • the name of the experiment

  • the logging directory

  • checkpoint configurations

  • custom callbacks

  • integration with cloud storage

Here are some common configurations for RunConfig:

from ray.train import CheckpointConfig, RunConfig

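run_config = RunConfig(
    name="MyExperiment",  # experiment name
    storage_path="s3://...",  # where results and checkpoints are stored
    # Save a checkpoint every 2 iterations
    checkpoint_config=CheckpointConfig(checkpoint_frequency=2),
)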

See the RunConfig API reference for more details.

Search Space configuration

A Tuner takes a param_space argument that defines the search space from which hyperparameter configurations are sampled.

Depending on the model and dataset, you may want to tune:

  • The training batch size

  • The learning rate for deep learning training (e.g., image classification)

  • The maximum depth for tree-based models (e.g., XGBoost)

You can use a Tuner to tune most arguments and configurations for Ray Train, including the datasets passed to the Trainer (see Advanced Tuning).

Read more about search spaces in the Ray Tune documentation.

Train - Tune gotchas

There are a couple of gotchas about parameter specification when using Tuners with Trainers:

  • By default, configuration dictionaries and config objects given to the Trainer are deep-merged with those in the Tuner's param_space.

  • When a parameter is set in both the Trainer and the Tuner, the value from the Tuner's param_space takes precedence.

  • Exception: arguments of the RunConfig and TuneConfig cannot be tuned.

See Getting Data in and out of Tune for an example.
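The deep-merge behavior described in this section can be sketched in plain Python. This is an illustration of the semantics only, not Ray's code: `deep_merge` is a hypothetical helper, and `sampled` stands in for one configuration drawn from the param_space of the Basic usage example.

```python
import copy


def deep_merge(base, override):
    """Return a copy of `base` with `override` merged in recursively.

    Nested dicts are merged key by key; any other value in `override`
    replaces the one in `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Arguments given to the Trainer (from the Basic usage example).
trainer_kwargs = {
    "label_column": "target",
    "params": {
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "max_depth": 4,
    },
}

# One configuration sampled from param_space for a single trial.
sampled = {"params": {"max_depth": 6}}

trial_kwargs = deep_merge(trainer_kwargs, sampled)
print(trial_kwargs["params"])
# {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 6}
```

Only `max_depth` changes for the trial; `objective` and `eval_metric` carry over from the Trainer because the nested `params` dictionaries are merged rather than replaced wholesale.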

Advanced Tuning

Tuners also offer the ability to tune over different data preprocessing steps and different training/validation datasets, as shown in the following snippet.

import ray
from ray import tune
from ray.data.preprocessors import StandardScaler


def get_dataset():
    ds1 = ray.data.read_csv("s3://anonymous@air-example-data/breast_cancer.csv")
    prep_v1 = StandardScaler(["worst radius", "worst area"])
    ds1 = prep_v1.fit_transform(ds1)
    return ds1


def get_another_dataset():
    ds2 = ray.data.read_csv(
        "s3://anonymous@air-example-data/breast_cancer_with_categorical.csv"
    )
    prep_v2 = StandardScaler(["worst concavity", "worst smoothness"])
    ds2 = prep_v2.fit_transform(ds2)
    return ds2


dataset_1 = get_dataset()
dataset_2 = get_another_dataset()

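tuner = tune.Tuner(
    trainer,  # the XGBoostTrainer defined in Basic usage
    param_space={
        "datasets": {
            # Each dataset in the grid is evaluated in its own trial
            "train": tune.grid_search([dataset_1, dataset_2]),
        },
        # Your other parameters go here
    },
)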
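Because the snippet above uses `tune.grid_search` with two datasets, it produces one trial per dataset. When several grid-searched entries are combined, the trial count is the product of the grid sizes, and the whole grid is repeated `num_samples` times. The following pure-Python sketch illustrates that arithmetic only: `expand_grid` is a hypothetical helper rather than a Ray Tune function, the strings stand in for the real dataset objects, and the `max_depth` grid is an assumed second parameter.

```python
import itertools


def expand_grid(grid, num_samples=1):
    """List every combination of grid values, repeated `num_samples` times."""
    names = list(grid)
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    return combos * num_samples


grid = {
    "train_dataset": ["dataset_1", "dataset_2"],  # stand-ins for the datasets above
    "max_depth": [4, 5, 6],  # hypothetical second grid-searched parameter
}

trials = expand_grid(grid)
print(len(trials))  # 6
print(len(expand_grid(grid, num_samples=2)))  # 12
```

Since the number of trials grows multiplicatively, it is worth estimating the total before grid-searching over several parameters at once.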