ray.tune.schedulers.PopulationBasedTrainingReplay

class ray.tune.schedulers.PopulationBasedTrainingReplay(policy_file: str)

Bases: ray.tune.schedulers.trial_scheduler.FIFOScheduler

Replays a Population Based Training run.

Population Based Training does not return a single hyperparameter configuration, but rather a schedule of configurations. For instance, PBT might discover that a larger learning rate leads to good results in the first training iterations, but that a smaller learning rate is preferable later.

This scheduler enables replaying these parameter schedules from a finished PBT run. This requires that population based training has been run with log_config=True, which is the default setting.

The scheduler will only accept and train a single trial. It will start with the initial config of the existing trial and update the config according to the schedule.
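The idea of "start with the initial config, then update it according to the schedule" can be pictured with a small, library-free sketch. Everything here is an assumption made for illustration: the `config_at` helper, the `(iteration, config)` schedule layout, and the rule that a change takes effect from its listed iteration onward are hypothetical and are not taken from Ray's implementation or from the policy file format.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schedule layout: (training_iteration, config) pairs,
# sorted by iteration. This is NOT Ray's policy file format.
Schedule = List[Tuple[int, Dict[str, Any]]]


def config_at(
    initial_config: Dict[str, Any], schedule: Schedule, iteration: int
) -> Dict[str, Any]:
    """Return the config in effect at `iteration` for a single replayed trial."""
    config = dict(initial_config)
    for change_at, new_config in schedule:
        if iteration < change_at:
            break  # later entries have not been reached yet
        config = dict(new_config)  # the most recent change wins
    return config


initial = {"lr": 0.1}
schedule = [(4, {"lr": 0.05}), (8, {"lr": 0.01})]

for it in (1, 4, 7, 8, 10):
    print(it, config_at(initial, schedule, it))
```

In this toy schedule the learning rate starts large and shrinks later, which mirrors the kind of schedule PBT might discover.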

Parameters

policy_file – The PBT policy file. Usually this is stored in ~/ray_results/experiment_name/pbt_policy_xxx.txt where xxx is the trial ID.
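To find candidate policy files, something like the following standard-library sketch may help. The `find_policy_files` helper is hypothetical (it is not part of Ray), and it assumes only the `pbt_policy_xxx.txt` naming pattern mentioned above; the directory you pass in is up to you.

```python
from pathlib import Path
from typing import List


def find_policy_files(experiment_dir: str) -> List[Path]:
    """List files named pbt_policy_*.txt directly inside `experiment_dir`.

    Hypothetical helper: returns an empty list if the directory is missing.
    """
    root = Path(experiment_dir).expanduser()  # resolve a leading "~"
    if not root.is_dir():
        return []
    return sorted(root.glob("pbt_policy_*.txt"))


# Example call, using the placeholder path from the parameter description.
for path in find_policy_files("~/ray_results/experiment_name"):
    print(path)
```

Any one of the returned paths could then be passed as `policy_file`.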

Example:

# Replaying a result from ray.tune.examples.pbt_convnet_example
from ray import air, tune

from ray.tune.examples.pbt_convnet_example import PytorchTrainable
from ray.tune.schedulers import PopulationBasedTrainingReplay

replay = PopulationBasedTrainingReplay(
    "~/ray_results/pbt_test/pbt_policy_XXXXX_00001.txt")

tuner = tune.Tuner(
    PytorchTrainable,
    run_config=air.RunConfig(
        stop={"training_iteration": 100}
    ),
    tune_config=tune.TuneConfig(
        scheduler=replay,
    ),
)
tuner.fit()

PublicAPI: This API is stable across Ray releases.

on_trial_add(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial)

Called when a new trial is added to the trial runner.

on_trial_result(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, result: Dict) → str

Called on each intermediate result returned by a trial.

At this point, the trial scheduler can make a decision by returning one of CONTINUE, PAUSE, or STOP. This method is only called when the trial is in the RUNNING state.
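The shape of such a decision can be sketched without Ray installed. The constants below are plain string stand-ins for the scheduler's CONTINUE, PAUSE, and STOP decisions, and `decide`, `max_iterations`, and `pause_every` are hypothetical names invented for this illustration. A real scheduler would implement this logic inside `on_trial_result`, and this is not how PopulationBasedTrainingReplay itself decides.

```python
from typing import Any, Dict

# Stand-ins for the scheduler's decision constants (assumed for this sketch).
CONTINUE = "CONTINUE"
PAUSE = "PAUSE"
STOP = "STOP"


def decide(
    result: Dict[str, Any], max_iterations: int = 100, pause_every: int = 0
) -> str:
    """Map one intermediate result to a scheduling decision."""
    iteration = result.get("training_iteration", 0)
    if iteration >= max_iterations:
        return STOP  # iteration budget exhausted
    if pause_every and iteration % pause_every == 0:
        return PAUSE  # periodically yield resources
    return CONTINUE


print(decide({"training_iteration": 3}))
print(decide({"training_iteration": 10}, pause_every=5))
print(decide({"training_iteration": 100}))
```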

debug_string() → str

Returns a human-readable message for printing to the console.