ray.tune.schedulers.PopulationBasedTrainingReplay
- class ray.tune.schedulers.PopulationBasedTrainingReplay(policy_file: str)
Bases: ray.tune.schedulers.trial_scheduler.FIFOScheduler
Replays a Population Based Training run.
Population Based Training does not return a single hyperparameter configuration, but rather a schedule of configurations. For instance, PBT might discover that a larger learning rate leads to good results in the first training iterations, but that a smaller learning rate is preferable later.
This scheduler enables replaying these parameter schedules from a finished PBT run. This requires that population based training has been run with log_config=True, which is the default setting.
The scheduler will only accept and train a single trial. It will start with the initial config of the existing trial and update the config according to the schedule.
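To make the idea of a "schedule of configurations" concrete, the following is a minimal sketch in plain Python. It does not use Ray and is not the scheduler's actual implementation; the helper name config_at, the (iteration, config) pair layout, and the learning-rate values are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List, Tuple

Config = Dict[str, Any]


def config_at(initial: Config,
              schedule: List[Tuple[int, Config]],
              iteration: int) -> Config:
    """Return the config in effect at `iteration`.

    `schedule` is a list of (change_iteration, new_config) pairs sorted by
    change_iteration. Each new config takes effect once training has reached
    its change iteration. (Hypothetical layout, for illustration only.)
    """
    active = initial
    for change_iteration, new_config in schedule:
        if iteration < change_iteration:
            break
        active = new_config
    return active


# Hypothetical schedule: a larger learning rate early, smaller ones later.
initial_config = {"lr": 0.1}
schedule = [(20, {"lr": 0.01}), (60, {"lr": 0.001})]

for it in (0, 19, 20, 59, 60, 100):
    print(it, config_at(initial_config, schedule, it))
```

A single replayed trial walks through such a schedule in order, which is why the scheduler accepts only one trial.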
- Parameters
policy_file – The PBT policy file. Usually this is stored in ~/ray_results/experiment_name/pbt_policy_xxx.txt, where xxx is the trial ID.
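Since the trial ID is part of the file name, it can help to list the candidate policy files before choosing one. The sketch below uses only the standard library and assumes the pbt_policy_*.txt naming described above; the helper find_policy_files and the file names in the demo are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def find_policy_files(experiment_dir: str) -> List[Path]:
    """List pbt_policy_*.txt files directly inside an experiment directory.

    Expands a leading "~" and returns an empty list if the directory
    does not exist.
    """
    root = Path(experiment_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(root.glob("pbt_policy_*.txt"))


# Demo against a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("pbt_policy_aaaaa_00000.txt",
                 "pbt_policy_aaaaa_00001.txt",
                 "params.json"):
        (Path(tmp) / name).touch()
    found = [p.name for p in find_policy_files(tmp)]
    print(found)
```

In practice the argument would be the experiment directory, for example "~/ray_results/experiment_name", and one of the returned paths would be passed as policy_file.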
Example:
    # Replaying a result from ray.tune.examples.pbt_convnet_example
    from ray import air, tune
    from ray.tune.examples.pbt_convnet_example import PytorchTrainable
    from ray.tune.schedulers import PopulationBasedTrainingReplay

    replay = PopulationBasedTrainingReplay(
        "~/ray_results/pbt_test/pbt_policy_XXXXX_00001.txt")

    tuner = tune.Tuner(
        PytorchTrainable,
        run_config=air.RunConfig(
            stop={"training_iteration": 100}
        ),
        tune_config=tune.TuneConfig(
            scheduler=replay,
        ),
    )
    tuner.fit()
PublicAPI: This API is stable across Ray releases.
- on_trial_add(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial)
Called when a new trial is added to the trial runner.
- on_trial_result(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, result: Dict) -> str
Called on each intermediate result returned by a trial.
At this point, the trial scheduler can make a decision by returning one of CONTINUE, PAUSE, and STOP. This will only be called when the trial is in the RUNNING state.
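The shape of such a decision can be sketched without Ray. The example below is an illustrative policy only and is not how PopulationBasedTrainingReplay decides; the string constants stand in for the scheduler's CONTINUE, PAUSE, and STOP values, and the decide helper, its change_points argument, and its max_iterations argument are assumptions.

```python
from typing import Dict, List

# Stand-ins for the decision values named in the text. Plain strings are
# used here so the sketch runs without Ray installed (an assumption).
CONTINUE = "CONTINUE"
PAUSE = "PAUSE"
STOP = "STOP"


def decide(result: Dict, change_points: List[int], max_iterations: int) -> str:
    """Pick a decision from one intermediate result (hypothetical policy).

    - STOP once the iteration budget is used up,
    - PAUSE at an iteration where the config should change,
    - CONTINUE otherwise.
    """
    iteration = result["training_iteration"]
    if iteration >= max_iterations:
        return STOP
    if iteration in change_points:
        return PAUSE
    return CONTINUE


for it in (5, 20, 100):
    print(it, decide({"training_iteration": it}, [20, 60], 100))
```

Pausing at a change point gives the scheduler a chance to swap in the next config before the trial resumes.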