ray.tune.Tuner.restore

classmethod Tuner.restore(path: str, trainable: Union[str, Callable, Type[ray.tune.trainable.trainable.Trainable], BaseTrainer], resume_unfinished: bool = True, resume_errored: bool = False, restart_errored: bool = False, param_space: Optional[Dict[str, Any]] = None, storage_filesystem: Optional[pyarrow._fs.FileSystem] = None) → Tuner

Restores a Tuner after a previously failed run.

All trials from the existing run will be added to the result table. The argument flags control how existing but unfinished or errored trials are resumed.

Finished trials are always added to the result table. They will not be resumed.

Unfinished trials can be controlled with the resume_unfinished flag. If True (default), they will be continued. If False, they will be added as terminated trials (even if they were only created and never trained).

Errored trials can be controlled with the resume_errored and restart_errored flags. The former will resume errored trials from their latest checkpoints. The latter will restart errored trials from scratch and prevent loading their last checkpoints.
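The flag semantics described above can be summarized as a small decision function. This is an illustrative sketch only, not Ray's implementation: `TrialState` and `restore_action` are hypothetical names, and rejecting the combination of resume_errored and restart_errored is an assumption of the sketch, since the text above does not say how that combination is handled.

```python
from enum import Enum


class TrialState(Enum):
    """Hypothetical trial states used only by this sketch."""

    FINISHED = "finished"
    UNFINISHED = "unfinished"
    ERRORED = "errored"


def restore_action(
    state: TrialState,
    resume_unfinished: bool = True,
    resume_errored: bool = False,
    restart_errored: bool = False,
) -> str:
    """Return what happens to a trial in the given state on restore."""
    # Assumption of this sketch: the two errored-trial flags are treated
    # as mutually exclusive. The documentation does not specify this.
    if resume_errored and restart_errored:
        raise ValueError("Set at most one of resume_errored and restart_errored.")

    if state is TrialState.FINISHED:
        # Finished trials are listed in the results but never run again.
        return "listed only"

    if state is TrialState.UNFINISHED:
        # Continued by default; otherwise recorded as terminated.
        return "continued" if resume_unfinished else "added as terminated"

    # Errored trials: resume from checkpoint, restart, or leave as-is.
    if resume_errored:
        return "resumed from latest checkpoint"
    if restart_errored:
        return "restarted from scratch"
    return "listed only"
```

With the default flags, only unfinished trials run again. Errored trials are listed in the results but are not rescheduled unless one of the two errored-trial flags is set.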

Parameters
  • path – The path where the previous failed run is checkpointed. This path is printed near the end of the previous run's console output. Note: depending on whether Ray Client mode is used, this path may or may not exist on your local machine.

  • trainable – The trainable to use upon resuming the experiment. This should be the same trainable that was used to initialize the original Tuner.

  • param_space – The same param_space that was passed to the original Tuner. It can optionally be re-specified because it may contain Ray object references (for example, when tuning over Datasets or over several ray.put object references). Tune expects the param_space to be otherwise unmodified; the only part used during restore is the updated object references. Changing the hyperparameter search space and then resuming is NOT supported by this API.

  • resume_unfinished – If True, will continue to run unfinished trials.

  • resume_errored – If True, will re-schedule errored trials and try to restore from their latest checkpoints.

  • restart_errored – If True, will re-schedule errored trials but force restarting them from scratch (no checkpoint will be loaded).
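As a minimal usage sketch of the parameters above: `resume_experiment` is a hypothetical helper name, and the choice to resume errored trials from their checkpoints is just one possible configuration. Ray is imported inside the function so that the helper can be defined without Ray installed; calling it requires Ray and an existing experiment directory.

```python
from typing import Any, Dict, Optional


def resume_experiment(
    path: str,
    trainable: Any,
    param_space: Optional[Dict[str, Any]] = None,
):
    """Restore a previous Tune run from `path` and continue it.

    `trainable` should be the same trainable that was used to create
    the original Tuner. `param_space` is only needed when the original
    search space contained Ray object references.
    """
    # Lazy import: only needed when the helper is actually called.
    from ray import tune

    tuner = tune.Tuner.restore(
        path,
        trainable=trainable,
        resume_unfinished=True,  # continue trials that did not finish
        resume_errored=True,     # retry errored trials from their checkpoints
        param_space=param_space,
    )
    # fit() continues the experiment and returns its results.
    return tuner.fit()
```

A call might look like `resume_experiment(experiment_path, my_trainable)`, where `experiment_path` is the path reported by the previous run and `my_trainable` is the original trainable (both placeholder names). To restart errored trials from scratch instead, pass restart_errored=True in place of resume_errored=True.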