ray.train.data_parallel_trainer.DataParallelTrainer.restore

classmethod DataParallelTrainer.restore(path: str, train_loop_per_worker: Callable[[], None] | Callable[[Dict], None] | None = None, train_loop_config: Dict | None = None, **kwargs)

Restores a DataParallelTrainer from a previously interrupted/failed run.

Parameters:

train_loop_per_worker – Optionally re-specified train loop function. Use this to re-specify a function that cannot be restored in a new Ray cluster (e.g., one that holds onto outdated object references). It should be the same training loop that was passed to the original trainer constructor.
train_loop_config – Optionally re-specified train config. Similarly, use this if the original train_loop_config contained outdated object references. Its contents should not be modified from what was originally passed in.

See BaseTrainer.restore() for descriptions of the other arguments.

Returns a restored instance of the DataParallelTrainer.
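
The following is a minimal sketch of how the arguments above might be used, not a definitive recipe. It assumes a Ray release in which the deprecated restore and can_restore class methods are still available, and it assumes the experiment directory is a local path. The helper names restore_or_create and make_train_loop_config, the config keys, and the worker count are hypothetical and chosen only for illustration.

```python
import os
from typing import Any, Dict


def train_loop_per_worker(config: Dict[str, Any]) -> None:
    """Placeholder for the same training loop given to the original trainer."""
    for _epoch in range(config["num_epochs"]):
        pass  # a real training step would go here


def make_train_loop_config() -> Dict[str, Any]:
    """Rebuild the config with the same values as the original run (hypothetical keys)."""
    return {"num_epochs": 2, "lr": 1e-3}


def restore_or_create(experiment_path: str):
    """Restore the trainer if a previous run exists at the path, else build a new one.

    Assumption: experiment_path is a local directory of the form
    <storage_path>/<experiment_name>.
    """
    # Imported lazily so the helpers above can be used without Ray installed.
    from ray.train import RunConfig, ScalingConfig
    from ray.train.data_parallel_trainer import DataParallelTrainer

    if DataParallelTrainer.can_restore(experiment_path):
        # Re-specify the loop and config in case the saved ones hold
        # outdated object references; the values themselves are unchanged.
        return DataParallelTrainer.restore(
            experiment_path,
            train_loop_per_worker=train_loop_per_worker,
            train_loop_config=make_train_loop_config(),
        )

    # No previous run found: construct a fresh trainer that writes to the same location.
    return DataParallelTrainer(
        train_loop_per_worker=train_loop_per_worker,
        train_loop_config=make_train_loop_config(),
        scaling_config=ScalingConfig(num_workers=2),
        run_config=RunConfig(
            storage_path=os.path.dirname(experiment_path),
            name=os.path.basename(experiment_path),
        ),
    )
```

Calling fit() on the returned trainer would then either resume the interrupted run or start a new one. Passing the loop and config again is optional; the sketch does so only to show where re-specified values go.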

Warning

DEPRECATED: The restore and can_restore APIs are deprecated and will be removed in a future Ray release. See this issue for more context and migration options: ray-project/ray#49454. To disable these warnings, set the environment variable RAY_TRAIN_ENABLE_V2_MIGRATION_WARNINGS=0.
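
As a small illustration, the environment variable named in the warning could be set from Python rather than from the shell. Setting it before Ray is imported is a conservative assumption; this page does not say at which point the variable is read.

```python
import os

# Silence the V2 migration warnings for this process.
# Assumption: set before importing ray so the value is visible whenever Ray reads it.
os.environ["RAY_TRAIN_ENABLE_V2_MIGRATION_WARNINGS"] = "0"
```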