ray.train.v2.api.data_parallel_trainer.DataParallelTrainer.fit
- DataParallelTrainer.fit() -> Result
- Launches the Ray Train controller to run training on workers.
- Returns:
- A Result object containing the training result. 
- Raises:
- ray.train.TrainingFailedError – A union of ControllerError and WorkerGroupError. fit() raises a ray.train.ControllerError if internal Ray Train controller logic encounters a non-retryable error or reaches the controller failure limit configured in FailureConfig. It raises a ray.train.WorkerGroupError if one or more workers fail during training and the worker group failure limit configured in FailureConfig(max_failures) is reached.
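
The two failure modes described above can be told apart at the call site. The following is a minimal sketch, not part of the Ray Train API: the helper name `fit_with_error_handling` and its return convention are hypothetical. The exception classes are passed in as arguments, so the helper can be tried with stand-in classes and without a running Ray cluster.

```python
def fit_with_error_handling(trainer, controller_error_cls, worker_group_error_cls):
    """Call trainer.fit() and report which outcome occurred.

    Hypothetical helper for illustration only. `trainer` is any object with
    a fit() method, such as a DataParallelTrainer. The two exception classes
    are parameters so the helper does not import Ray itself.
    """
    try:
        result = trainer.fit()
    except controller_error_cls as exc:
        # Controller logic hit a non-retryable error or its failure limit.
        return ("controller_error", exc)
    except worker_group_error_cls as exc:
        # Workers failed and the worker group failure limit was reached.
        return ("worker_group_error", exc)
    # Training finished; `result` is whatever fit() returned.
    return ("ok", result)


# With Ray Train V2 installed, a call might look like this (assumption:
# `trainer` is an already-configured DataParallelTrainer):
#
#   import ray.train
#   outcome, payload = fit_with_error_handling(
#       trainer, ray.train.ControllerError, ray.train.WorkerGroupError
#   )
```

Catching the two concrete error classes separately, rather than the union, lets the caller react differently to controller failures and worker group failures.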