ray.train.torch.TorchTrainer.fit#
- TorchTrainer.fit() Result #
Launches the Ray Train controller to run training on workers.
- Returns:
A Result object containing the training result.
- Raises:
ray.train.TrainingFailedError – This is a union of the ControllerError and WorkerGroupError. This returns a
ray.train.ControllerError
if internal Ray Train controller logic encounters a non-retryable error or reaches the controller failure limit configured inFailureConfig
. This returns aray.train.WorkerGroupError
if one or more workers fail during training and reaches the worker group failure limit configured inFailureConfig(max_failures)
.