ray.air.util.check_ingest.DummyTrainer#

class ray.air.util.check_ingest.DummyTrainer(*args, **kwargs)[source]#

Bases: ray.train.data_parallel_trainer.DataParallelTrainer

A Trainer that does nothing except read the data for a given number of epochs.

It prints out as many debugging statistics as possible.

This is useful for debugging data ingest problems. This trainer supports the same scaling options as any other Trainer (e.g., num_workers, use_gpu).

DeveloperAPI: This API may change across minor Ray releases.
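The snippet below is a minimal sketch of how DummyTrainer might be used to check an ingest pipeline. The import paths, the num_epochs keyword argument, and the ScalingConfig usage are assumptions about the surrounding Ray AIR API and may differ across Ray versions; verify them against the constructor signature for your release.

    import ray
    from ray.air.config import ScalingConfig
    from ray.air.util.check_ingest import DummyTrainer

    # A stand-in dataset; substitute the Dataset you actually want to ingest.
    ds = ray.data.range(10_000)

    # DummyTrainer only reads the data and reports ingest statistics, so the
    # scaling options mirror what you would pass to your real Trainer.
    trainer = DummyTrainer(
        scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
        datasets={"train": ds},
        num_epochs=1,  # assumed keyword; check the constructor for your version
    )
    trainer.fit()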

preprocess_datasets()[source]#

Called during fit() to preprocess dataset attributes with preprocessor.

Note

This method is run on a remote process.

This method is called prior to entering the training_loop.

If the Trainer has both a datasets dict and a preprocessor, the datasets dict contains a training dataset (denoted by the “train” key), and the preprocessor has not yet been fit, then the preprocessor is fit on the train dataset.

Then, all of the Trainer’s datasets will be transformed by the preprocessor.

The transformed datasets will be set back in the self.datasets attribute of the Trainer to be used when overriding training_loop.
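A minimal sketch of this behavior follows, using only the Preprocessor’s fit() and transform() methods; it is not the actual implementation, and it elides the check the real method performs to skip fitting a preprocessor that has already been fit.

    def preprocess_datasets(self):
        # Sketch only: fit on "train" (if present), then transform everything.
        if self.preprocessor and "train" in self.datasets:
            # The real code skips this step if the preprocessor was already fit.
            self.preprocessor.fit(self.datasets["train"])
        if self.preprocessor:
            # Transform every dataset and store the results back on the
            # Trainer for use when overriding training_loop.
            self.datasets = {
                name: self.preprocessor.transform(ds)
                for name, ds in self.datasets.items()
            }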

static make_train_loop(num_epochs: int, prefetch_blocks: int, batch_size: Optional[int])[source]#

Make a debug train loop that runs for the given number of epochs.
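For orientation, here is a hedged sketch of what the returned loop does with these parameters: each worker reads its “train” shard for num_epochs, pulling batches of batch_size with prefetch_blocks blocks prefetched. The session.get_dataset_shard and iter_batches calls are assumptions about the surrounding Ray AIR API, and the real loop additionally prints the ingest statistics mentioned above.

    from ray.air import session

    def make_train_loop(num_epochs, prefetch_blocks, batch_size):
        def train_loop_per_worker():
            shard = session.get_dataset_shard("train")
            for epoch in range(num_epochs):
                # Read every batch but do no training; the point is to
                # exercise and time the ingest path.
                for _batch in shard.iter_batches(
                    prefetch_blocks=prefetch_blocks, batch_size=batch_size
                ):
                    pass
        return train_loop_per_worker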