ray.train.DataConfig.__init__

DataConfig.__init__(datasets_to_split: Literal['all'] | List[str] = 'all', execution_options: ExecutionOptions | None = None, enable_shard_locality: bool = True)

Construct a DataConfig.

Parameters:
  • datasets_to_split – The datasets to split among workers: either "all" or a list of dataset names. Defaults to "all", which splits every dataset.

  • execution_options – The execution options to pass to Ray Data. Defaults to options optimized for data ingest. To override them, start from the result of DataConfig.default_ingest_options() and modify it.

  • enable_shard_locality – If true, sharding the datasets across Train workers takes data locality into account to minimize cross-node data transfer. Defaults to true.