DataConfig.__init__(datasets_to_split: Literal['all'] | List[str] = 'all', execution_options: ExecutionOptions | None = None)

Construct a DataConfig.

  • datasets_to_split – Specifies which datasets should be split among workers. Can be set to “all” or a list of dataset names. Defaults to “all”, i.e. split all datasets.

  • execution_options – The execution options to pass to Ray Data. By default, the options are optimized for data ingest. When overriding them, start from the object returned by DataConfig.default_ingest_options() and modify it, rather than constructing options from scratch.