ray.data.Datasource

class ray.data.Datasource

Interface for defining a custom Dataset datasource.

To read a datasource into a dataset, use read_datasource().

Methods

__init__

create_reader

Deprecated: Implement Datasource.get_read_tasks() and Datasource.estimate_inmemory_data_size() instead.

estimate_inmemory_data_size

Return an estimate of the in-memory data size, or None if unknown.

get_name

Return a human-readable name for this datasource.

get_read_tasks

Execute the read and return read tasks.

prepare_read

Deprecated: Implement Datasource.get_read_tasks() and Datasource.estimate_inmemory_data_size() instead.

Attributes

should_create_reader

supports_distributed_reads

If False, only launch read tasks on the driver's node.
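
Example

The following is a minimal plain-Python sketch of the pattern behind get_read_tasks() and estimate_inmemory_data_size(): the source is split into at most `parallelism` independent chunks, each wrapped in a zero-argument read function. It deliberately does not import Ray. `SketchReadTask` and `RangeSourceSketch` are hypothetical stand-ins invented for illustration, not Ray classes, and the 8-bytes-per-row figure is an assumption used only to make the size estimate concrete.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class SketchReadTask:
    """Hypothetical stand-in for a read task: a zero-argument reader plus a row count."""

    read_fn: Callable[[], Iterable[dict]]
    num_rows: int


class RangeSourceSketch:
    """Hypothetical datasource yielding rows {"id": 0} through {"id": n - 1}."""

    BYTES_PER_ROW = 8  # assumption: rough size of one integer row

    def __init__(self, n: int):
        self._n = n

    def get_name(self) -> str:
        # Human-readable name for this datasource.
        return "RangeSketch"

    @property
    def supports_distributed_reads(self) -> bool:
        # Each chunk is independent, so reads could run on any node.
        return True

    def estimate_inmemory_data_size(self) -> Optional[int]:
        # The size is cheap to estimate here; a source that cannot
        # estimate it would return None instead.
        return self._n * self.BYTES_PER_ROW

    def get_read_tasks(self, parallelism: int) -> List[SketchReadTask]:
        # Ceiling division gives the chunk length, so the range [0, n)
        # is covered by at most `parallelism` contiguous chunks.
        chunk = max(1, -(-self._n // max(1, parallelism)))
        tasks = []
        for start in range(0, self._n, chunk):
            end = min(start + chunk, self._n)

            # Bind start/end as defaults so each task keeps its own bounds.
            def read_fn(start=start, end=end):
                return [{"id": i} for i in range(start, end)]

            tasks.append(SketchReadTask(read_fn=read_fn, num_rows=end - start))
        return tasks


source = RangeSourceSketch(10)
tasks = source.get_read_tasks(parallelism=4)
rows = [row for task in tasks for row in task.read_fn()]
```

In Ray itself, a subclass of ray.data.Datasource would return Ray's own read task objects from get_read_tasks() and be passed to read_datasource(). The constructor arguments for those objects differ between Ray releases, so check the reference for the installed version before adapting this sketch.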