ray.data.Datasource
class ray.data.Datasource
Bases: object
Interface for defining a custom ray.data.Dataset datasource.

To read a datasource into a dataset, use ray.data.read_datasource(). To write to a writable datasource, use Dataset.write_datasource().

See RangeDatasource and DummyOutputDatasource for examples of how to implement readable and writable datasources.

Datasource instances must be serializable, since create_reader() and write() are called in remote tasks.

For an example of subclassing Datasource, read Implementing a Custom Datasource.

PublicAPI: This API is stable across Ray releases.
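The serializability requirement is easy to break by storing a live resource (a lock, an open connection, a file handle) on the instance. The following is a minimal sketch of the pitfall and one possible workaround, not Ray's own code: EagerSource, LazySource, and is_picklable are hypothetical names that do not subclass Datasource, and the standard pickle module is used only as an approximation of the serializer Ray applies before shipping objects to remote tasks.

```python
import pickle
import threading


class EagerSource:
    """Hypothetical stand-in (not a Ray class): creates a lock in __init__."""

    def __init__(self, path):
        self.path = path
        self.lock = threading.Lock()  # lock objects cannot be pickled


class LazySource:
    """Hypothetical stand-in: keeps plain config and builds the lock on demand."""

    def __init__(self, path):
        self.path = path
        self._lock = None

    def lock(self):
        # Created on first use, so each process builds its own lock.
        if self._lock is None:
            self._lock = threading.Lock()
        return self._lock

    def __getstate__(self):
        # Drop the live lock so only plain config is serialized.
        state = self.__dict__.copy()
        state["_lock"] = None
        return state


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
```

Running is_picklable() on an instance before passing it to a read or write call is a cheap local check. The same pattern (store configuration, create resources lazily) should carry over to a real Datasource subclass, though that is an assumption to verify against the Ray version in use.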
Methods
__init__()
create_reader(**read_args)                        Return a Reader for the given read arguments.
do_write(blocks, metadata, ray_remote_args, ...)  Launch Ray tasks for writing blocks out to the datasource.
get_name()                                        Return a human-readable name for this datasource.
on_write_complete(write_results, **kwargs)        Callback for when a write job completes.
on_write_failed(write_results, error, **kwargs)   Callback for when a write job fails.
prepare_read(parallelism, **read_args)            Deprecated: Please implement create_reader() instead.
write(blocks, **write_args)                       Write blocks out to the datasource.