ray.data.datasource.Reader

class ray.data.datasource.Reader

Bases: object

A bound read operation for a datasource.

This is a stateful class so that reads can be prepared in multiple stages. For example, it is useful for Datasets to know the in-memory size of the read prior to executing it.

PublicAPI: This API is stable across Ray releases.

Methods

__init__()

estimate_inmemory_data_size()

Return an estimate of the in-memory data size, or None if unknown.

get_read_tasks(parallelism)

Execute the read and return read tasks.
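
The two-stage pattern described above can be sketched without Ray installed. The example below is a minimal illustration, not Ray's implementation: `RangeReader`, `_read_range`, and `BYTES_PER_VALUE` are hypothetical names, the class does not subclass the real `Reader`, and plain zero-argument callables stand in for Ray's read task objects. It only mirrors the two documented method names to show how a caller might inspect the size estimate before asking for read tasks.

```python
from functools import partial
from typing import Callable, List, Optional


def _read_range(start: int, end: int) -> List[int]:
    """Produce one chunk of data; stands in for the work a read task does."""
    return list(range(start, end))


class RangeReader:
    """Hypothetical stand-in mirroring the Reader method names (no Ray import)."""

    BYTES_PER_VALUE = 8  # assumed size per value, used only for the estimate

    def __init__(self, n: int):
        # State is bound at construction time, before any data is read.
        self._n = n

    def estimate_inmemory_data_size(self) -> Optional[int]:
        # Stage 1: report an estimate without reading anything.
        return self._n * self.BYTES_PER_VALUE

    def get_read_tasks(self, parallelism: int) -> List[Callable[[], List[int]]]:
        # Stage 2: split the read into at most `parallelism` independent tasks.
        num_tasks = max(1, min(parallelism, self._n))
        tasks = []
        for i in range(num_tasks):
            start = i * self._n // num_tasks
            end = (i + 1) * self._n // num_tasks
            # partial() binds start/end now, avoiding late-binding closure bugs.
            tasks.append(partial(_read_range, start, end))
        return tasks


reader = RangeReader(10)
size_estimate = reader.estimate_inmemory_data_size()  # known before reading
tasks = reader.get_read_tasks(parallelism=3)          # nothing is read yet
rows = [value for task in tasks for value in task()]  # tasks run here
```

In this sketch the data is only materialized when each task is called, which is why the size estimate can inform planning decisions (such as how much parallelism to request) beforehand.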