ray.data.Dataset.materialize

Dataset.materialize() -> MaterializedDataset

Execute and materialize this dataset into object store memory.

Note

This operation will trigger execution of the lazy transformations performed on this dataset.

Use this method to read all blocks into memory. By default, a Dataset doesn’t read blocks from the datasource until the first transform.

Note that this does not mutate the original Dataset. Only the blocks of the returned MaterializedDataset are pinned in object store memory.

Examples

>>> import ray
>>> ds = ray.data.range(10)
>>> materialized_ds = ds.materialize()
>>> materialized_ds
MaterializedDataset(num_blocks=..., num_rows=10, schema={id: int64})

Returns:

A MaterializedDataset holding the materialized data blocks.