ray.data.from_arrow

ray.data.from_arrow(tables: pyarrow.Table | bytes | List[pyarrow.Table | bytes]) → MaterializedDataset

Create a Dataset from a PyArrow table or a list of PyArrow tables.

Examples

>>> import pyarrow as pa
>>> import ray
>>> table = pa.table({"x": [1]})
>>> ray.data.from_arrow(table)
MaterializedDataset(num_blocks=1, num_rows=1, schema={x: int64})

Create a Ray Dataset from a list of PyArrow tables.

>>> ray.data.from_arrow([table, table])
MaterializedDataset(num_blocks=2, num_rows=2, schema={x: int64})

Parameters:

tables – A PyArrow table or its streaming format in bytes, or a list of either.

Returns:

Dataset holding data from the PyArrow tables.