ray.data.from_arrow

ray.data.from_arrow(tables: Union[pyarrow.Table, bytes, List[Union[pyarrow.Table, bytes]]]) → ray.data.dataset.MaterializedDataset

Create a Dataset from one or more PyArrow tables.

Examples

>>> import pyarrow as pa
>>> import ray
>>> table = pa.table({"x": [1]})
>>> ray.data.from_arrow(table)
MaterializedDataset(num_blocks=1, num_rows=1, schema={x: int64})

Create a Ray Dataset from a list of PyArrow tables.

>>> ray.data.from_arrow([table, table])
MaterializedDataset(num_blocks=2, num_rows=2, schema={x: int64})

Parameters

tables – A PyArrow table or its streaming format in bytes, or a list of either.

Returns

A MaterializedDataset holding the data from the PyArrow tables.