ray.data.from_arrow_refs

ray.data.from_arrow_refs(tables: ObjectRef[pyarrow.Table | bytes] | List[ObjectRef[pyarrow.Table | bytes]]) -> MaterializedDataset

Create a Dataset from a Ray object reference, or a list of Ray object references, to PyArrow tables.

Examples

>>> import pyarrow as pa
>>> import ray
>>> table_ref = ray.put(pa.table({"x": [1]}))
>>> ray.data.from_arrow_refs(table_ref)  
shape: (1, 1)
╭───────╮
│ x     │
│ ---   │
│ int64 │
╞═══════╡
│ 1     │
╰───────╯
(Showing 1 of 1 rows)

Create a Ray Dataset from a list of PyArrow table references:

>>> ray.data.from_arrow_refs([table_ref, table_ref])  
shape: (2, 1)
╭───────╮
│ x     │
│ ---   │
│ int64 │
╞═══════╡
│ 1     │
│ 1     │
╰───────╯
(Showing 2 of 2 rows)

Parameters:

tables – A Ray object reference to an Arrow table, or a list of such references. Each reference may point either to a pyarrow.Table or to a table serialized as bytes in the Arrow streaming format.

Returns:

Dataset holding data read from the tables.

DeveloperAPI: This API may change across minor Ray releases.