ray.data.Dataset.to_arrow_refs#
- Dataset.to_arrow_refs() List[ObjectRef[pyarrow.Table]] [source]#
Convert this
Dataset
into a distributed set of PyArrow tables.One PyArrow table is created for each block in this Dataset.
This method is only supported for datasets convertible to PyArrow tables. This function is zero-copy if the existing data is already in PyArrow format. Otherwise, the data is converted to PyArrow format.
Examples
>>> import ray >>> ds = ray.data.range(10, override_num_blocks=2) >>> refs = ds.to_arrow_refs() >>> len(refs) 2
Note
This operation will trigger execution of the lazy transformations performed on this dataset.
Time complexity: O(1) unless conversion is required.
- Returns:
A list of remote PyArrow tables created from this dataset.
DeveloperAPI: This API may change across minor Ray releases.