ray.data.Dataset.to_numpy_refs
- Dataset.to_numpy_refs(*, column: Optional[str] = None) -> List[ray.types.ObjectRef[numpy.ndarray]]
Converts this Dataset into a distributed set of NumPy ndarrays or dictionary of NumPy ndarrays.

This is only supported for datasets convertible to NumPy ndarrays. This function induces a copy of the data. For zero-copy access to the underlying data, consider using Dataset.to_arrow() or Dataset.get_internal_block_refs().

Examples
>>> import ray
>>> ds = ray.data.range(10, parallelism=2)
>>> refs = ds.to_numpy_refs()
>>> len(refs)
2
Time complexity: O(dataset size / parallelism)
- Parameters
column – The name of the column to convert to numpy. If None, all columns are used. If multiple columns are specified, each returned future represents a dict of ndarrays. Defaults to None.
- Returns
A list of remote NumPy ndarrays created from this dataset.
DeveloperAPI: This API may change across minor Ray releases.
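
The returned references can be resolved on the driver with ray.get(). The following is a minimal sketch rather than part of the documented API: block_num_rows, total_rows, and demo are hypothetical helper names introduced here for illustration. It assumes each resolved block is either a single ndarray or a dict of column ndarrays, as the description above states; which of the two you get may depend on the Ray version and the dataset schema.

```python
from typing import Dict, List, Union

import numpy as np

# Assumption: a resolved block is one ndarray or a dict of column ndarrays.
Block = Union[np.ndarray, Dict[str, np.ndarray]]


def block_num_rows(block: Block) -> int:
    """Return the number of rows in one resolved block (hypothetical helper)."""
    if isinstance(block, dict):
        if not block:
            return 0
        # All columns in a block are assumed to have the same length.
        return len(next(iter(block.values())))
    return len(block)


def total_rows(blocks: List[Block]) -> int:
    """Sum the row counts over all resolved blocks (hypothetical helper)."""
    return sum(block_num_rows(b) for b in blocks)


def demo() -> None:
    """Resolve the refs from to_numpy_refs() and print the total row count."""
    # Imported lazily so the helpers above can be used without Ray installed.
    import ray

    ds = ray.data.range(10)
    refs = ds.to_numpy_refs()  # one ObjectRef per block
    blocks = ray.get(refs)     # copies the ndarrays to the driver process
    print(total_rows(blocks))
```

Calling demo() in an environment with Ray installed resolves every block on the driver, so for large datasets it may be preferable to pass the refs to remote tasks instead of calling ray.get() on the whole list.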