ray.data.Dataset.to_numpy_refs
- Dataset.to_numpy_refs(*, column: str | None = None) -> List[ObjectRef[numpy.ndarray]]
Converts this Dataset into a distributed set of NumPy ndarrays or a dictionary of NumPy ndarrays.

This is only supported for datasets convertible to NumPy ndarrays. This function induces a copy of the data. For zero-copy access to the underlying data, consider using Dataset.to_arrow_refs() or Dataset.iter_internal_ref_bundles().

Examples
>>> import ray
>>> ds = ray.data.range(10, override_num_blocks=2)
>>> refs = ds.to_numpy_refs()
>>> len(refs)
2
Time complexity: O(dataset size / parallelism)
- Parameters:
column – The name of the column to convert to numpy. If None, all columns are used. If multiple columns are specified, each returned future represents a dict of ndarrays. Defaults to None.
- Returns:
A list of remote NumPy ndarrays created from this dataset.
DeveloperAPI: This API may change across minor Ray releases.