ray.data.Dataset.to_pandas
- Dataset.to_pandas(limit: int = None) -> pandas.DataFrame
Convert this Dataset to a single pandas DataFrame.
This method errors if the number of rows exceeds the provided limit. To truncate the dataset beforehand, call limit().
Examples
>>> import ray
>>> ds = ray.data.from_items([{"a": i} for i in range(3)])
>>> ds.to_pandas()
   a
0  0
1  1
2  2
Note
This operation will trigger execution of the lazy transformations performed on this dataset.
Time complexity: O(dataset size)
- Parameters:
limit – The maximum number of rows to return. An error is raised if the dataset has more rows than this limit. Defaults to None, which means no limit.
- Returns:
A pandas DataFrame containing every row of this dataset, which is at most limit rows when limit is set.
- Raises:
ValueError – if the number of rows in the Dataset exceeds limit.