ray.data.Dataset.to_pandas
- Dataset.to_pandas(limit: int = 100000) → pandas.DataFrame
Convert this dataset into a single Pandas DataFrame.
This is only supported for datasets convertible to Arrow or Pandas records. An error is raised if the number of records exceeds the provided limit. Note that you can use .limit() on the dataset beforehand to truncate the dataset manually.
Note
This operation will trigger execution of the lazy transformations performed on this dataset.
Time complexity: O(dataset size)
- Parameters
limit – The maximum number of records to return. An error will be raised if the limit is exceeded.
- Returns
A Pandas DataFrame created from this dataset, containing at most limit records.