ray.data.grouped_dataset.GroupedDataset
- class ray.data.grouped_dataset.GroupedDataset(dataset: ray.data.dataset.Dataset[ray.data.block.T], key: Union[None, str, Callable[[ray.data.block.T], Any]])
Represents a grouped dataset created by calling Dataset.groupby(). The actual groupby is deferred until an aggregation is applied.
PublicAPI: This API is stable across Ray releases.
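The deferral described above can be illustrated with a minimal plain-Python sketch. It does not use Ray and is not Ray's implementation; the class LazyGroupedRows and its grouping_runs counter are hypothetical names introduced only to show that constructing the grouped object does no grouping work until an aggregation such as count() is called.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List


class LazyGroupedRows:
    """Hypothetical stand-in for a grouped dataset; not Ray's implementation."""

    def __init__(self, rows: Iterable[Any], key: Callable[[Any], Any]):
        # Only record the inputs here; no grouping work happens yet.
        self._rows = list(rows)
        self._key = key
        self.grouping_runs = 0  # how many times grouping actually ran

    def _group(self) -> Dict[Any, List[Any]]:
        # The real grouping pass, run only when an aggregation asks for it.
        self.grouping_runs += 1
        groups: Dict[Any, List[Any]] = defaultdict(list)
        for row in self._rows:
            groups[self._key(row)].append(row)
        return groups

    def count(self) -> Dict[Any, int]:
        # Applying an aggregation is what triggers the grouping.
        return {k: len(v) for k, v in self._group().items()}


grouped = LazyGroupedRows(range(10), key=lambda x: x % 3)
assert grouped.grouping_runs == 0  # nothing has been grouped so far
counts = grouped.count()           # grouping happens here
```

In this sketch the key is a callable, matching one of the key types in the class signature (None, str, or Callable); how Ray handles each key type is not shown here.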
- __init__(dataset: ray.data.dataset.Dataset[ray.data.block.T], key: Union[None, str, Callable[[ray.data.block.T], Any]])
Construct a dataset grouped by key (internal API).
The constructor is not part of the GroupedDataset API. Use the Dataset.groupby() method to construct one.
Methods
__init__(dataset, key): Construct a dataset grouped by key (internal API).
aggregate(*aggs): Implements an accumulator-based aggregation.
count(): Compute count aggregation.
map_groups(fn, *[, compute, batch_format]): Apply the given function to each group of records of this dataset.
max([on, ignore_nulls]): Compute grouped max aggregation.
mean([on, ignore_nulls]): Compute grouped mean aggregation.
min([on, ignore_nulls]): Compute grouped min aggregation.
std([on, ddof, ignore_nulls]): Compute grouped standard deviation aggregation.
sum([on, ignore_nulls]): Compute grouped sum aggregation.
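The aggregate entry above describes an accumulator-based aggregation. As a rough illustration of that idea, the plain-Python sketch below computes a grouped mean from four small steps: init, accumulate, merge, and finalize. This four-step shape, the helper names, and the representation of blocks as lists are all assumptions made for illustration; the sketch does not call Ray and should not be read as how aggregate() is implemented.

```python
from typing import Any, Callable, Dict, List, Tuple

Acc = Tuple[float, int]  # (running sum, running count)


def init() -> Acc:
    # Empty accumulator for a key that has not been seen yet.
    return (0.0, 0)


def accumulate(acc: Acc, value: float) -> Acc:
    # Fold one row into the accumulator.
    return (acc[0] + value, acc[1] + 1)


def merge(a: Acc, b: Acc) -> Acc:
    # Combine two partial accumulators computed on different blocks.
    return (a[0] + b[0], a[1] + b[1])


def finalize(acc: Acc) -> float:
    # Turn the accumulator into the final result (the mean).
    return acc[0] / acc[1]


def partial_aggregate(block: List[float], key: Callable[[float], Any]) -> Dict[Any, Acc]:
    # Per-block pass: one accumulator per key.
    partial: Dict[Any, Acc] = {}
    for row in block:
        k = key(row)
        partial[k] = accumulate(partial.get(k, init()), row)
    return partial


def grouped_mean(blocks: List[List[float]], key: Callable[[float], Any]) -> Dict[Any, float]:
    # Merge the per-block accumulators, then finalize each key.
    merged: Dict[Any, Acc] = {}
    for block in blocks:
        for k, acc in partial_aggregate(block, key).items():
            merged[k] = merge(merged.get(k, init()), acc)
    return {k: finalize(acc) for k, acc in merged.items()}


blocks = [[1, 2, 3, 4], [5, 6, 7, 8]]  # two hypothetical data blocks
means = grouped_mean(blocks, key=lambda x: x % 2)
```

Keeping the accumulator as a (sum, count) pair rather than a running mean is what makes merge a simple element-wise addition, so partial results from separate blocks can be combined in any order.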