GroupedDataset API#

GroupedDataset objects are returned by calling Dataset.groupby().

Constructor#

grouped_dataset.GroupedDataset(dataset, key)

Represents a grouped dataset created by calling Dataset.groupby().

Computations / Descriptive Stats#

grouped_dataset.GroupedDataset.count()

Compute count aggregation.

grouped_dataset.GroupedDataset.sum([on, ...])

Compute grouped sum aggregation.

grouped_dataset.GroupedDataset.min([on, ...])

Compute grouped min aggregation.

grouped_dataset.GroupedDataset.max([on, ...])

Compute grouped max aggregation.

grouped_dataset.GroupedDataset.mean([on, ...])

Compute grouped mean aggregation.

grouped_dataset.GroupedDataset.std([on, ...])

Compute grouped standard deviation aggregation.
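The grouped statistics above can be pictured with a small plain-Python sketch. It does not call Ray. The sample rows, the column names `A` and `B`, and the helpers `group_rows` and `grouped_stats` are all hypothetical, chosen only to show what "one result row per group" means for each aggregation.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample rows: "A" plays the role of the groupby key,
# "B" the column passed as `on`.
rows = [{"A": x % 3, "B": x} for x in range(9)]


def group_rows(rows, key):
    """Bucket rows by the value of `key` (a stand-in for Dataset.groupby(key))."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def grouped_stats(rows, key, on):
    """Return one summary row per group, ordered by group key."""
    out = []
    for k, members in sorted(group_rows(rows, key).items()):
        values = [m[on] for m in members]
        out.append(
            {
                key: k,
                "count": len(values),
                "sum": sum(values),
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
                "std": stdev(values),  # sample standard deviation (n - 1 divisor)
            }
        )
    return out


stats = grouped_stats(rows, "A", "B")
for row in stats:
    print(row)
```

Each group here holds three values (for example, group 0 holds 0, 3, and 6), so every aggregation collapses three input rows into a single output row keyed by `A`.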

Function Application#

grouped_dataset.GroupedDataset.aggregate(*aggs)

Implements an accumulator-based aggregation.
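As a rough illustration of what "accumulator-based" means, the sketch below expresses a mean as four small functions over a running state. It is plain Python and does not use Ray. The names `init`, `accumulate`, `merge`, `finalize`, and `run_aggregation`, and the `blocks` data, are assumptions made for this example, not the library's signatures.

```python
from functools import reduce

# Hypothetical partitions of one group's values, standing in for data blocks.
blocks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]


def init():
    """Start state: (running_sum, running_count)."""
    return (0.0, 0)


def accumulate(state, value):
    """Fold one value into the state."""
    total, count = state
    return (total + value, count + 1)


def merge(left, right):
    """Combine two partial states computed on different blocks."""
    return (left[0] + right[0], left[1] + right[1])


def finalize(state):
    """Turn the final state into the reported result."""
    total, count = state
    return total / count if count else None


def run_aggregation(blocks):
    # Each block is folded independently, then the partial states are merged.
    partials = [reduce(accumulate, block, init()) for block in blocks]
    return finalize(reduce(merge, partials, init()))


result = run_aggregation(blocks)
print(result)  # 3.5
```

Because `merge` is associative and the state is small, partial results can be computed per block and combined in any order, which is what makes this shape suitable for distributed aggregation.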

grouped_dataset.GroupedDataset.map_groups(fn, *)

Apply the given function to each group of records of this dataset.
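The idea of applying a function per group can be sketched in plain Python as follows. This is not Ray code. The `map_groups` helper below is a hypothetical stand-in with its own signature, and `normalize` and the `team`/`score` rows are invented for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows.
rows = [
    {"team": "a", "score": 3},
    {"team": "b", "score": 10},
    {"team": "a", "score": 5},
    {"team": "b", "score": 30},
]


def map_groups(rows, key, fn):
    """Sort by key, split into groups, call fn on each group, concatenate outputs."""
    ordered = sorted(rows, key=itemgetter(key))
    out = []
    for _, members in groupby(ordered, key=itemgetter(key)):
        out.extend(fn(list(members)))
    return out


def normalize(group):
    """Hypothetical per-group function: scale scores by the group's maximum."""
    top = max(r["score"] for r in group)
    return [{**r, "score": r["score"] / top} for r in group]


normalized = map_groups(rows, "team", normalize)
for row in normalized:
    print(row)
```

Unlike the aggregations above, the per-group function sees all records of a group at once and may return any number of rows, so the output need not have one row per group.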

Aggregate Function#

aggregate.AggregateFn(init, ...)

Defines a custom, user-specified aggregation. PublicAPI: This API is stable across Ray releases.

aggregate.Count()

Defines count aggregation.

aggregate.Sum([on, ignore_nulls])

Defines sum aggregation.

aggregate.Max([on, ignore_nulls])

Defines max aggregation.

aggregate.Mean([on, ignore_nulls])

Defines mean aggregation.

aggregate.Std([on, ddof, ignore_nulls])

Defines standard deviation aggregation.
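To show how `ddof` and `ignore_nulls` could affect a standard deviation, here is a plain-Python sketch. It does not use Ray. The function `std`, its default values, and its null-handling rules are assumptions for this example and may differ from the library's behavior.

```python
import math


def std(values, ddof=1, ignore_nulls=True):
    """Standard deviation with `ddof` delta degrees of freedom.

    Assumed null handling for this sketch: when ignore_nulls is True,
    None values are skipped; otherwise any None makes the result None.
    """
    if ignore_nulls:
        values = [v for v in values if v is not None]
    elif any(v is None for v in values):
        return None
    n = len(values)
    if n - ddof <= 0:
        return None  # not enough values for the requested divisor
    mu = sum(values) / n
    return math.sqrt(sum((v - mu) ** 2 for v in values) / (n - ddof))


data = [2, 4, 4, 4, 5, 5, 7, 9]
population_std = std(data, ddof=0)  # divisor n
sample_std = std(data, ddof=1)      # divisor n - 1
print(population_std, sample_std)
```

With `ddof=0` the divisor is the number of values (population standard deviation), and with `ddof=1` it is one less (sample standard deviation), so the second result is slightly larger.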

aggregate.AbsMax([on, ignore_nulls])

Defines absolute max aggregation.
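An absolute max is the largest magnitude among the values, regardless of sign. A minimal plain-Python sketch follows. It does not call Ray, and the `abs_max` helper and its null handling are assumptions made for illustration.

```python
def abs_max(values, ignore_nulls=True):
    """Largest absolute value in `values`.

    Assumed null handling for this sketch: when ignore_nulls is True,
    None values are skipped; otherwise any None makes the result None.
    """
    if ignore_nulls:
        values = [v for v in values if v is not None]
    elif any(v is None for v in values):
        return None
    # default=None covers the case where no values remain.
    return max((abs(v) for v in values), default=None)


print(abs_max([-7, 3, None, 5]))  # 7
```

Note that the result is the magnitude (7 here), not the original signed value (-7).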