GroupedData API#

GroupedData objects are returned by calling Dataset.groupby().

Constructor#

grouped_data.GroupedData(dataset, key)

Represents a grouped dataset created by calling Dataset.groupby().

Computations / Descriptive Stats#

grouped_data.GroupedData.count()

Compute grouped count aggregation (the number of rows in each group).

grouped_data.GroupedData.sum([on, ignore_nulls])

Compute grouped sum aggregation.

grouped_data.GroupedData.min([on, ignore_nulls])

Compute grouped min aggregation.

grouped_data.GroupedData.max([on, ignore_nulls])

Compute grouped max aggregation.

grouped_data.GroupedData.mean([on, ignore_nulls])

Compute grouped mean aggregation.

grouped_data.GroupedData.std([on, ddof, ...])

Compute grouped standard deviation aggregation.
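The `ignore_nulls` and `ddof` parameters listed above are easier to reason about with a concrete model. The block below is a minimal plain-Python sketch of what grouped sum and grouped standard deviation conventionally compute; it is not Ray's implementation. The helper names (`grouped_values`, `grouped_sum`, `grouped_std`) and the sample rows are hypothetical, and the null handling shown (return `None` for a group that contains a null when `ignore_nulls=False`) is an assumption about the usual convention.

```python
import math
from collections import defaultdict


def grouped_values(rows, key, on):
    """Collect the `on` column of each row into a list per group key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[on])
    return dict(groups)


def grouped_sum(rows, key, on, ignore_nulls=True):
    """Sum per group. Assumed convention: with ignore_nulls=False,
    a single null makes the whole group's result null."""
    out = {}
    for k, values in grouped_values(rows, key, on).items():
        present = [v for v in values if v is not None]
        has_null = len(present) < len(values)
        if not present or (has_null and not ignore_nulls):
            out[k] = None
        else:
            out[k] = sum(present)
    return out


def grouped_std(rows, key, on, ddof=1):
    """Standard deviation per group with divisor N - ddof (nulls skipped)."""
    out = {}
    for k, values in grouped_values(rows, key, on).items():
        present = [v for v in values if v is not None]
        n = len(present)
        if n - ddof <= 0:
            out[k] = None  # not enough values for this ddof
            continue
        mean = sum(present) / n
        out[k] = math.sqrt(sum((v - mean) ** 2 for v in present) / (n - ddof))
    return out


# Hypothetical sample data: group "b" contains one null value.
rows = [
    {"key": "a", "value": 1.0},
    {"key": "a", "value": 3.0},
    {"key": "b", "value": 2.0},
    {"key": "b", "value": None},
]

print(grouped_sum(rows, "key", "value"))
print(grouped_sum(rows, "key", "value", ignore_nulls=False))
print(grouped_std(rows, "key", "value", ddof=1))
print(grouped_std(rows, "key", "value", ddof=0))
```

With `ddof=1` the divisor is N - 1 (sample standard deviation), and with `ddof=0` it is N (population standard deviation), which is why a single-value group has no defined result in the first case.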

Function Application#

grouped_data.GroupedData.aggregate(*aggs)

Apply one or more accumulator-based aggregations to each group.
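To make "accumulator-based" concrete, here is a minimal plain-Python sketch of the general pattern, using a grouped mean as the example. It does not call Ray. The function names (`init`, `accumulate_row`, `merge`, `finalize`), the two-partition setup, and the sample rows are all assumptions chosen for illustration: each partition builds a partial accumulator per key, partial accumulators are merged across partitions, and a final step turns each accumulator into the result.

```python
def init(key):
    """Fresh accumulator for a group: (running sum, row count)."""
    return (0.0, 0)


def accumulate_row(acc, row):
    """Fold one row into the accumulator."""
    return (acc[0] + row["value"], acc[1] + 1)


def merge(a, b):
    """Combine two partial accumulators for the same key."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(acc):
    """Turn the accumulator into the final value (the mean)."""
    return acc[0] / acc[1] if acc[1] else None


def aggregate_partition(rows, key):
    """Build one accumulator per key from a single partition."""
    accs = {}
    for row in rows:
        k = row[key]
        accs[k] = accumulate_row(accs.get(k, init(k)), row)
    return accs


def merge_partials(partials):
    """Merge per-partition accumulators, then finalize each group."""
    merged = {}
    for partial in partials:
        for k, acc in partial.items():
            merged[k] = merge(merged[k], acc) if k in merged else acc
    return {k: finalize(acc) for k, acc in merged.items()}


# Hypothetical data split across two partitions.
partition_1 = [{"key": "a", "value": 1.0}, {"key": "b", "value": 10.0}]
partition_2 = [
    {"key": "a", "value": 3.0},
    {"key": "b", "value": 20.0},
    {"key": "a", "value": 5.0},
]

partials = [aggregate_partition(p, "key") for p in (partition_1, partition_2)]
means = merge_partials(partials)
print(means)  # {'a': 3.0, 'b': 15.0}
```

The accumulator carries a (sum, count) pair rather than a running mean because pairs can be merged exactly across partitions, while two means cannot be combined without knowing how many rows each one covers.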

grouped_data.GroupedData.map_groups(fn, *[, ...])

Apply the given function to each group of records of this dataset.
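As a rough mental model for applying a function per group, the sketch below collects rows by key, calls a user function once per group, and concatenates the outputs. It is plain Python and not Ray's implementation. The standalone `map_groups` helper, the `add_share` function, and the sample rows are hypothetical, and representing a group as a list of dicts is a simplifying assumption; the batch format that the real method passes to `fn` may differ.

```python
from collections import defaultdict


def map_groups(rows, key, fn):
    """Call fn once per group and concatenate the returned rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    out = []
    for k in sorted(groups):  # sorted only to make the output deterministic
        out.extend(fn(groups[k]))
    return out


def add_share(group):
    """Example per-group function: each row's share of its group's total."""
    total = sum(r["value"] for r in group)
    return [{**r, "share": r["value"] / total} for r in group]


# Hypothetical sample data.
rows = [
    {"key": "a", "value": 1.0},
    {"key": "b", "value": 2.0},
    {"key": "a", "value": 3.0},
]

result = map_groups(rows, "key", add_share)
for row in result:
    print(row)
```

Unlike an aggregation, the per-group function sees every row of the group at once and may return any number of rows, which suits operations such as per-group normalization.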

Aggregate Function#

aggregate.AggregateFn(init, ...)

Defines a custom accumulator-style aggregate function. PublicAPI: This API is stable across Ray releases.

aggregate.Count()

Defines count aggregation.

aggregate.Sum([on, ignore_nulls, alias_name])

Defines sum aggregation.

aggregate.Max([on, ignore_nulls, alias_name])

Defines max aggregation.

aggregate.Mean([on, ignore_nulls, alias_name])

Defines mean aggregation.

aggregate.Std([on, ddof, ignore_nulls, ...])

Defines standard deviation aggregation.

aggregate.AbsMax([on, ignore_nulls, alias_name])

Defines absolute max aggregation (the maximum of the absolute values).