ray.data.grouped_data.GroupedData

class ray.data.grouped_data.GroupedData(dataset: ray.data.dataset.Dataset, key: str)

Bases: object

Represents a grouped dataset created by calling Dataset.groupby().

The actual groupby is deferred until an aggregation is applied.

PublicAPI: This API is stable across Ray releases.

Methods

__init__(dataset, key)

Construct a dataset grouped by key (internal API).

aggregate(*aggs)

Implements an accumulator-based aggregation.
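To make "accumulator-based" concrete, here is a minimal pure-Python sketch of the general pattern. It does not use Ray and is not Ray's implementation. The function grouped_mean and the helper names init, accumulate_row, merge, and finalize are illustrative assumptions: each partition builds one accumulator per key, the partial accumulators are merged across partitions, and a final step turns each accumulator into a result.

```python
from collections import defaultdict


def grouped_mean(partitions, key, on):
    """Mean of `on` per `key`, computed with per-partition accumulators.

    Illustrative sketch only; names and structure are assumptions,
    not Ray's internals.
    """

    def init():
        # Accumulator state: (running sum, row count).
        return (0.0, 0)

    def accumulate_row(acc, row):
        # Fold a single row into the accumulator.
        return (acc[0] + row[on], acc[1] + 1)

    def merge(a, b):
        # Combine two partial accumulators for the same key.
        return (a[0] + b[0], a[1] + b[1])

    def finalize(acc):
        # Turn the accumulator into the final value.
        return acc[0] / acc[1]

    # Phase 1: accumulate independently inside each partition.
    partials = []
    for part in partitions:
        accs = defaultdict(init)
        for row in part:
            accs[row[key]] = accumulate_row(accs[row[key]], row)
        partials.append(accs)

    # Phase 2: merge partial accumulators across partitions.
    merged = {}
    for accs in partials:
        for k, acc in accs.items():
            merged[k] = merge(merged[k], acc) if k in merged else acc

    # Phase 3: finalize each group's accumulator.
    return {k: finalize(acc) for k, acc in merged.items()}


partitions = [
    [{"k": "a", "v": 1}, {"k": "b", "v": 10}],
    [{"k": "a", "v": 3}, {"k": "b", "v": 20}, {"k": "b", "v": 30}],
]
result = grouped_mean(partitions, key="k", on="v")
print(result)
```

Because merge only needs the small (sum, count) state, no partition has to see another partition's rows, which is what makes this style of aggregation suitable for distributed data.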

count()

Compute grouped count aggregation.

map_groups(fn, *[, compute, batch_format])

Apply the given function to each group of records of this dataset.
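The idea of applying a function to each group can be sketched in plain Python. The helper map_groups_sketch and the function center below are hypothetical names for illustration. They do not call Ray, and they ignore the compute and batch_format options listed in the signature above.

```python
from itertools import groupby
from operator import itemgetter


def map_groups_sketch(rows, key, fn):
    """Apply `fn` to each group of rows sharing `key`; concatenate results.

    Illustrative stand-in for the concept, not Ray's implementation.
    """
    # itertools.groupby only groups adjacent rows, so sort by key first.
    ordered = sorted(rows, key=itemgetter(key))
    out = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        out.extend(fn(list(group)))
    return out


def center(group):
    # Example per-group function: subtract the group mean from "v".
    mean = sum(r["v"] for r in group) / len(group)
    return [{**r, "v": r["v"] - mean} for r in group]


rows = [{"k": "a", "v": 1}, {"k": "b", "v": 10}, {"k": "a", "v": 3}]
centered = map_groups_sketch(rows, "k", center)
print(centered)
```

Unlike the aggregations, the per-group function sees every record of its group at once, so it can return any number of output records per group.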

max([on, ignore_nulls])

Compute grouped max aggregation.

mean([on, ignore_nulls])

Compute grouped mean aggregation.

min([on, ignore_nulls])

Compute grouped min aggregation.

std([on, ddof, ignore_nulls])

Compute grouped standard deviation aggregation.
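The ddof ("delta degrees of freedom") parameter conventionally sets the divisor of the variance to N - ddof, where N is the number of values in the group. The pure-Python sketch below illustrates that convention. grouped_std is a hypothetical helper, and both its default of ddof=1 and its choice to return None for groups that are too small are assumptions of this sketch, not statements about Ray's behavior.

```python
import math
from collections import defaultdict


def grouped_std(rows, key, on, ddof=1):
    """Standard deviation of `on` per `key`, with divisor N - ddof.

    Illustrative sketch; default and edge-case handling are assumptions.
    """
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r[on])

    out = {}
    for k, vals in groups.items():
        n = len(vals)
        mean = sum(vals) / n
        sum_sq = sum((v - mean) ** 2 for v in vals)
        # Undefined when the divisor would be zero or negative.
        out[k] = math.sqrt(sum_sq / (n - ddof)) if n > ddof else None
    return out


rows = [{"k": "a", "v": v} for v in (2, 4, 4, 4, 5, 5, 7, 9)]
population = grouped_std(rows, "k", "v", ddof=0)  # divisor N
sample = grouped_std(rows, "k", "v", ddof=1)      # divisor N - 1
print(population, sample)
```

With ddof=0 the result is the population standard deviation; with ddof=1 it is the sample standard deviation, which is slightly larger for the same data.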

sum([on, ignore_nulls])

Compute grouped sum aggregation.
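Several of the methods above accept ignore_nulls. The sketch below illustrates one plausible reading of that flag, stated here as an assumption rather than taken from this page: when it is true, null values are skipped; when it is false, any null in a group makes that group's result null. grouped_max is a hypothetical pure-Python helper, not a Ray call.

```python
def grouped_max(rows, key, on, ignore_nulls=True):
    """Max of `on` per `key`, with an assumed null-handling policy.

    Assumption: ignore_nulls=True skips None values;
    ignore_nulls=False propagates None if any value in the group is None.
    """
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r[on])

    out = {}
    for k, vals in groups.items():
        present = [v for v in vals if v is not None]
        if not ignore_nulls and len(present) < len(vals):
            # A null was encountered and must not be ignored.
            out[k] = None
        else:
            # All-null groups have no maximum.
            out[k] = max(present) if present else None
    return out


rows = [
    {"k": "a", "v": 1},
    {"k": "a", "v": None},
    {"k": "a", "v": 3},
    {"k": "b", "v": 5},
]
skipped = grouped_max(rows, "k", "v", ignore_nulls=True)
propagated = grouped_max(rows, "k", "v", ignore_nulls=False)
print(skipped, propagated)
```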