ray.data.Dataset.aggregate
ray.data.Dataset.aggregate#
- Dataset.aggregate(*aggs: ray.data.aggregate._aggregate.AggregateFn) Union[Any, Dict[str, Any]] [source]#
Aggregate values using one or more functions.
Use this method to compute metrics like the product of a column.
Note
This operation will trigger execution of the lazy transformations performed on this dataset.
Examples
import ray from ray.data.aggregate import AggregateFn ds = ray.data.from_items([{"number": i} for i in range(1, 10)]) aggregation = AggregateFn( init=lambda column: 1, accumulate_row=lambda a, row: a * row["number"], merge = lambda a1, a2: a1 + a2, name="prod" ) print(ds.aggregate(aggregation))
{'prod': 45}
Time complexity: O(dataset size / parallelism)
- Parameters
*aggs –
Aggregations
to perform.- Returns
A
dict
where each each value is an aggregation for a given column.