ray.data.aggregate.Unique
- class ray.data.aggregate.Unique(on: str | None = None, ignore_nulls: bool = True, alias_name: str | None = None, encode_lists: bool = False)
Bases: AggregateFnV2[Set[Any], List[Any]]
Defines unique aggregation.
Example
    import ray
    from ray.data.aggregate import Unique

    ds = ray.data.range(100)
    ds = ds.add_column("group_key", lambda x: x % 3)

    # Calculating the unique values per group:
    result = ds.groupby("group_key").aggregate(Unique(on="id")).take_all()
    # result: [{'group_key': 0, 'unique(id)': ...},
    #          {'group_key': 1, 'unique(id)': ...},
    #          {'group_key': 2, 'unique(id)': ...}]
- Parameters:
on – The name of the column from which to collect unique values.
ignore_nulls – Whether to ignore null values when collecting unique items. Default is True (nulls are excluded).
alias_name – Optional name for the resulting column.
encode_lists – If True, encode list elements. If False, encode whole lists (i.e., the entire list is considered as a single object). False by default. Note that this is a top-level flatten (not a recursive flatten) operation.
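The effect of ignore_nulls and encode_lists can be illustrated without Ray. The following is a plain-Python sketch of the semantics described in the parameter list, not Ray's implementation. The helper name unique_values is hypothetical, lists are converted to tuples only so they can be stored in a set, and the sketch assumes that only top-level nulls are filtered.

```python
from typing import Any, Iterable, List


def unique_values(
    values: Iterable[Any],
    ignore_nulls: bool = True,
    encode_lists: bool = False,
) -> List[Any]:
    """Hypothetical helper mimicking the documented parameter semantics."""
    seen = set()
    for value in values:
        if value is None:
            if not ignore_nulls:
                seen.add(None)
            continue  # assumption: only top-level nulls are filtered here
        if isinstance(value, list) and encode_lists:
            # Top-level flatten: each element becomes its own item.
            # A nested list is kept whole (no recursive flatten).
            for element in value:
                seen.add(tuple(element) if isinstance(element, list) else element)
        elif isinstance(value, list):
            # The whole list counts as a single object.
            seen.add(tuple(value))
        else:
            seen.add(value)
    return list(seen)


column = [[1, 2], [2, 3], None, [1, 2]]

whole = unique_values(column)                           # lists compared as wholes
flat = unique_values(column, encode_lists=True)         # elements compared individually
with_nulls = unique_values(column, ignore_nulls=False)  # the null is kept as a value
nested = unique_values([[1, [2, 3]]], encode_lists=True)  # inner list stays whole
```

With encode_lists=False the two distinct lists survive as two items. With encode_lists=True their elements are pooled into a single set of scalars.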
Methods
Transforms the final accumulated state into the desired output.
Return the agg name (e.g., 'sum', 'mean', 'count').
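The base type AggregateFnV2[Set[Any], List[Any]] indicates a set as the accumulated state and a list as the final output. A minimal plain-Python sketch of that lifecycle follows. The function names partial_unique, merge_states, and to_output are hypothetical stand-ins, and the actual Ray implementation may differ.

```python
from functools import reduce
from typing import Any, List, Set


def partial_unique(block: List[Any]) -> Set[Any]:
    """Collect the distinct values of one block into a set (partial state)."""
    return set(block)


def merge_states(accumulator: Set[Any], new: Set[Any]) -> Set[Any]:
    """Merge two partial states; set union is associative and commutative."""
    return accumulator | new


def to_output(accumulator: Set[Any]) -> List[Any]:
    """Transform the final accumulated state into the output type (a list)."""
    return list(accumulator)


# Three blocks of a hypothetical column, processed independently.
blocks = [[0, 3, 6], [3, 9], [0, 12]]
partials = [partial_unique(block) for block in blocks]
state = reduce(merge_states, partials, set())
result = to_output(state)
```

Because the state is merged by set union, the order in which blocks are combined does not change the result, although the order of elements in the output list is not guaranteed.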