ray.data.aggregate.ZeroPercentage
class ray.data.aggregate.ZeroPercentage(on: str, ignore_nulls: bool = True, alias_name: str | None = None)
Bases: AggregateFnV2
Calculates the percentage of zero values in a numeric column.
By default, null values are ignored, so the percentage is computed over the non-null values only. The result is a value between 0.0 and 100.0, where 0.0 means none of the counted values are zero and 100.0 means all of the counted values are zero.
Example
    import ray
    from ray.data.aggregate import ZeroPercentage

    # Create a dataset with some zero values
    ds = ray.data.from_items([
        {"value": 0}, {"value": 1}, {"value": 0},
        {"value": 3}, {"value": 0}
    ])

    # Calculate zero value percentage
    result = ds.aggregate(ZeroPercentage(on="value"))
    # result: 60.0 (3 out of 5 values are zero)

    # With null values and ignore_nulls=True (default)
    ds = ray.data.from_items([
        {"value": 0}, {"value": None}, {"value": 0},
        {"value": 3}, {"value": 0}
    ])
    result = ds.aggregate(ZeroPercentage(on="value", ignore_nulls=True))
    # result: 75.0 (3 out of 4 non-null values are zero)

    # Using with groupby
    ds = ray.data.from_items([
        {"group": "A", "value": 0}, {"group": "A", "value": 1},
        {"group": "B", "value": 0}, {"group": "B", "value": 0}
    ])
    result = ds.groupby("group").aggregate(ZeroPercentage(on="value")).take_all()
    # result: [{'group': 'A', 'zero_pct(value)': 50.0},
    #          {'group': 'B', 'zero_pct(value)': 100.0}]
- Parameters:
on – The name of the column to calculate zero value percentage on. Must be a numeric column.
ignore_nulls – Whether to ignore null values when calculating the percentage. If True (default), null values are excluded from both numerator and denominator. If False, null values are included in the denominator but not the numerator.
alias_name – Optional name for the resulting column. If not provided, defaults to "zero_pct({column_name})", for example "zero_pct(value)" when on="value".
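The effect of ignore_nulls on the numerator and denominator can be sketched in plain Python. This is only an illustration of the semantics described above, not Ray's implementation: the helper name zero_percentage is hypothetical, and returning None for an empty denominator is an assumption that the documentation above does not confirm.

```python
from typing import Iterable, Optional


def zero_percentage(
    values: Iterable[Optional[float]], ignore_nulls: bool = True
) -> Optional[float]:
    """Hypothetical helper mirroring the documented semantics (not Ray's code)."""
    zero_count = 0  # numerator: values equal to zero
    total = 0       # denominator: values that are counted at all
    for v in values:
        if v is None:
            if not ignore_nulls:
                total += 1  # a null counts toward the denominator only
            continue
        total += 1
        if v == 0:
            zero_count += 1
    if total == 0:
        return None  # assumption: nothing to divide by
    return 100.0 * zero_count / total


values = [0, None, 0, 3, 0]
pct_ignoring_nulls = zero_percentage(values, ignore_nulls=True)   # 3 of 4 non-null values
pct_counting_nulls = zero_percentage(values, ignore_nulls=False)  # 3 of 5 rows
```

With the same five rows, excluding the null gives 75.0, while counting it in the denominator gives 60.0, so ignore_nulls=False can only lower or preserve the percentage.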
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
Methods