ray.data.grouped_data.GroupedData.min

GroupedData.min(on: str | List[str] = None, ignore_nulls: bool = True) → Dataset

Compute grouped min aggregation.

Examples

>>> import ray
>>> ray.data.range(100).groupby("id").min()
>>> ray.data.from_items([
...     {"A": i % 3, "B": i, "C": i**2}
...     for i in range(100)]) \
...     .groupby("A") \
...     .min(["B", "C"])

Parameters:
  • on – A column name or a list of column names to aggregate.

  • ignore_nulls – Whether to ignore null values. If True, null values are skipped when computing the min. If False, the output is null whenever a null value is encountered. np.nan, None, and pd.NaT are treated as null values. Default is True.
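
The ignore_nulls rule can be sketched in plain Python. This is only an illustrative model of the behavior described above, not Ray's implementation: is_null and min_with_nulls are hypothetical helpers, and returning None for a group that contains only nulls is an assumption of this sketch.

```python
from typing import Any, Iterable, Optional


def is_null(value: Any) -> bool:
    # None is null. NaN-like scalars (float("nan"), np.nan, pd.NaT)
    # are detected because they compare unequal to themselves.
    return value is None or value != value


def min_with_nulls(values: Iterable[Any], ignore_nulls: bool = True) -> Optional[Any]:
    values = list(values)
    non_null = [v for v in values if not is_null(v)]
    if not ignore_nulls and len(non_null) != len(values):
        # At least one null was encountered, so the result is null.
        return None
    # Assumption of this sketch: an all-null (or empty) input yields None.
    return min(non_null) if non_null else None


group = [3.0, None, 1.5, float("nan")]
print(min_with_nulls(group))                      # 1.5
print(min_with_nulls(group, ignore_nulls=False))  # None
```

With the default ignore_nulls=True the two null entries are skipped, so the min comes from the remaining values; with ignore_nulls=False a single null is enough to make the result null.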

Returns:

The min result.

For different values of on, the return varies:

  • on=None: a dataset containing a groupby key column, "k", and a column-wise min column for each original column in the dataset.

  • on=["col_1", ..., "col_n"]: a dataset of n + 1 columns, where the first column is the groupby key and columns 2 through n + 1 hold the results of the aggregations.

If the groupby key is None, the key column is omitted from the result.
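
The output shapes listed above can be modeled with a small pure-Python sketch. It is not Ray code: grouped_min is a hypothetical helper, the min(<column>) output names are chosen for this illustration, and skipping the key column when on=None is an assumption of the sketch.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Union

Row = Dict[str, Any]


def grouped_min(
    rows: List[Row],
    key: Optional[str],
    on: Union[str, List[str], None] = None,
) -> List[Row]:
    if on is None:
        # Aggregate every column except the groupby key (assumption).
        columns = [c for c in rows[0] if c != key]
    elif isinstance(on, str):
        columns = [on]
    else:
        columns = list(on)

    # With key=None, all rows fall into a single group.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key] if key is not None else None].append(row)

    result = []
    for group_key in sorted(groups):
        # The key column comes first and is omitted when key is None.
        out = {} if key is None else {key: group_key}
        for c in columns:
            out[f"min({c})"] = min(r[c] for r in groups[group_key])
        result.append(out)
    return result


rows = [{"A": i % 3, "B": i, "C": i**2} for i in range(100)]
print(grouped_min(rows, "A", ["B", "C"]))  # 3 rows, columns: A, min(B), min(C)
print(grouped_min(rows, None, "B"))        # 1 row, column: min(B)
```

For on=["B", "C"] each output row has n + 1 = 3 columns, the key followed by one min per requested column; with key=None only the aggregate columns remain.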