ray.data.Dataset.max
- Dataset.max(on: Union[None, str, Callable[[ray.data.block.T], Any], List[Union[None, str, Callable[[ray.data.block.T], Any]]]] = None, ignore_nulls: bool = True) -> ray.data.block.U
Compute the maximum over the entire dataset.
Note
This operation will trigger execution of the lazy transformations performed on this dataset, and will block until execution completes.
Examples
>>> import ray
>>> ray.data.range(100).max()
99
>>> ray.data.from_items([
...     (i, i**2)
...     for i in range(100)]).max(lambda x: x[1])
9801
>>> ray.data.range_table(100).max("value")
99
>>> ray.data.from_items([
...     {"A": i, "B": i**2}
...     for i in range(100)]).max(["A", "B"])
{'max(A)': 99, 'max(B)': 9801}
- Parameters
on –
The data subset on which to compute the max.
For a simple dataset: it can be a callable or a list thereof, and the default is to return a scalar max of all rows.
For an Arrow dataset: it can be a column name or a list thereof, and the default is to return an ArrowRow containing the column-wise max of all columns.
ignore_nulls – Whether to ignore null values. If True, null values are ignored when computing the max; if False and a null value is encountered, the output is None. np.nan, None, and pd.NaT are considered null values. Default is True.
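The ignore_nulls behavior described above can be sketched in plain Python. This is an illustration of the documented semantics only, not Ray's implementation: `is_null` and `max_with_nulls` are hypothetical helpers, and float NaN stands in for np.nan and pd.NaT so that the sketch needs no third-party imports.

```python
import math


def is_null(value):
    """Hypothetical null check: None or float NaN (stand-ins for np.nan / pd.NaT)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def max_with_nulls(values, ignore_nulls=True):
    """Hypothetical helper mirroring the documented ignore_nulls semantics."""
    values = list(values)
    if not ignore_nulls and any(is_null(v) for v in values):
        # A null was encountered and nulls are not ignored.
        return None
    non_null = [v for v in values if not is_null(v)]
    # With nothing left to compare, there is no max to report.
    return max(non_null) if non_null else None


print(max_with_nulls([3, None, 7]))                      # 7
print(max_with_nulls([3, None, 7], ignore_nulls=False))  # None
print(max_with_nulls([None, float("nan")]))              # None
```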
- Returns
The max result.
For a simple dataset, the output is:
on=None: a scalar representing the max of all rows,
on=callable: a scalar representing the max of the outputs of the callable called on each row,
on=[callable_1, ..., callable_n]: a tuple of (max_1, ..., max_n) representing the max of the outputs of the corresponding callables called on each row.
For an Arrow dataset, the output is:
on=None: an ArrowRow containing the column-wise max of all columns,
on="col": a scalar representing the max of all items in column "col",
on=["col_1", ..., "col_n"]: an n-column ArrowRow containing the column-wise max of the provided columns.
If the dataset is empty, all values are null, or any value is null AND ignore_nulls is False, then the output will be None.
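To make the return shapes for a simple dataset concrete, here is a minimal pure-Python sketch of how the result depends on `on`. `simple_max` is a hypothetical helper written for illustration; it operates on an ordinary list rather than a Ray dataset and omits null handling, so it should be read as a model of the documented behavior rather than the library's code.

```python
def simple_max(rows, on=None):
    """Hypothetical helper mimicking the documented return shapes for a simple dataset."""
    rows = list(rows)
    if not rows:
        # An empty dataset yields None.
        return None
    if on is None:
        # Scalar max of all rows.
        return max(rows)
    if callable(on):
        # Scalar max of the callable's output on each row.
        return max(on(row) for row in rows)
    # A list of callables yields a tuple with one max per callable.
    return tuple(max(fn(row) for row in rows) for fn in on)


rows = [(i, i**2) for i in range(100)]
print(simple_max(range(100)))                                 # 99
print(simple_max(rows, on=lambda x: x[1]))                    # 9801
print(simple_max(rows, on=[lambda x: x[0], lambda x: x[1]]))  # (99, 9801)
print(simple_max([]))                                         # None
```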