ray.data.Dataset.sort

Dataset.sort(key: Optional[Union[str, List[str]]] = None, descending: Union[bool, List[bool]] = False) → ray.data.dataset.Dataset

Sort the dataset by the specified key column or list of key columns.

Note

The descending parameter must be a boolean, or a list of booleans. If it is a list, all items in the list must share the same direction. Multi-directional sort is not supported yet.
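The rule in this note can be sketched as a small validation helper. This is an illustrative sketch only: normalize_descending is a hypothetical name that is not part of Ray's API, and the exception types and messages are assumptions rather than what Ray actually raises.

```python
from typing import List, Union


def normalize_descending(
    key: Union[str, List[str]],
    descending: Union[bool, List[bool]] = False,
) -> bool:
    """Hypothetical helper: collapse `descending` to one sort direction.

    Mirrors the documented rule only; this is not Ray's implementation.
    """
    keys = [key] if isinstance(key, str) else list(key)
    if not keys:
        raise ValueError("at least one key column is required")

    # A single boolean applies to every key column.
    if isinstance(descending, bool):
        return descending

    if not isinstance(descending, list) or not all(
        isinstance(d, bool) for d in descending
    ):
        raise TypeError("descending must be a bool or a list of bools")

    # A list must provide one entry per key column.
    if len(descending) != len(keys):
        raise ValueError("descending must have one entry per key column")

    # Mixed directions are rejected, per the note above.
    if len(set(descending)) > 1:
        raise ValueError("multi-directional sort is not supported")

    return descending[0]
```

Under these assumptions, normalize_descending(["a", "b"], [True, True]) yields True, while normalize_descending(["a", "b"], [True, False]) raises ValueError.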

Examples

>>> import ray
>>> ds = ray.data.range(100)
>>> ds.sort("id", descending=True).take(3)
[{'id': 99}, {'id': 98}, {'id': 97}]

Time complexity: O(dataset size * log(dataset size / parallelism))
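To get a feel for this bound, the sketch below evaluates it under a simplified cost model. It assumes the dominant cost is sorting `parallelism` equal-sized blocks independently, uses a base-2 logarithm, and ignores sampling, shuffling, and merging overhead. estimated_sort_cost is a hypothetical helper, not a Ray function, and its result is a relative comparison count rather than a runtime prediction.

```python
import math


def estimated_sort_cost(num_rows: int, parallelism: int) -> float:
    """Hypothetical cost model: num_rows * log2(num_rows / parallelism).

    Assumes `parallelism` equal-sized blocks, each sorted independently.
    """
    if num_rows <= 0 or parallelism <= 0:
        raise ValueError("num_rows and parallelism must be positive")

    block_rows = num_rows / parallelism
    # A block holding at most one row needs no comparisons.
    per_block = block_rows * math.log2(block_rows) if block_rows > 1 else 0.0
    # Summing over all blocks gives num_rows * log2(num_rows / parallelism).
    return parallelism * per_block


# Compare the modeled cost with and without parallelism for the same data.
single = estimated_sort_cost(1_000_000, 1)
parallel = estimated_sort_cost(1_000_000, 100)
print(f"parallelism=1: {single:.3g}, parallelism=100: {parallel:.3g}")
```

In this model, raising the parallelism shrinks only the logarithmic factor, so the total work falls modestly even though each block becomes much smaller.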

Parameters
  • key – The column or a list of columns to sort by.

  • descending – Whether to sort in descending order. Must be a boolean or a list of booleans with one entry per key column.

Returns

A new, sorted Dataset.