ray.data.Dataset.sort
Dataset.sort(key: Optional[str] = None, descending: bool = False) -> ray.data.dataset.Dataset
Sort the dataset by the specified key column.
Examples
>>> import ray
>>> # Sort by a single column in descending order.
>>> ds = ray.data.from_items(
...     [{"value": i} for i in range(1000)])
>>> ds.sort("value", descending=True)
Sort
+- Dataset(num_blocks=200, num_rows=1000, schema={value: int64})
Time complexity: O(dataset size * log(dataset size / parallelism))
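As a rough illustration of the bound above, the sketch below evaluates n * log(n / p) for the 1000-row, 200-block dataset from the example. The helper name `sort_cost`, the base-2 logarithm, and the use of row count as "dataset size" are assumptions made for illustration; the bound itself ignores constant factors, so only the relative change is meaningful.

```python
import math


def sort_cost(dataset_size: int, parallelism: int) -> float:
    # Hypothetical helper: evaluates dataset_size * log2(dataset_size / parallelism).
    # The log base and the unit of "dataset size" are assumptions, not Ray definitions.
    return dataset_size * math.log2(dataset_size / parallelism)


# Same row count, two levels of parallelism: only the log factor changes.
cost_200 = sort_cost(1000, 200)
cost_400 = sort_cost(1000, 400)
print(cost_200, cost_400)
```

Under these assumptions, raising parallelism shrinks only the logarithmic factor, so the estimate falls more slowly than the parallelism rises.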
Parameters
key – The column to sort by. To sort by multiple columns, use a map function to generate the sort column beforehand.
descending – Whether to sort in descending order.
Returns
A new, sorted dataset.
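The note on the key parameter suggests generating a single sort column with a map function when several columns should determine the order. A minimal plain-Python sketch of that idea follows. The column names `group_id` and `value`, the helper `add_sort_key`, and the bound of 1000 on `value` are all hypothetical; the rows are sorted here with the built-in `sorted` so the example runs without a Ray cluster.

```python
def add_sort_key(row: dict) -> dict:
    # Hypothetical helper: fold two integer columns into one sortable column.
    # Assumes 0 <= row["value"] < 1000, so group_id dominates and ties break on value.
    return {**row, "sort_key": row["group_id"] * 1000 + row["value"]}


rows = [
    {"group_id": 2, "value": 5},
    {"group_id": 1, "value": 999},
    {"group_id": 1, "value": 0},
    {"group_id": 2, "value": 4},
]

# Add the composite column, then sort on that single column.
keyed = [add_sort_key(r) for r in rows]
ordered = sorted(keyed, key=lambda r: r["sort_key"])
pairs = [(r["group_id"], r["value"]) for r in ordered]
print(pairs)

# With Ray, the same helper would presumably be applied as
#   ds.map(add_sort_key).sort("sort_key")
# assuming each row reaches the function as a dict-like record.
```

The composite key works only while the assumed bound on `value` holds. For wider ranges or non-integer columns, a different encoding (for example, fixed-width strings) would be needed.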