ray.data.range_table(n: int, *, parallelism: int = -1) → ray.data.dataset.Dataset[ray.data._internal.arrow_block.ArrowRow]

Create a tabular dataset from a range of integers [0..n).


>>> import ray
>>> ds = ray.data.range_table(1000) 
>>> ds 
Dataset(num_blocks=200, num_rows=1000, schema={value: int64})
>>> ds.map(lambda r: {"v2": r["value"] * 2}).take(2) 
[ArrowRow({'v2': 0}), ArrowRow({'v2': 2})]

This is similar to ray.data.range(), but stores the integers in Arrow tables as Arrow records. Each dataset element takes the form {"value": N}.

Parameters:

  • n – The exclusive upper bound of the range of integer records.

  • parallelism – The requested amount of parallelism for the dataset. The effective parallelism may be limited by the number of items.

Returns:

Dataset holding the integers as Arrow records.
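
To illustrate why parallelism may be limited by the number of items, here is a minimal pure-Python sketch of splitting [0..n) into contiguous blocks. It is not Ray's implementation: split_range is a hypothetical helper, the near-equal split is an assumed policy, and the -1 default is not modeled (the sketch requires parallelism >= 1).

```python
from typing import List


def split_range(n: int, parallelism: int) -> List[range]:
    """Hypothetical helper (not part of Ray): split [0..n) into blocks."""
    if parallelism < 1:
        # Assumption: the sketch does not model the -1 default.
        raise ValueError("parallelism must be >= 1 in this sketch")
    if n <= 0:
        return []
    # A block needs at least one item, so there can be at most n blocks.
    num_blocks = min(parallelism, n)
    base, extra = divmod(n, num_blocks)
    blocks = []
    start = 0
    for i in range(num_blocks):
        # The first `extra` blocks take one additional item each.
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


print([list(b) for b in split_range(10, 4)])
print(len(split_range(3, 200)))
```

In this sketch, requesting 200 blocks for 3 items yields only 3 blocks, because an empty block would hold no records.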

PublicAPI: This API is stable across Ray releases.