ray.data.from_items(items: List[Any], *, parallelism: int = -1) → MaterializedDataset

Create a Dataset from a list of local Python objects.

Use this method to create small datasets from data that fits in memory.


>>> import ray
>>> ds = ray.data.from_items([1, 2, 3, 4, 5])
>>> ds
MaterializedDataset(num_blocks=..., num_rows=5, schema={item: int64})
>>> ds.schema()
Column  Type
------  ----
item    int64
Parameters:
  • items – List of local Python objects.

  • parallelism – The amount of parallelism to use for the dataset. Defaults to -1, which automatically determines the optimal parallelism for your configuration. You should not need to manually set this value in most cases. For details on how the parallelism is automatically determined and guidance on how to tune it, see Tuning read parallelism. Parallelism is upper bounded by len(items).


Returns:
  A Dataset holding the items.
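
The note that parallelism is upper bounded by len(items) can be illustrated with a small pure-Python sketch. It is not Ray's implementation: `split_into_blocks` is a hypothetical helper that only models how a list might be divided into ordered chunks when the requested parallelism exceeds the number of items. It assumes an explicit positive `parallelism` and does not model the automatic `-1` setting.

```python
from typing import Any, List


def split_into_blocks(items: List[Any], parallelism: int) -> List[List[Any]]:
    """Hypothetical helper: split items into at most `parallelism` ordered chunks."""
    if parallelism < 1:
        # Assumption: the automatic (-1) setting is out of scope for this sketch.
        raise ValueError("parallelism must be a positive integer in this sketch")

    # Cap the number of chunks at len(items) so that no chunk is empty.
    num_blocks = min(parallelism, len(items))
    if num_blocks == 0:
        return []

    block_size, remainder = divmod(len(items), num_blocks)
    blocks: List[List[Any]] = []
    start = 0
    for i in range(num_blocks):
        # The first `remainder` chunks each take one extra item.
        end = start + block_size + (1 if i < remainder else 0)
        blocks.append(items[start:end])
        start = end
    return blocks


# Requesting 8 chunks for 5 items yields only 5 chunks.
print(len(split_into_blocks([1, 2, 3, 4, 5], parallelism=8)))  # 5

# Requesting 2 chunks keeps the original order and balances sizes.
print(split_into_blocks([1, 2, 3, 4, 5], parallelism=2))  # [[1, 2, 3], [4, 5]]
```

In this model, asking for more chunks than there are items gives no extra parallelism, which is why a large value has no further effect on a small in-memory list.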