ray.data.from_items(items: List[Any], *, parallelism: int = -1, output_arrow_format: bool = False) -> ray.data.dataset.MaterializedDataset

Create a dataset from a list of local Python objects.


Examples:

>>> import ray
>>> ds = ray.data.from_items([1, 2, 3, 4, 5])
>>> ds
MaterializedDataset(num_blocks=5, num_rows=5, schema={item: int64})
>>> ds.take_batch(2)
{'item': array([1, 2])}

Parameters:

  • items – List of local Python objects.

  • parallelism – The amount of parallelism to use for the dataset. Defaults to -1. The effective parallelism cannot exceed the number of items.

  • output_arrow_format – If True, always return data in Arrow format, raising an error if this is not possible. Defaults to False.


Returns:

A MaterializedDataset holding the items.
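
The note that parallelism is limited by the number of items can be illustrated with a small pure-Python sketch. This is not Ray's implementation: `split_into_blocks` is a hypothetical helper, and the contiguous, near-even split is an assumption made only for illustration. It shows one way a requested parallelism could be capped at `len(items)` before the items are divided into blocks.

```python
from typing import Any, List


def split_into_blocks(items: List[Any], parallelism: int) -> List[List[Any]]:
    """Hypothetical helper (not part of Ray): split items into at most
    `parallelism` contiguous blocks."""
    if parallelism < 1:
        raise ValueError("this sketch expects an explicit parallelism >= 1")
    # Assumption: never create more blocks than items, and always at least one.
    num_blocks = max(1, min(parallelism, len(items)))
    base, extra = divmod(len(items), num_blocks)
    blocks: List[List[Any]] = []
    start = 0
    for i in range(num_blocks):
        # The first `extra` blocks each take one additional item.
        size = base + (1 if i < extra else 0)
        blocks.append(items[start:start + size])
        start += size
    return blocks


blocks_capped = split_into_blocks([1, 2, 3], parallelism=8)
blocks_even = split_into_blocks([1, 2, 3, 4, 5], parallelism=2)
print(blocks_capped)  # [[1], [2], [3]]
print(blocks_even)    # [[1, 2, 3], [4, 5]]
```

In this sketch, asking for 8 blocks over 3 items yields only 3 blocks, which mirrors the cap described for the parallelism parameter. How Ray actually assigns items to blocks may differ.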

PublicAPI: This API is stable across Ray releases.