ray.data.from_items

ray.data.from_items(items: List[Any], *, parallelism: int = -1, override_num_blocks: int | None = None) → MaterializedDataset

Create a Dataset from a list of local Python objects.

Use this method to create small datasets from data that fits in memory. Items that aren't dicts are stored in a single column, whose name defaults to “item”.

Examples

>>> import ray
>>> ds = ray.data.from_items([1, 2, 3, 4, 5])
>>> ds
MaterializedDataset(num_blocks=..., num_rows=5, schema={item: int64})
>>> ds.schema()
Column  Type
------  ----
item    int64

Parameters:

items – List of local Python objects.
parallelism – This argument is deprecated. Use override_num_blocks instead.
override_num_blocks – Override the number of output blocks from all read tasks. By default, the number of output blocks is dynamically decided based on input data size and available resources. You shouldn’t manually set this value in most cases.

Returns:

A MaterializedDataset holding the items.