ray.data.from_pandas_refs

ray.data.from_pandas_refs(dfs: ObjectRef[pandas.DataFrame] | List[ObjectRef[pandas.DataFrame]]) → MaterializedDataset

Create a Dataset from a Ray object reference to a pandas DataFrame, or from a list of such references.

Examples

>>> import pandas as pd
>>> import ray
>>> df_ref = ray.put(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> ray.data.from_pandas_refs(df_ref)  
shape: (3, 2)
╭───────┬───────╮
│ a     ┆ b     │
│ ---   ┆ ---   │
│ int64 ┆ int64 │
╞═══════╪═══════╡
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
╰───────┴───────╯
(Showing 3 of 3 rows)

Create a Ray Dataset from a list of references to pandas DataFrames.

>>> ray.data.from_pandas_refs([df_ref, df_ref])  
shape: (6, 2)
╭───────┬───────╮
│ a     ┆ b     │
│ ---   ┆ ---   │
│ int64 ┆ int64 │
╞═══════╪═══════╡
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
╰───────┴───────╯
(Showing 6 of 6 rows)

Parameters:

dfs – A Ray object reference to a pandas DataFrame, or a list of Ray object references to pandas DataFrames.

Returns:

A MaterializedDataset holding the data read from the DataFrames.

DeveloperAPI: This API may change across minor Ray releases.