ray.data.from_pandas_refs

ray.data.from_pandas_refs(dfs: ObjectRef[pandas.DataFrame] | List[ObjectRef[pandas.DataFrame]]) → MaterializedDataset

Create a Dataset from a Ray object reference to a pandas dataframe, or from a list of such references.

Examples

>>> import pandas as pd
>>> import ray
>>> df_ref = ray.put(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> ray.data.from_pandas_refs(df_ref)
MaterializedDataset(num_blocks=1, num_rows=3, schema={a: int64, b: int64})

Create a Ray Dataset from a list of references to pandas dataframes.

>>> ray.data.from_pandas_refs([df_ref, df_ref])
MaterializedDataset(num_blocks=2, num_rows=6, schema={a: int64, b: int64})

Parameters:

dfs – A Ray object reference to a pandas dataframe, or a list of Ray object references to pandas dataframes.

Returns:

A MaterializedDataset holding the data read from the dataframes.

DeveloperAPI: This API may change across minor Ray releases.