ray.data.from_pandas_refs
- ray.data.from_pandas_refs(dfs: ObjectRef[pandas.DataFrame] | List[ObjectRef[pandas.DataFrame]]) -> MaterializedDataset
Create a Dataset from a single Ray object reference to a pandas DataFrame, or from a list of such references.

Examples
>>> import pandas as pd
>>> import ray
>>> df_ref = ray.put(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> ray.data.from_pandas_refs(df_ref)
shape: (3, 2)
╭───────┬───────╮
│ a     ┆ b     │
│ ---   ┆ ---   │
│ int64 ┆ int64 │
╞═══════╪═══════╡
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
╰───────┴───────╯
(Showing 3 of 3 rows)
Create a Ray Dataset from a list of references to pandas DataFrames.
>>> ray.data.from_pandas_refs([df_ref, df_ref])
shape: (6, 2)
╭───────┬───────╮
│ a     ┆ b     │
│ ---   ┆ ---   │
│ int64 ┆ int64 │
╞═══════╪═══════╡
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
│ 1     ┆ 4     │
│ 2     ┆ 5     │
│ 3     ┆ 6     │
╰───────┴───────╯
(Showing 6 of 6 rows)
- Parameters:
dfs – A Ray object reference to a pandas dataframe, or a list of Ray object references to pandas dataframes.
- Returns:
A MaterializedDataset holding the data read from the dataframes.
DeveloperAPI: This API may change across minor Ray releases.