ray.data.preprocessors.Chain
- class ray.data.preprocessors.Chain(*preprocessors: ray.data.preprocessor.Preprocessor)
Bases: ray.data.preprocessor.Preprocessor
Combine multiple preprocessors into a single Preprocessor.
When you call fit, each preprocessor is fit on the dataset produced by the preceding preprocessor's fit_transform.
Example
>>> import pandas as pd
>>> import ray
>>> from ray.data.preprocessors import *
>>>
>>> df = pd.DataFrame({
...     "X0": [0, 1, 2],
...     "X1": [3, 4, 5],
...     "Y": ["orange", "blue", "orange"],
... })
>>> ds = ray.data.from_pandas(df)
>>>
>>> preprocessor = Chain(
...     StandardScaler(columns=["X0", "X1"]),
...     Concatenator(include=["X0", "X1"], output_column_name="X"),
...     LabelEncoder(label_column="Y")
... )
>>> preprocessor.fit_transform(ds).to_pandas()
   Y                                         X
0  1  [-1.224744871391589, -1.224744871391589]
1  0                                [0.0, 0.0]
2  1    [1.224744871391589, 1.224744871391589]
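The fitting order described above can be illustrated without Ray. The following is a minimal pure-Python sketch, not Ray's implementation: ToyPreprocessor, AddOne, Center, and ToyChain are hypothetical names invented for this illustration, and rows are plain dictionaries rather than a Dataset. It shows a stateful stage learning its statistics from the output of the stage before it.

```python
from typing import Dict, List

Rows = List[Dict[str, float]]


class ToyPreprocessor:
    """Hypothetical stand-in for a preprocessor; not a Ray class."""

    def fit(self, rows: Rows) -> "ToyPreprocessor":
        return self  # stateless by default

    def transform(self, rows: Rows) -> Rows:
        raise NotImplementedError

    def fit_transform(self, rows: Rows) -> Rows:
        return self.fit(rows).transform(rows)


class AddOne(ToyPreprocessor):
    """Stateless stage: adds 1 to column "x"."""

    def transform(self, rows: Rows) -> Rows:
        return [{**r, "x": r["x"] + 1} for r in rows]


class Center(ToyPreprocessor):
    """Stateful stage: learns the mean of "x" in fit, subtracts it in transform."""

    def fit(self, rows: Rows) -> "Center":
        self.mean_ = sum(r["x"] for r in rows) / len(rows)
        return self

    def transform(self, rows: Rows) -> Rows:
        return [{**r, "x": r["x"] - self.mean_} for r in rows]


class ToyChain(ToyPreprocessor):
    """Sequentially composes stages, mirroring the behaviour described above."""

    def __init__(self, *stages: ToyPreprocessor) -> None:
        self.stages = stages

    def fit(self, rows: Rows) -> "ToyChain":
        for stage in self.stages:
            # Each stage is fit on the output of the preceding stage's fit_transform.
            rows = stage.fit_transform(rows)
        return self

    def transform(self, rows: Rows) -> Rows:
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


rows = [{"x": 0.0}, {"x": 1.0}, {"x": 2.0}]
center = Center()
chain = ToyChain(AddOne(), center)
result = chain.fit_transform(rows)
# center.mean_ is 2.0 (the mean after AddOne), not 1.0 (the mean of the raw input).
```

Because Center is fit on the rows that AddOne has already transformed, the order of the stages affects the fitted state, which is why the order of arguments passed to a chain matters.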
- Parameters
preprocessors – The preprocessors to sequentially compose.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- fit_transform(ds: ray.data.dataset.Dataset) → ray.data.dataset.Dataset
Fit this Preprocessor to the Dataset and then transform the Dataset.
Calling it more than once overwrites all previously fitted state: calling preprocessor.fit_transform(A) and then preprocessor.fit_transform(B) leaves the preprocessor in the same fitted state as calling only preprocessor.fit_transform(B).
- Parameters
ds – Input Dataset.
- Returns
The transformed Dataset.
- Return type