ray.data.preprocessors.MinMaxScaler
- class ray.data.preprocessors.MinMaxScaler(columns: List[str])
Bases: ray.data.preprocessor.Preprocessor
Scale each column by its range.
The general formula is given by
\[x' = \frac{x - \min(x)}{\max(x) - \min(x)}\]
where \(x\) is the column and \(x'\) is the transformed column. If \(\max(x) - \min(x) = 0\) (i.e., the column is constant-valued), then the transformed column is filled with zeros.
Transformed values are always in the range \([0, 1]\).
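The formula and its constant-column rule can be sketched in plain Python. This is only an illustration of the arithmetic, not Ray's implementation; the helper name `min_max_scale` and the dictionary `table` are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale one column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # max(x) - min(x) == 0: avoid division by zero and fill with zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical stand-in for a small table: each column is scaled separately.
table: Dict[str, List[float]] = {
    "X1": [-2, 0, 2],
    "X2": [-3, -3, 3],
    "X3": [1, 1, 1],
}
scaled = {name: min_max_scale(column) for name, column in table.items()}
print(scaled)
```

Here `X1` becomes `[0.0, 0.5, 1.0]`, `X2` becomes `[0.0, 0.0, 1.0]`, and the constant column `X3` becomes `[0.0, 0.0, 0.0]`.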
Tip
This can be used as an alternative to StandardScaler.
Examples
>>> import pandas as pd
>>> import ray
>>> from ray.data.preprocessors import MinMaxScaler
>>>
>>> df = pd.DataFrame({"X1": [-2, 0, 2], "X2": [-3, -3, 3], "X3": [1, 1, 1]})
>>> ds = ray.data.from_pandas(df)
>>> ds.to_pandas()
   X1  X2  X3
0  -2  -3   1
1   0  -3   1
2   2   3   1
Columns are scaled separately.
>>> preprocessor = MinMaxScaler(columns=["X1", "X2"])
>>> preprocessor.fit_transform(ds).to_pandas()
    X1   X2  X3
0  0.0  0.0   1
1  0.5  0.0   1
2  1.0  1.0   1
Constant-valued columns get filled with zeros.
>>> preprocessor = MinMaxScaler(columns=["X3"])
>>> preprocessor.fit_transform(ds).to_pandas()
   X1  X2   X3
0  -2  -3  0.0
1   0  -3  0.0
2   2   3  0.0
- Parameters
columns – The columns to separately scale.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.