ray.data.preprocessors.MaxAbsScaler
- class ray.data.preprocessors.MaxAbsScaler(columns: List[str])
Bases: ray.data.preprocessor.Preprocessor
Scale each column by its absolute max value.
The general formula is given by
\[x' = \frac{x}{\max{\vert x \vert}}\]
where \(x\) is the column and \(x'\) is the transformed column. If \(\max{\vert x \vert} = 0\) (i.e., the column contains all zeros), then the column is unmodified.
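As a rough illustration of the formula, the sketch below applies the same scaling with plain pandas. It is not Ray's implementation: the helper name `max_abs_scale` is hypothetical, and it assumes every listed column is numeric.

```python
import pandas as pd


def max_abs_scale(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Hypothetical helper: divide each listed column by its absolute max."""
    out = df.copy()
    for col in columns:
        abs_max = out[col].abs().max()
        # An all-zero column has abs_max == 0; divide by 1 instead so its
        # values stay unchanged and no division by zero occurs.
        if abs_max == 0:
            abs_max = 1
        out[col] = out[col] / abs_max
    return out


df = pd.DataFrame({"X1": [-6, 3], "X2": [2, -4], "X3": [0, 0]})
scaled = max_abs_scale(df, ["X1", "X2", "X3"])
print(scaled)
```

Each column is divided by its own absolute maximum, so the scaled values always fall in the range [-1, 1].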
Tip
This is the recommended way to scale sparse data. If your data isn’t sparse, you can use MinMaxScaler or StandardScaler instead.
Examples
>>> import pandas as pd
>>> import ray
>>> from ray.data.preprocessors import MaxAbsScaler
>>>
>>> df = pd.DataFrame({"X1": [-6, 3], "X2": [2, -4], "X3": [0, 0]})  # noqa: E501
>>> ds = ray.data.from_pandas(df)
>>> ds.to_pandas()
   X1  X2  X3
0  -6   2   0
1   3  -4   0
Columns are scaled separately.
>>> preprocessor = MaxAbsScaler(columns=["X1", "X2"])
>>> preprocessor.fit_transform(ds).to_pandas()
    X1   X2  X3
0 -1.0  0.5   0
1  0.5 -1.0   0
Zero-valued columns aren’t scaled.
>>> preprocessor = MaxAbsScaler(columns=["X3"])
>>> preprocessor.fit_transform(ds).to_pandas()
   X1  X2   X3
0  -6   2  0.0
1   3  -4  0.0
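The tip about sparse data can be made concrete with a small sketch: dividing by the absolute max maps zeros to zeros, so sparsity is preserved, whereas a shift-based scaling generally is not. The example uses plain NumPy rather than Ray, and the min-max formula shown is the textbook \((x - \min x) / (\max x - \min x)\), assumed here for comparison rather than taken from Ray's MinMaxScaler.

```python
import numpy as np

# A mostly-zero ("sparse") column.
x = np.array([0.0, 0.0, -4.0, 0.0, 2.0])

# Max-abs scaling: a pure rescaling, so zeros stay zero.
max_abs_scaled = x / np.max(np.abs(x))

# Textbook min-max scaling (assumed formula): subtracting the minimum
# shifts the zeros to non-zero values.
min_max_scaled = (x - x.min()) / (x.max() - x.min())

zeros_before = int(np.count_nonzero(x == 0))
zeros_after_max_abs = int(np.count_nonzero(max_abs_scaled == 0))
zeros_after_min_max = int(np.count_nonzero(min_max_scaled == 0))
print(zeros_before, zeros_after_max_abs, zeros_after_min_max)
```

The zero count is unchanged by max-abs scaling, while min-max scaling leaves only the former minimum at zero.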
- Parameters
columns – The columns to separately scale.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.