ray.data.preprocessors.PowerTransformer
- class ray.data.preprocessors.PowerTransformer(columns: List[str], power: float, method: str = 'yeo-johnson')
Bases: Preprocessor
Apply a power transform to make your data more normally distributed.
Some models expect data to be normally distributed. By making your data more Gaussian-like, you might be able to improve your model’s performance.
This preprocessor supports the following transformations:
- Yeo-Johnson
- Box-Cox
Box-Cox requires all data to be positive.
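For reference, both transforms can be written down from their standard textbook definitions. The sketch below is a pure-Python illustration on single values. The helper names `box_cox` and `yeo_johnson` are hypothetical and not part of Ray's API, and it is an assumption that the preprocessor follows these formulas exactly.

```python
import math


def box_cox(x: float, power: float) -> float:
    """Box-Cox transform of one value; defined only for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if power == 0:
        return math.log(x)
    return (x ** power - 1) / power


def yeo_johnson(x: float, power: float) -> float:
    """Yeo-Johnson transform of one value; accepts any sign."""
    if x >= 0:
        if power == 0:
            return math.log(x + 1)
        return ((x + 1) ** power - 1) / power
    # Negative inputs use the mirrored formula with exponent (2 - power).
    if power == 2:
        return -math.log(-x + 1)
    return -((-x + 1) ** (2 - power) - 1) / (2 - power)


values = [2.0, 3.0, 5.0]  # illustrative data, not from the Ray docs
yj = [yeo_johnson(v, 0.5) for v in values]
bc = [box_cox(v, 0.5) for v in values]
print(yj)
print(bc)
```

The positivity check in `box_cox` shows why that method cannot be applied to zero or negative values, whereas the Yeo-Johnson branch for `x < 0` handles them.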
Warning
You need to manually specify the transform’s power parameter. If you choose a bad value, the transformation might not work well.
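One possible way to choose a power is to scan a grid of candidates and keep the one whose transformed data is least skewed. The sketch below is only an illustration of that idea in pure Python: the skewness criterion, the grid spacing, the synthetic log-normal data, and the helper names `box_cox` and `skewness` are all assumptions, not something Ray provides.

```python
import math
import random


def box_cox(x: float, power: float) -> float:
    """Standard Box-Cox transform of one positive value."""
    return math.log(x) if power == 0 else (x ** power - 1) / power


def skewness(xs: list) -> float:
    """Sample skewness: third central moment over variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Synthetic, strictly positive, right-skewed data (log-normal).
random.seed(0)
data = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]

# Candidate powers from -2.5 to 2.5 in steps of 0.25.
candidates = [p / 4 for p in range(-10, 11)]

# Keep the candidate whose transformed data has the smallest absolute skew.
best_power = min(
    candidates,
    key=lambda p: abs(skewness([box_cox(x, p) for x in data])),
)
print(best_power)
```

Because the logarithm of log-normal data is normally distributed, a power near 0 is the expected winner here. On real data, other criteria (for example a normality test or downstream model accuracy) may be more appropriate.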
- Parameters:
columns – The columns to separately transform.
power – A parameter that determines how your data is transformed. Practitioners typically set power between \(-2.5\) and \(2.5\), although you may need to try different values to find one that works well.
method – A string representing which transformation to apply. Supports "yeo-johnson" and "box-cox". If you choose "box-cox", your data needs to be positive. Defaults to "yeo-johnson".
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
Methods
deserialize – Load the original preprocessor serialized via self.serialize().
fit – Fit this Preprocessor to the Dataset.
fit_transform – Fit this Preprocessor to the Dataset and then transform the Dataset.
preferred_batch_format – Batch format hint for upstream producers to try yielding best block format.
serialize – Return this preprocessor serialized as a string.
transform – Transform the given dataset.
transform_batch – Transform a single batch of data.