ray.data.preprocessors.OneHotEncoder.fit_transform

OneHotEncoder.fit_transform(ds: Dataset, *, transform_num_cpus: float | None = None, transform_memory: float | None = None, transform_batch_size: int | None = None, transform_concurrency: int | None = None) → Dataset

Fit this Preprocessor to the Dataset and then transform the Dataset.

Calling it more than once overwrites all previously fitted state: calling preprocessor.fit_transform(A) and then preprocessor.fit_transform(B) leaves the preprocessor in the same fitted state as calling preprocessor.fit_transform(B) alone.
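The overwrite behavior described above can be illustrated with a small pure-Python sketch. ToyOneHotEncoder is a hypothetical stand-in written only for this illustration; it is not Ray's implementation and operates on plain lists of dicts rather than a Dataset.

```python
class ToyOneHotEncoder:
    """Hypothetical stand-in for illustration only; not Ray's implementation."""

    def __init__(self, column):
        self.column = column
        self.categories = None  # fitted state

    def fit(self, rows):
        # Refitting replaces the fitted state; nothing from earlier fits is kept.
        self.categories = sorted({row[self.column] for row in rows})
        return self

    def transform(self, rows):
        # One 0/1 indicator per fitted category, in sorted category order.
        return [
            [1 if row[self.column] == category else 0 for category in self.categories]
            for row in rows
        ]

    def fit_transform(self, rows):
        return self.fit(rows).transform(rows)


A = [{"color": "red"}, {"color": "green"}]
B = [{"color": "blue"}, {"color": "red"}, {"color": "blue"}]

enc = ToyOneHotEncoder("color")
enc.fit_transform(A)              # fitted on A: categories are green, red
encoded_b = enc.fit_transform(B)  # refitted on B: state from A is discarded

# A fresh encoder fitted only on B produces the same result.
fresh = ToyOneHotEncoder("color")
same_as_fresh = encoded_b == fresh.fit_transform(B)
```

After the second call, the toy encoder holds only the categories seen in B, so its output matches that of an encoder that never saw A.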

Parameters:
  • ds – Input Dataset.

  • transform_num_cpus – [experimental] The number of CPUs to reserve for each parallel map worker.

  • transform_memory – [experimental] The heap memory in bytes to reserve for each parallel map worker.

  • transform_batch_size – [experimental] The maximum number of rows per batch passed to the transform.

  • transform_concurrency – [experimental] The maximum number of Ray workers to use concurrently.

Returns:

The transformed Dataset.

Return type:

ray.data.Dataset