ray.data.preprocessors.Categorizer.deserialize
- static Categorizer.deserialize(serialized: str | bytes) → Preprocessor
Deserialize a preprocessor from serialized data.
⚠️ DO NOT OVERRIDE THIS METHOD IN SUBCLASSES ⚠️
This method is marked as @final in the concrete implementation and handles the complete deserialization orchestration. Subclasses should implement the abstract methods instead: _set_serializable_fields() and _set_stats().
Deserialization Process:
Detects format from magic bytes in serialized data
Delegates to SerializationHandlerFactory for format-specific parsing
Extracts metadata (type, version, fields, stats)
Looks up preprocessor class from registry
Creates new instance and restores state via abstract methods
Returns fully reconstructed preprocessor instance
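The steps above can be sketched with a toy model. This is a minimal illustration, not Ray's implementation: the names ToyPreprocessor, ToyCategorizer, UnknownTypeError, register, and _REGISTRY are hypothetical, and a JSON payload stands in for the format-specific handler. Only the two hook names, _set_serializable_fields() and _set_stats(), come from the text above.

```python
import json
from typing import Any, Dict, Type

# Hypothetical registry mapping a type name to a class (not Ray's registry).
_REGISTRY: Dict[str, Type["ToyPreprocessor"]] = {}


class UnknownTypeError(Exception):
    """Stand-in for the error raised when a type is not registered."""


def register(cls):
    """Record the class under its own name so deserialize() can find it."""
    _REGISTRY[cls.__name__] = cls
    return cls


class ToyPreprocessor:
    def _set_serializable_fields(self, fields: Dict[str, Any]) -> None:
        raise NotImplementedError

    def _set_stats(self, stats: Dict[str, Any]) -> None:
        raise NotImplementedError

    @staticmethod
    def deserialize(serialized: str) -> "ToyPreprocessor":
        # Parse the payload and extract metadata (JSON is an assumption here).
        try:
            meta = json.loads(serialized)
            type_name, fields, stats = meta["type"], meta["fields"], meta["stats"]
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            raise ValueError("corrupted or unrecognized data") from exc
        # Look up the class in the registry.
        cls = _REGISTRY.get(type_name)
        if cls is None:
            raise UnknownTypeError(type_name)
        # Create an instance without calling __init__, then restore state
        # through the two hooks that subclasses implement.
        instance = cls.__new__(cls)
        instance._set_serializable_fields(fields)
        instance._set_stats(stats)
        return instance


@register
class ToyCategorizer(ToyPreprocessor):
    def _set_serializable_fields(self, fields: Dict[str, Any]) -> None:
        self.columns = fields["columns"]

    def _set_stats(self, stats: Dict[str, Any]) -> None:
        self.stats_ = stats


payload = json.dumps({
    "type": "ToyCategorizer",
    "version": 1,
    "fields": {"columns": ["color"]},
    "stats": {"color": ["blue", "red"]},
})
restored = ToyPreprocessor.deserialize(payload)
```

In this sketch the subclass only supplies the two hooks, while the orchestration stays in one place on the base class, which mirrors the reason given above for not overriding deserialize().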
Format Detection:
The method automatically detects the serialization format:
- CPKL: → CloudPickle format
- Base64 string → Legacy Pickle format
Error Handling:
Provides comprehensive error handling for:
- Unknown serialization formats
- Corrupted or invalid data
- Missing preprocessor types
- Version compatibility issues
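The format detection described above might look roughly like the following. This is a sketch under assumptions rather than Ray's code: detect_format, CLOUDPICKLE_MAGIC, and the returned labels are hypothetical names, and treating "anything that decodes as non-empty Base64" as the legacy format is a simplification. Only the CPKL: marker and the two format names come from the text.

```python
import base64

CLOUDPICKLE_MAGIC = b"CPKL:"  # magic prefix named in the text above


def detect_format(serialized):
    """Return a hypothetical format label for a str or bytes payload."""
    data = serialized.encode("utf-8") if isinstance(serialized, str) else serialized
    if data.startswith(CLOUDPICKLE_MAGIC):
        return "cloudpickle"
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        decoded = base64.b64decode(data, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("unrecognized serialization format") from exc
    if not decoded:
        raise ValueError("unrecognized serialization format")
    return "legacy_pickle"
```

Checking the magic prefix first keeps the common path cheap, and anything that matches neither rule surfaces as a ValueError, in line with the Raises section of this page.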
- Parameters:
serialized – Serialized preprocessor data (bytes or str)
- Returns:
Reconstructed preprocessor instance
- Raises:
ValueError – If the serialized data is corrupted or format is unrecognized
UnknownPreprocessorError – If the preprocessor type is not registered
DeveloperAPI: This API may change across minor Ray releases.