ray.data.FileShuffleConfig
- class ray.data.FileShuffleConfig(seed: int | None = None)
Configuration for file shuffling.
This configuration object controls how files are shuffled while reading file-based datasets.
Note
Even if you provide a seed, the row order might still be non-deterministic, because read tasks run in parallel and can complete in a different order from run to run. If you need to preserve the order of rows, set DataContext.get_current().execution_options.preserve_order to True.
- Parameters:
seed – An optional integer seed for the file shuffler. If provided, Ray Data shuffles files deterministically based on this seed.
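To make the meaning of a deterministic, seed-based shuffle concrete, here is a minimal conceptual sketch using only the Python standard library. It is not Ray Data's implementation; the function name shuffle_file_paths and the file names are hypothetical and exist only for illustration.

```python
import random
from typing import List, Optional


def shuffle_file_paths(paths: List[str], seed: Optional[int] = None) -> List[str]:
    """Return a shuffled copy of `paths` (hypothetical helper, not a Ray API).

    With an integer seed, the same input always yields the same order.
    With seed=None, the generator is seeded from system entropy, so the
    order can differ between calls.
    """
    rng = random.Random(seed)
    shuffled = list(paths)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical file names used only for this illustration.
paths = [f"part-{i:03d}.parquet" for i in range(8)]

first = shuffle_file_paths(paths, seed=42)
second = shuffle_file_paths(paths, seed=42)

# Same seed, same input: identical file order on every call.
assert first == second
# Shuffling only reorders the files; none are dropped or duplicated.
assert sorted(first) == paths
```

The sketch only covers the order in which files are listed. As the note above explains, the order in which rows come out of a parallel read is a separate matter.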
Example
>>> import ray
>>> from ray.data import FileShuffleConfig
>>> shuffle = FileShuffleConfig(seed=42)
>>> ds = ray.data.read_images("s3://anonymous@ray-example-data/batoidea", shuffle=shuffle)
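If a reproducible row order matters, the seeded shuffle can be combined with the preserve_order execution option mentioned in the note. The following is a sketch under stated assumptions: the wrapper function read_images_seeded is hypothetical, and Ray is imported inside it so that the definition itself does not require a running Ray installation.

```python
def read_images_seeded(path: str, seed: int = 42):
    """Read images with a seeded file shuffle and order-preserving execution.

    Hypothetical convenience wrapper, not part of Ray Data.
    """
    # Imported lazily so this module can be loaded without Ray installed.
    import ray
    from ray.data import FileShuffleConfig

    # Ask Ray Data to preserve ordering during execution. This changes the
    # current DataContext, so it also affects datasets created afterwards.
    ctx = ray.data.DataContext.get_current()
    ctx.execution_options.preserve_order = True

    # Shuffle the input files deterministically based on the seed.
    return ray.data.read_images(path, shuffle=FileShuffleConfig(seed=seed))
```

Calling read_images_seeded("s3://anonymous@ray-example-data/batoidea") returns a dataset like ds in the example, with preserve_order enabled on the current context. Preserving order can reduce throughput, so it is usually worth enabling only when row order matters.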
DeveloperAPI: This API may change across minor Ray releases.