ray.data.RandomSeedConfig

class ray.data.RandomSeedConfig(seed: int | None = None, reseed_after_execution: bool = True, use_timestamp_as_default: bool = False)

This configuration object controls random seed behavior for operations such as random_shuffle(), randomize_block_order(), and random_sample(). The behavior is determined by the combination of the base seed (seed) and the reseed_after_execution parameter:

  • If seed is None, the random seed is always None (non-deterministic shuffling).

  • If seed is not None and reseed_after_execution is False, the base seed is used as the random seed for each execution.

  • If seed is not None and reseed_after_execution is True, the base seed is combined with the (incremental) execution index execution_idx to produce a different random seed tuple for each execution.
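
The three cases above can be sketched in plain Python. This is a minimal illustration, not Ray's implementation: sketch_seed_tuple and sketch_shuffle are hypothetical helpers, and the (seed, execution_idx) tuple layout is an assumption made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sketch_seed_tuple(
    seed: Optional[int],
    reseed_after_execution: bool,
    execution_idx: int,
) -> Optional[Tuple[int, ...]]:
    """Hypothetical stand-in for the seed selection rules; not Ray's code."""
    if seed is None:
        # No base seed: nothing to derive, so shuffling is non-deterministic.
        return None
    if not reseed_after_execution:
        # Same seed for every execution, so the same shuffle order every time.
        return (seed,)
    # Assumed layout: base seed paired with the execution index.
    return (seed, execution_idx)


def sketch_shuffle(
    rows: Sequence[int], seed_tuple: Optional[Tuple[int, ...]]
) -> List[int]:
    """Shuffle a copy of rows with an RNG seeded from the tuple (illustrative)."""
    # random.Random accepts str seeds; repr() turns the tuple into a stable string.
    rng = random.Random(None if seed_tuple is None else repr(seed_tuple))
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled


rows = list(range(10))

# reseed_after_execution=False: identical seed tuple on every execution.
fixed_a = sketch_seed_tuple(42, False, execution_idx=0)
fixed_b = sketch_seed_tuple(42, False, execution_idx=1)

# reseed_after_execution=True: the execution index changes the seed tuple.
varied_a = sketch_seed_tuple(42, True, execution_idx=0)
varied_b = sketch_seed_tuple(42, True, execution_idx=1)
```

In this sketch, fixed_a and fixed_b are equal, so shuffling with either gives the same order, while varied_a and varied_b differ and would generally give different orders.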

Note

Even if you provide a seed, you might still observe a non-deterministic row order, because tasks run in parallel and their completion order can vary. If you need to preserve the order of rows, set DataContext.get_current().execution_options.preserve_order to True.
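
As a small sketch of the setting described in this note, the helper below turns on order preservation for the current context. The function name enable_preserve_order is hypothetical and exists only for this example. The import sits inside the function so the snippet can be defined without Ray installed.

```python
def enable_preserve_order() -> None:
    """Turn on order preservation in the current Ray Data context."""
    # Imported lazily so this snippet can be defined without Ray installed.
    from ray.data import DataContext

    ctx = DataContext.get_current()
    # Keep rows in a stable order regardless of task completion order.
    ctx.execution_options.preserve_order = True
```

Call enable_preserve_order() before executing the dataset whose row order should stay stable.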

Parameters:
  • seed – An optional integer base seed. If None, the operation is non-deterministic. If provided, the operation is deterministic based on the base seed and the reseed_after_execution parameter.

  • reseed_after_execution – If True, the random seed considers both seed and execution_idx, resulting in different shuffling orders across executions. If False, the base seed is used as the random seed for each execution, resulting in the same shuffling order across executions. Only takes effect when a base seed is provided. Defaults to True.

  • use_timestamp_as_default – If True, enables a legacy behavior that uses the timestamp as the default seed. Only takes effect when the base seed is None. Defaults to False. See get_single_integer_random_seed for more details.
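
The parameters above might be combined as follows. This is a sketch that assumes a Ray version exposing ray.data.RandomSeedConfig with the constructor signature shown on this page. The helper name build_seed_configs is hypothetical, and how a config is passed to a particular operation is not shown here.

```python
def build_seed_configs():
    """Construct three example configs; requires Ray when called."""
    # Imported lazily so this snippet can be defined without Ray installed.
    from ray.data import RandomSeedConfig

    # Base seed with the default reseed_after_execution=True:
    # a different random seed for each execution.
    per_execution = RandomSeedConfig(seed=42)

    # Base seed reused for each execution: same shuffling order every time.
    fixed = RandomSeedConfig(seed=42, reseed_after_execution=False)

    # No base seed: non-deterministic behavior.
    unseeded = RandomSeedConfig()

    return per_execution, fixed, unseeded
```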

DeveloperAPI: This API may change across minor Ray releases.

Methods

create_seed_config

Create a RandomSeedConfig object from the seed argument in Ray Data public random APIs.

get_seed_tuple

Return a seed for random number generation.

Attributes

reseed_after_execution

seed

use_timestamp_as_default