ray.data.DataContext

class ray.data.DataContext(
    target_max_block_size: int,
    target_min_block_size: int,
    streaming_read_buffer_size: int,
    enable_pandas_block: bool,
    optimize_fuse_stages: bool,
    optimize_fuse_read_stages: bool,
    optimize_fuse_shuffle_stages: bool,
    optimize_reorder_stages: bool,
    actor_prefetcher_enabled: bool,
    use_push_based_shuffle: bool,
    pipeline_push_based_shuffle_reduce_tasks: bool,
    scheduling_strategy: Union[None, str, ray.util.scheduling_strategies.PlacementGroupSchedulingStrategy, ray.util.scheduling_strategies.NodeAffinitySchedulingStrategy, ray.util.scheduling_strategies.NodeLabelSchedulingStrategy],
    scheduling_strategy_large_args: Union[None, str, ray.util.scheduling_strategies.PlacementGroupSchedulingStrategy, ray.util.scheduling_strategies.NodeAffinitySchedulingStrategy, ray.util.scheduling_strategies.NodeLabelSchedulingStrategy],
    large_args_threshold: int,
    use_polars: bool,
    new_execution_backend: bool,
    use_streaming_executor: bool,
    eager_free: bool,
    decoding_size_estimation: bool,
    min_parallelism: bool,
    enable_tensor_extension_casting: bool,
    enable_auto_log_stats: bool,
    trace_allocations: bool,
    optimizer_enabled: bool,
    execution_options: ExecutionOptions,
    use_ray_tqdm: bool,
    use_legacy_iter_batches: bool,
    enable_progress_bars: bool,
    file_metadata_shuffler: str,
)

Bases: object

Singleton for shared Dataset resources and configurations.

This object is automatically propagated to workers and can be retrieved from the driver and remote workers via DataContext.get_current().

DeveloperAPI: This API may change across minor Ray releases.

Methods

__init__(target_max_block_size, ...): Private constructor (use get_current() instead).

get_current(): Get or create a singleton context.
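
The get-or-create behaviour of get_current() can be illustrated with a small stand-in class. This is only a sketch of the general singleton pattern, not Ray's implementation: the class name SketchContext, the module-level _current and _lock variables, and the default values are all hypothetical, and the sketch does not model propagation to remote workers.

```python
import threading
from dataclasses import dataclass
from typing import Optional

# Hypothetical module-level state; the names are illustrative only.
_current: Optional["SketchContext"] = None
_lock = threading.Lock()


@dataclass
class SketchContext:
    """Stand-in for a process-wide configuration singleton (not Ray's class)."""

    # Illustrative defaults, not Ray's actual values.
    target_max_block_size: int = 128 * 1024 * 1024
    use_polars: bool = False

    @staticmethod
    def get_current() -> "SketchContext":
        """Return the shared instance, creating it on first use."""
        global _current
        with _lock:  # guard against two threads creating separate instances
            if _current is None:
                _current = SketchContext()
            return _current


# Callers fetch the shared instance instead of calling the constructor,
# so a setting changed in one place is visible to every later caller.
ctx = SketchContext.get_current()
ctx.use_polars = True
same_ctx = SketchContext.get_current()
```

In this sketch, ctx and same_ctx refer to the same object, which is why the constructor is treated as private: constructing a second instance directly would bypass the shared state.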