DatasetContext API

class ray.data.context.DatasetContext(block_owner: ray.actor.ActorHandle, block_splitting_enabled: bool, target_max_block_size: int, target_min_block_size: int, streaming_read_buffer_size: int, enable_pandas_block: bool, optimize_fuse_stages: bool, optimize_fuse_read_stages: bool, optimize_fuse_shuffle_stages: bool, optimize_reorder_stages: bool, actor_prefetcher_enabled: bool, use_push_based_shuffle: bool, pipeline_push_based_shuffle_reduce_tasks: bool, scheduling_strategy: Union[None, str, ray.util.scheduling_strategies.PlacementGroupSchedulingStrategy, ray.util.scheduling_strategies.NodeAffinitySchedulingStrategy], use_polars: bool, decoding_size_estimation: bool, min_parallelism: bool, enable_tensor_extension_casting: bool)

Singleton for shared Dataset resources and configurations.

This object is automatically propagated to workers and can be retrieved from the driver and remote workers via DatasetContext.get_current().

DeveloperAPI: This API may change across minor Ray releases.

static get_current() → ray.data.context.DatasetContext

Get or create a singleton context.

If the context has not yet been created in this process, it will be initialized with default settings.
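The get-or-create behavior can be pictured with a small standard-library sketch. This is not Ray's implementation: SketchContext, its two fields, and their default values are hypothetical placeholders, and a classmethod is used here for brevity where the real method is static.

```python
import threading


class SketchContext:
    """Hypothetical stand-in for a per-process settings singleton."""

    _lock = threading.Lock()   # guards first-time creation
    _current = None            # the one shared instance, once created

    def __init__(self, target_max_block_size=512 * 1024 * 1024,
                 use_polars=False):
        # Placeholder defaults chosen for illustration only.
        self.target_max_block_size = target_max_block_size
        self.use_polars = use_polars

    @classmethod
    def get_current(cls):
        # Create the instance with default settings on first access,
        # then keep returning that same instance.
        with cls._lock:
            if cls._current is None:
                cls._current = cls()
            return cls._current


first = SketchContext.get_current()
first.use_polars = True
second = SketchContext.get_current()
```

Here second is the same object as first, so the change to use_polars is visible through both names. The lock only matters when several threads might trigger the first creation at once.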