ray.data.DatasetPipeline.rewindow

DatasetPipeline.rewindow(*, blocks_per_window: int, preserve_epoch: bool = True) → ray.data.dataset_pipeline.DatasetPipeline[ray.data.block.T]

Change the windowing (blocks per dataset) of this pipeline.

Changes the windowing of this pipeline to the specified size. For example, if the current pipeline has two blocks per dataset and rewindow(blocks_per_window=4) is requested, adjacent datasets are merged until each dataset has 4 blocks. If a smaller value such as rewindow(blocks_per_window=1) is requested, the datasets are split into smaller windows.

Parameters
  • blocks_per_window – The new target blocks per window.

  • preserve_epoch – Whether to preserve epoch boundaries. If set to False, then windows can contain data from two adjacent epochs.
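
Example

The merge, split, and epoch-boundary behavior described above can be sketched with a small pure-Python model. This is not Ray's implementation and does not call Ray at all: rewindow_model, the (epoch, blocks) tuples, and the string block labels are hypothetical stand-ins. It also assumes that a partially filled window is emitted when an epoch ends, which is one plausible reading of "preserve epoch boundaries".

```python
from typing import Iterable, List, Optional, Tuple

Block = str  # hypothetical stand-in for a real data block


def rewindow_model(
    windows: Iterable[Tuple[int, List[Block]]],
    blocks_per_window: int,
    preserve_epoch: bool = True,
) -> List[List[Block]]:
    """Regroup (epoch, blocks) windows into windows of the target size.

    Simplified model only: real pipelines are lazy and hold datasets,
    not lists of strings.
    """
    if blocks_per_window < 1:
        raise ValueError("blocks_per_window must be at least 1")

    out: List[List[Block]] = []
    buffer: List[Block] = []
    buffer_epoch: Optional[int] = None

    for epoch, blocks in windows:
        # Assumption: at an epoch boundary, emit the partial window
        # instead of mixing it with blocks from the next epoch.
        if preserve_epoch and buffer and epoch != buffer_epoch:
            out.append(buffer)
            buffer = []
        buffer_epoch = epoch

        for block in blocks:
            buffer.append(block)
            if len(buffer) == blocks_per_window:
                out.append(buffer)
                buffer = []

    if buffer:  # leftover blocks form a final, smaller window
        out.append(buffer)
    return out


# Two epochs, each made of three windows with two blocks per window.
pipeline = [
    (e, [f"e{e}b{2 * w}", f"e{e}b{2 * w + 1}"])
    for e in range(2)
    for w in range(3)
]

merged = rewindow_model(pipeline, blocks_per_window=4)
split = rewindow_model(pipeline, blocks_per_window=1)
mixed = rewindow_model(pipeline, blocks_per_window=4, preserve_epoch=False)

print([len(w) for w in merged])  # window sizes with epochs preserved
print(len(split))                # number of single-block windows
print(mixed[1])                  # a window that spans two epochs
```

In this model, preserve_epoch=True trades uniform window sizes for clean epoch boundaries, while preserve_epoch=False keeps every window full (except possibly the last) at the cost of windows that span two adjacent epochs.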