ray.data.datasource.FilenameProvider.get_filename_for_block

FilenameProvider.get_filename_for_block(block: pyarrow.Table | pandas.DataFrame, task_index: int, block_index: int) → str

Generate a filename for a block of data.

Note

Filenames must be unique and deterministic for a given task and block index.

A block consists of multiple rows and corresponds to a single output file. Each task might produce a different number of blocks.

Parameters:
  • block – The block that will be written to a file.

  • task_index – The index of the write task.

  • block_index – The index of the block within the write task.
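
Example

The following is a minimal sketch of a subclass that meets the uniqueness and determinism requirement by deriving the filename only from task_index and block_index. The class name PrefixedFilenameProvider and its prefix and file_format arguments are hypothetical and chosen for illustration. The fallback base class exists only so the snippet runs where Ray is not installed.

```python
try:
    from ray.data.datasource import FilenameProvider
except ImportError:
    # Stand-in base class (assumption) so the sketch runs without Ray.
    class FilenameProvider:
        pass


class PrefixedFilenameProvider(FilenameProvider):
    """Hypothetical provider whose filenames depend only on the two indices."""

    def __init__(self, prefix: str, file_format: str):
        self.prefix = prefix
        self.file_format = file_format

    def get_filename_for_block(self, block, task_index: int, block_index: int) -> str:
        # The block contents are ignored on purpose: the same
        # (task_index, block_index) pair always yields the same name,
        # and different pairs yield different names.
        return f"{self.prefix}_{task_index:06}_{block_index:06}.{self.file_format}"


provider = PrefixedFilenameProvider("events", "parquet")
name = provider.get_filename_for_block(None, task_index=3, block_index=0)
print(name)  # events_000003_000000.parquet
```

Zero-padding the indices keeps the files in order when sorted lexicographically. A provider like this is typically supplied through the filename_provider argument of a write method such as Dataset.write_parquet. Check the documentation for your Ray version before relying on that argument.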