ray.data.datasource.ParquetMetadataProvider.prefetch_file_metadata

ParquetMetadataProvider.prefetch_file_metadata(fragments: List[pyarrow.dataset.ParquetFileFragment], **ray_remote_args) → List[Any] | None

Pre-fetches file metadata for all Parquet file fragments in a single batch.

Subsets of the returned metadata are passed as input to subsequent calls to _get_block_metadata(), together with their corresponding Parquet file fragments.

Implementations that don’t support pre-fetching file metadata shouldn’t override this method.

Parameters:

fragments – The Parquet file fragments to fetch metadata for.

Returns:

Metadata resolved for each input file fragment, or None if no metadata is pre-fetched. When metadata is returned, it must be in the same order as the input file fragments, so that metadata[i] always contains the metadata for fragments[i].
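The ordering requirement can be illustrated with a minimal sketch. This is not Ray's implementation: it is a standalone function with the same shape as the method, and it does not use Ray or pyarrow. FakeFragment is a hypothetical stand-in for pyarrow.dataset.ParquetFileFragment that assumes only a metadata attribute, and the thread pool is an illustrative substitute for whatever batching an actual provider would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, List, Optional


class FakeFragment:
    """Hypothetical stand-in for pyarrow.dataset.ParquetFileFragment."""

    def __init__(self, path: str, num_rows: int):
        self.path = path
        # A real fragment exposes Parquet file metadata; a dict is used here.
        self.metadata = {"path": path, "num_rows": num_rows}


def prefetch_file_metadata(
    fragments: List[Any], **ray_remote_args
) -> Optional[List[Any]]:
    """Fetch metadata for all fragments in one batch, preserving input order.

    ray_remote_args is accepted to mirror the documented signature but is
    ignored in this sketch, which runs locally in threads.
    """
    # Executor.map yields results in input order, so metadata[i] always
    # corresponds to fragments[i] even if fetches finish out of order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda fragment: fragment.metadata, fragments))


fragments = [FakeFragment(f"part-{i}.parquet", 10 * i) for i in range(5)]
metadata = prefetch_file_metadata(fragments)
```

Any parallel fetching strategy works as long as results are reassembled in input order before being returned, since callers pair metadata[i] with fragments[i] by position.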