ray.data.datasource.PathPartitionEncoder

class ray.data.datasource.PathPartitionEncoder(partitioning: ray.data.datasource.partitioning.Partitioning)

Callable that generates directory path strings for path-based partition formats.

Path-based partition formats embed all partition keys and values directly in their dataset file paths.

Two path partition formats are currently supported: HIVE and DIRECTORY.

For HIVE Partitioning, all partition directories are generated using a “{key1}={value1}/{key2}={value2}” naming convention under the base directory. An ordered list of partition key field names must also be provided, and the partition values must match those field names in both order and length.

For DIRECTORY Partitioning, all directories will be generated from partition values using a “{value1}/{value2}” naming convention under the base directory.
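
The two naming conventions above can be illustrated with a small standalone sketch. It does not call Ray: `hive_path` and `directory_path` are hypothetical helpers written only to show the resulting layout, and joining segments with `posixpath.join` is an assumption about how the base directory is combined with the partition directories.

```python
import posixpath
from typing import List


def hive_path(base_dir: str, field_names: List[str], values: List[str]) -> str:
    """Hypothetical helper: build a "{key}={value}" path under base_dir."""
    # The values must line up one-to-one with the field names.
    if len(field_names) != len(values):
        raise ValueError("field names and partition values must have equal length")
    parts = [f"{key}={value}" for key, value in zip(field_names, values)]
    return posixpath.join(base_dir, *parts)


def directory_path(base_dir: str, values: List[str]) -> str:
    """Hypothetical helper: build a "{value1}/{value2}" path under base_dir."""
    return posixpath.join(base_dir, *values)


hive_example = hive_path("data", ["year", "month"], ["2022", "01"])
directory_example = directory_path("data", ["2022", "01"])
print(hive_example)       # data/year=2022/month=01
print(directory_example)  # data/2022/01
```

The same partition values yield different directory names in each format: HIVE paths are self-describing because every directory carries its key, while DIRECTORY paths rely on position alone.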

DeveloperAPI: This API may change across minor Ray releases.

__init__(partitioning: ray.data.datasource.partitioning.Partitioning)

Creates a new partition path encoder.

Parameters

partitioning – The path-based partition scheme. All partition paths will be generated under this scheme’s base directory. Field names are required for HIVE partition paths, optional for DIRECTORY partition paths. When non-empty, the order and length of partition key field names must match the order and length of partition values.
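
The field-name rule for this parameter can be sketched as a plain check. This is an illustration only, not Ray's own validation code: `check_field_names` and the `"hive"` / `"dir"` style strings are hypothetical names chosen for this example. Matching order cannot be verified mechanically, so the sketch checks only presence and length.

```python
from typing import List, Optional


def check_field_names(
    style: str, field_names: Optional[List[str]], values: List[str]
) -> None:
    """Hypothetical check mirroring the rule stated above; not Ray's own code."""
    # HIVE paths need a key for every value, so field names are mandatory.
    if style == "hive" and not field_names:
        raise ValueError("HIVE partition paths require field names")
    # When field names are given (either style), the lengths must agree.
    if field_names and len(field_names) != len(values):
        raise ValueError(
            f"expected {len(field_names)} partition values, got {len(values)}"
        )


check_field_names("hive", ["year", "month"], ["2022", "01"])  # passes
check_field_names("dir", None, ["2022", "01"])  # passes: names are optional

try:
    check_field_names("hive", ["year", "month"], ["2022"])
except ValueError as err:
    mismatch_message = str(err)
```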

Methods

__init__(partitioning)

Creates a new partition path encoder.

of([style, base_dir, field_names, filesystem])

Creates a new partition path encoder from an optional style, base directory, field names, and filesystem.

Attributes

scheme

Returns the partitioning for this encoder.