ray.data.datasource.PathPartitionEncoder

class ray.data.datasource.PathPartitionEncoder(partitioning: ray.data.datasource.partitioning.Partitioning)

Bases: object

Callable that generates directory path strings for path-based partition formats.

Path-based partition formats embed all partition keys and values directly in their dataset file paths.

Two path partition formats are currently supported: HIVE and DIRECTORY.

For HIVE Partitioning, all partition directories are generated using a “{key1}={value1}/{key2}={value2}” naming convention under the base directory. An ordered list of partition key field names must also be provided, and the partition values must match the field names in both order and length.
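The HIVE naming convention can be illustrated with a small standalone sketch. This is not Ray's implementation and does not call the Ray API; the helper name `hive_partition_dir` and the sample field names are hypothetical, chosen only to show how keys and values pair up and why their lengths must agree.

```python
import posixpath
from typing import List


def hive_partition_dir(base_dir: str, field_names: List[str], values: List[str]) -> str:
    """Hypothetical helper: build a "{key}={value}/..." directory path under base_dir."""
    # The convention requires one value per field name, in the same order.
    if len(field_names) != len(values):
        raise ValueError(
            f"expected {len(field_names)} partition values, got {len(values)}"
        )
    segments = [f"{key}={value}" for key, value in zip(field_names, values)]
    return posixpath.join(base_dir, *segments)


print(hive_partition_dir("data", ["year", "month"], ["2024", "01"]))
# data/year=2024/month=01
```

The sketch assumes that partition values contain no path separators.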

For DIRECTORY Partitioning, all directories will be generated from partition values using a “{value1}/{value2}” naming convention under the base directory.
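As a rough illustration of the DIRECTORY convention, the sketch below joins partition values directly under a base directory. It is a standalone example rather than Ray's implementation; the helper name `directory_partition_dir` and the sample values are hypothetical.

```python
import posixpath
from typing import List


def directory_partition_dir(base_dir: str, values: List[str]) -> str:
    """Hypothetical helper: build a "{value1}/{value2}/..." directory path under base_dir."""
    # Each partition value becomes one directory level, in the order given.
    return posixpath.join(base_dir, *values)


print(directory_partition_dir("data", ["2024", "01"]))
# data/2024/01
```

Because the path carries no key names in this format, a reader of such paths has to know the field order from elsewhere. The sketch assumes that values contain no path separators.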

DeveloperAPI: This API may change across minor Ray releases.

Methods

__init__(partitioning)

Creates a new partition path encoder.

of([style, base_dir, field_names, filesystem])

Creates a new partition path encoder.

Attributes

scheme

Returns the partitioning for this encoder.