ray.data.datasource.Partitioning

class ray.data.datasource.Partitioning(style: ray.data.datasource.partitioning.PartitionStyle, base_dir: Optional[str] = None, field_names: Optional[List[str]] = None, filesystem: Optional[pyarrow.fs.FileSystem] = None)

Bases: object

Partition scheme used to describe path-based partitions.

Path-based partition formats embed all partition keys and values directly in their dataset file paths.
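To make the idea of keys and values embedded in a path concrete, here is a minimal standard-library sketch. It is not Ray's implementation; the helper name `parse_hive_partitions` and the sample path are hypothetical, and it assumes the common Hive convention of `key=value` directory names.

```python
from typing import Dict


def parse_hive_partitions(path: str) -> Dict[str, str]:
    """Collect key=value directory segments from a "/"-delimited path.

    Illustrative only: a hypothetical helper, not part of Ray.
    """
    partitions: Dict[str, str] = {}
    # The last segment is the file name, so only directories are inspected.
    for segment in path.split("/")[:-1]:
        if "=" in segment:
            key, _, value = segment.partition("=")
            partitions[key] = value
    return partitions


# Hypothetical path: two partition keys followed by a data file.
example = parse_hive_partitions("data/year=2022/month=01/part-0.csv")
```

Values stay as strings in this sketch; any type conversion would be a separate step.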

DeveloperAPI: This API may change across minor Ray releases.

Methods

Attributes

base_dir

"/"-delimited base directory that all partitioned paths should exist under (exclusive).

field_names

The partition key field names.

filesystem

Filesystem that will be used for partition path file I/O.

normalized_base_dir

Returns the base directory normalized for compatibility with a filesystem.

resolved_filesystem

Returns the filesystem resolved for compatibility with a base directory.

style

The partition style: either HIVE or DIRECTORY.
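The DIRECTORY style differs from HIVE in that directory names carry only values, so the keys have to come from `field_names`, and `base_dir` marks where the partition directories begin. The sketch below illustrates that relationship with the standard library only. It is an assumption-laden illustration, not Ray's implementation: the function `parse_directory_partitions`, its error handling, and the sample paths are all hypothetical.

```python
from typing import Dict, List


def parse_directory_partitions(
    path: str, base_dir: str, field_names: List[str]
) -> Dict[str, str]:
    """Map directory levels under base_dir onto field_names, in order.

    Illustrative only: a hypothetical helper, not part of Ray.
    """
    base = base_dir.strip("/")
    relative = path.strip("/")
    if base:
        # The base directory itself is excluded from partition parsing.
        if not relative.startswith(base + "/"):
            raise ValueError(f"{path!r} is not under base_dir {base_dir!r}")
        relative = relative[len(base) + 1:]
    # Drop the trailing file name; what remains are partition directories.
    directories = relative.split("/")[:-1]
    if len(directories) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} partition directories, "
            f"found {len(directories)}"
        )
    return dict(zip(field_names, directories))


# Hypothetical layout: <base_dir>/<year>/<month>/<file>
result = parse_directory_partitions(
    "bucket/data/2022/01/part-0.csv",
    base_dir="bucket/data",
    field_names=["year", "month"],
)
```

Because the directory names carry no keys, the order of `field_names` has to match the directory nesting order; this sketch raises an error when the depth does not match rather than guessing.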