ray.data.checkpoint.interfaces.CheckpointBackend

class ray.data.checkpoint.interfaces.CheckpointBackend(value)

Bases: Enum

Supported backends for storing and reading checkpoint files.

Currently, only one type of backend is supported:

  • Batch-based backends: CLOUD_OBJECT_STORAGE and FILE_STORAGE.

Batch-based backends behave as follows:

  1. Writing checkpoints: Batch-based backends write a checkpoint file for each block.

  2. Loading checkpoints and filtering input data: Batch-based backends load all checkpoint data into memory prior to dataset execution. The checkpoint data is then passed to each read task to perform filtering.
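The load-then-filter flow described above can be sketched in plain Python. This is a conceptual illustration only, not Ray's implementation: the helper names `load_checkpointed_ids` and `filter_read_task`, the `id` column, and the in-memory lists standing in for checkpoint files are all assumptions made for the example.

```python
# Conceptual sketch only; these helpers are hypothetical and not part of Ray.

def load_checkpointed_ids(checkpoint_files):
    """Load every checkpoint file into one in-memory set before execution."""
    ids = set()
    for file_contents in checkpoint_files:  # one checkpoint file per block
        ids.update(file_contents)
    return ids


def filter_read_task(rows, checkpointed_ids, id_column="id"):
    """Drop rows whose ID was already checkpointed by an earlier run."""
    return [row for row in rows if row[id_column] not in checkpointed_ids]


# Two checkpoint files from a previous run, one per processed block.
checkpoint_files = [[1, 2], [3]]
checkpointed_ids = load_checkpointed_ids(checkpoint_files)

# Each read task receives the full set and filters its own input.
input_rows = [{"id": i} for i in range(1, 6)]
remaining = filter_read_task(input_rows, checkpointed_ids)
print(remaining)  # [{'id': 4}, {'id': 5}]
```

Because the full set of checkpointed IDs is held in memory and handed to every read task, memory use in this model grows with the amount of checkpoint data.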

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

CLOUD_OBJECT_STORAGE = 'CLOUD_OBJECT_STORAGE'

Batch-based checkpoint backend that uses cloud object storage, such as AWS S3 or Google Cloud Storage.

FILE_STORAGE = 'FILE_STORAGE'

Batch-based checkpoint backend that uses file system storage. Note that when using this backend, the checkpoint path must be on a network-mounted file system (e.g. /mnt/cluster_storage/).
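As a minimal sketch of how the members behave, the example below uses a stand-in Enum that mirrors the two documented members and their string values, so it runs without Ray installed. With Ray available, the real class would presumably be imported from ray.data.checkpoint.interfaces instead; the stand-in definition here is an assumption for illustration.

```python
from enum import Enum


class CheckpointBackend(Enum):
    """Stand-in mirroring the documented members (not Ray's own class)."""

    CLOUD_OBJECT_STORAGE = "CLOUD_OBJECT_STORAGE"
    FILE_STORAGE = "FILE_STORAGE"


# Look up a member by its string value, matching the (value) signature.
backend = CheckpointBackend("FILE_STORAGE")
assert backend is CheckpointBackend.FILE_STORAGE

# Each member's name and value are the same string.
assert backend.name == backend.value == "FILE_STORAGE"

# Iterating the Enum lists the backends in definition order.
backend_names = [member.name for member in CheckpointBackend]
print(backend_names)  # ['CLOUD_OBJECT_STORAGE', 'FILE_STORAGE']
```

Passing a string that matches no member, such as CheckpointBackend("LOCAL"), raises ValueError, which is standard Enum behavior.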