ray.air.checkpoint.Checkpoint

class ray.air.checkpoint.Checkpoint(local_path: Optional[Union[str, os.PathLike]] = None, data_dict: Optional[dict] = None, uri: Optional[str] = None)

Bases: object

Ray AIR Checkpoint.

An AIR Checkpoint is a common interface for accessing models across different AIR components and libraries. A Checkpoint can have its data represented in one of three ways:

  • as a directory on local (on-disk) storage

  • as a directory on an external storage (e.g., cloud storage)

  • as an in-memory dictionary

The Checkpoint object also has methods to translate between different checkpoint storage locations. These storage representations provide flexibility in distributed environments, where you may want to recreate an instance of the same model on multiple nodes or across different Ray clusters.

Example:

from ray.air.checkpoint import Checkpoint

# Create checkpoint data dict
checkpoint_data = {"data": 123}

# Create checkpoint object from data
checkpoint = Checkpoint.from_dict(checkpoint_data)

# Save checkpoint to a directory on the file system.
path = checkpoint.to_directory()

# This path can then be passed around,
# e.g. to a different function or a different script.
# You can also use `checkpoint.to_uri/from_uri` to
# read from/write to cloud storage

# In another function or script, recover Checkpoint object from path
checkpoint = Checkpoint.from_directory(path)

# Convert into dictionary again
recovered_data = checkpoint.to_dict()

# It is guaranteed that the original data has been recovered
assert recovered_data == checkpoint_data

Checkpoints can be used to instantiate a Predictor, BatchPredictor, or PredictorDeployment class.

The constructor is a private API. Use the from_ methods to create checkpoint objects instead (e.g. Checkpoint.from_directory()).

Other implementation notes:

When converting between different checkpoint formats, it is guaranteed that a full round trip of conversions (e.g. directory --> dict --> directory) will recover the original checkpoint data. There are no guarantees made about compatibility of intermediate representations.

New data can be added to a Checkpoint during conversion. Consider the following conversion: directory --> dict (adding dict["foo"] = "bar") --> directory --> dict (expect to see dict["foo"] = "bar"). Note that the second directory will contain pickle files with the serialized additional field data in them.

Similarly with a dict as a source: dict --> directory (add file "foo.txt") --> dict --> directory (will have "foo.txt" in it again). Note that the second dict representation will contain an extra field with the serialized additional files in it.

Checkpoints can be pickled and sent to remote processes. Note that checkpoints pointing to local directories are pickled as data representations, so the pickled object contains the full checkpoint data. To avoid this, pass only the checkpoint directory to the remote task and reconstruct the checkpoint object in that function. This only works if the "remote" task is scheduled on the same node or on a node that also has access to the local data path (e.g. on a shared file system like NFS).

If you need persistence across clusters, use the to_uri() or to_directory() methods to persist your checkpoints to disk.

PublicAPI (beta): This API is in beta and may change before becoming stable.

Methods

__init__([local_path, data_dict, uri])

DeveloperAPI: This API may change across minor Ray releases.

as_directory()

Return checkpoint directory path in a context.

from_bytes(data)

Create a checkpoint from the given byte string.

from_checkpoint(other)

Create a checkpoint from a generic Checkpoint.

from_dict(data)

Create checkpoint object from dictionary.

from_directory(path)

Create checkpoint object from directory.

from_uri(uri)

Create checkpoint object from location URI (e.g. cloud storage).

get_internal_representation()

Return tuple of (type, data) for the internal representation.

get_preprocessor()

Return the saved preprocessor, if one exists.

set_preprocessor(preprocessor)

Saves the provided preprocessor to this Checkpoint.

to_bytes()

Return Checkpoint serialized as bytes object.

to_dict()

Return checkpoint data as dictionary.

to_directory([path])

Write checkpoint data to directory.

to_uri(uri)

Write checkpoint data to location URI (e.g. cloud storage).

Attributes

uri

Return checkpoint URI, if available.