ray.rllib.algorithms.algorithm.Algorithm

class ray.rllib.algorithms.algorithm.Algorithm(config: AlgorithmConfig | None = None, env=None, logger_creator: Callable[[], Logger] | None = None, **kwargs)
Bases: Checkpointable, Trainable

An RLlib algorithm responsible for training one or more neural network models.

You can write your own Algorithm classes by subclassing from Algorithm or any of its built-in subclasses. Override the training_step() method to implement your own algorithm logic (a minimal subclassing sketch follows the Attributes list below). Find the various built-in training_step() methods for the different algorithms in their respective [algo name].py files, for example ray.rllib.algorithms.dqn.dqn.py or ray.rllib.algorithms.impala.impala.py.

The most important API methods an Algorithm exposes are train() for running a single training iteration, evaluate() for running a single round of evaluation, save_to_path() for creating a checkpoint, and restore_from_path() for loading a state from an existing checkpoint (a usage sketch also follows below).

Methods

- __init__(): Initializes an Algorithm instance.
- add_module(): Adds a new (single-agent) RLModule to this Algorithm's MARLModule.
- add_policy(): Adds a new policy to this Algorithm.
- evaluate(): Evaluates the current policy under the evaluation_config settings.
- evaluate_offline(): Evaluates the current policy offline under the evaluation_config settings.
- export_model(): Exports the model based on export_formats.
- export_policy_checkpoint(): Exports a Policy checkpoint to a local directory and returns an AIR Checkpoint.
- export_policy_model(): Exports the policy model with the given policy_id to a local directory.
- from_checkpoint(): Creates a new Algorithm instance from a given checkpoint.
- from_state(): Recovers an Algorithm from a state object.
- get_config(): Returns the configuration passed in by Tune.
- get_default_policy_class(): Returns a default Policy class to use, given a config.
- get_metadata(): Returns JSON-writable metadata further describing the implementing class.
- get_module(): Returns the (single-agent) RLModule with model_id (None if the ID isn't found).
- get_policy(): Returns the policy for the specified ID, or None.
- get_weights(): Returns a dict mapping Module/Policy IDs to weights.
- merge_algorithm_configs(): Merges a complete Algorithm config dict with a partial override dict.
- remove_module(): Removes a (single-agent) RLModule from this Algorithm's MARLModule.
- remove_policy(): Removes a policy from this Algorithm.
- reset(): Resets the trial for use with a new config.
- reset_config(): Resets the configuration without restarting the trial.
- restore(): Restores training state from a given model checkpoint.
- restore_env_runners(): Tries to bring back unhealthy EnvRunners and, if successful, syncs them with the local one.
- save(): Saves the current model state to a checkpoint.
- save_checkpoint(): Exports a checkpoint to a local directory.
- save_to_path(): Saves the state of the implementing class (or state) to path.
- set_weights(): Sets RLModule/Policy weights by Module/Policy ID.
- step(): Implements the main Algorithm.train() logic.
- stop(): Releases all resources used by this Trainable.
- train(): Runs one logical iteration of training.
- train_buffered(): Runs multiple iterations of training.
- training_step(): Default single-iteration logic of an algorithm.
- validate_env(): Env validator function for this Algorithm class.

Attributes

- config: The AlgorithmConfig instance of the Algorithm.
- env_runner: The local EnvRunner instance within the algo's EnvRunnerGroup.
- env_runner_group: The EnvRunnerGroup of the Algorithm.
- eval_env_runner: The local EnvRunner instance within the algo's evaluation EnvRunnerGroup.
- eval_env_runner_group: A special EnvRunnerGroup only used for evaluation, not to collect training samples.
- iteration: Current training iteration.
- learner_group: The LearnerGroup instance of the Algorithm, managing either one local Learner or one or more remote Learner actors.
- logdir: Directory of the results and checkpoints for this Trainable.
- metrics: The MetricsLogger instance of the Algorithm.
- offline_data: An optional OfflineData instance, used for offline RL.
- training_iteration: Current training iteration (same as self.iteration).
- trial_id: Trial ID for the corresponding trial of this Trainable.
- trial_name: Trial name for the corresponding trial of this Trainable.
- trial_resources: Resources currently assigned to the trial of this Trainable.
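Examples

A minimal sketch of the subclassing pattern described above. The class name MyAlgorithm is hypothetical, and the method body is placeholder logic rather than a working algorithm:

    from ray.rllib.algorithms.algorithm import Algorithm


    class MyAlgorithm(Algorithm):
        """Hypothetical custom algorithm; the body below is placeholder logic."""

        def training_step(self):
            # A real implementation would typically:
            # 1) sample episodes from the EnvRunners in `self.env_runner_group`,
            # 2) update the model(s) through `self.learner_group`, and
            # 3) log per-iteration results, e.g. through `self.metrics`.
            ...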
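And a hedged usage sketch of the four core API methods named above, shown with the built-in PPO algorithm. The environment name, checkpoint path, and evaluation settings are arbitrary example choices, not requirements:

    from ray.rllib.algorithms.algorithm import Algorithm
    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment("CartPole-v1")
        # Set up evaluation so that `evaluate()` has EnvRunners to use.
        .evaluation(evaluation_interval=1)
    )
    algo = config.build()

    results = algo.train()          # run a single training iteration
    eval_results = algo.evaluate()  # run a single round of evaluation

    checkpoint_dir = algo.save_to_path("/tmp/my_algo_checkpoint")
    algo.stop()

    # Re-create the Algorithm from the checkpoint. Alternatively,
    # `restore_from_path()` loads the checkpointed state into an
    # existing, compatible Algorithm instance.
    restored_algo = Algorithm.from_checkpoint(checkpoint_dir)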