ray.rllib.algorithms.algorithm.Algorithm
class ray.rllib.algorithms.algorithm.Algorithm(config: AlgorithmConfig | None = None, env=None, logger_creator: Callable[[], Logger] | None = None, **kwargs)
Bases: Checkpointable, Trainable, AlgorithmBase
An RLlib algorithm responsible for training one or more neural network models.

You can write your own Algorithm classes by sub-classing from Algorithm or any of its built-in subclasses. Override the training_step method to implement your own algorithm logic. Find the various built-in training_step() methods for different algorithms in their respective [algo name].py files, for example ray.rllib.algorithms.dqn.dqn.py or ray.rllib.algorithms.impala.impala.py.
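For instance, a subclass can reuse the parent's training_step() and layer custom per-iteration logic on top. The following is a minimal sketch, assuming PPO as the base class; the MyPPO name is hypothetical:

```python
# Minimal sketch of a custom Algorithm subclass (hypothetical `MyPPO`).
from ray.rllib.algorithms.ppo import PPO, PPOConfig


class MyPPO(PPO):
    def training_step(self):
        # Run PPO's built-in sampling/learning logic first, then hook in
        # any custom per-iteration behavior (extra metrics, schedules, ...).
        results = super().training_step()
        return results


# Algorithm accepts a config object in its constructor.
algo = MyPPO(config=PPOConfig().environment("CartPole-v1"))
```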
The most important API methods an Algorithm exposes are train(), evaluate(), save_to_path(), and restore_from_path().
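A typical lifecycle built from these methods might look like the following sketch; it assumes a PPO config, and the evaluation settings are only there so that evaluate() has EnvRunners to run on:

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(evaluation_num_env_runners=1, evaluation_interval=1)
)
algo = config.build()

for _ in range(3):
    results = algo.train()          # one logical training iteration

eval_results = algo.evaluate()      # run under the evaluation config
ckpt_path = algo.save_to_path()     # write a checkpoint directory
algo.restore_from_path(ckpt_path)   # reload the saved state
algo.stop()
```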
Methods
- __init__(): Initializes an Algorithm instance.
- add_module(): Adds a new (single-agent) RLModule to this Algorithm's MARLModule.
- add_policy(): Adds a new policy to this Algorithm.
- compute_actions(): Computes an action for the specified policy on the local Worker.
- compute_single_action(): Computes an action for the specified policy on the local worker.
- evaluate(): Evaluates the current policy under the evaluation_config settings.
- export_model(): Exports the model based on export_formats.
- export_policy_checkpoint(): Exports a Policy checkpoint to a local directory and returns an AIR Checkpoint.
- export_policy_model(): Exports the policy model with the given policy_id to a local directory.
- from_checkpoint(): Creates a new Algorithm instance from a given checkpoint.
- from_state(): Recovers an Algorithm from a state object.
- get_config(): Returns the configuration passed in by Tune.
- get_default_policy_class(): Returns a default Policy class to use, given a config.
- get_metadata(): Returns JSON-writable metadata further describing the implementing class.
- get_module(): Returns the (single-agent) RLModule with model_id (None if ID not found).
- get_policy(): Returns the policy for the specified ID, or None.
- get_weights(): Returns a dict mapping Module/Policy IDs to weights.
- merge_algorithm_configs(): Merges a complete Algorithm config dict with a partial override dict.
- remove_module(): Removes a (single-agent) RLModule from this Algorithm's MARLModule.
- remove_policy(): Removes a policy from this Algorithm.
- reset(): Resets the trial for use with a new config.
- reset_config(): Resets the configuration without restarting the trial.
- restore(): Restores training state from a given model checkpoint.
- restore_workers(): Tries to bring back unhealthy EnvRunners and, if successful, syncs them with the local one.
- save(): Saves the current model state to a checkpoint.
- save_checkpoint(): Exports a checkpoint to a local directory.
- save_to_path(): Saves the state of the implementing class (or state) to path.
- set_weights(): Sets RLModule/Policy weights by Module/Policy ID.
- step(): Implements the main Algorithm.train() logic.
- stop(): Releases all resources used by this Trainable.
- train(): Runs one logical iteration of training.
- train_buffered(): Runs multiple iterations of training.
- training_step(): Default single-iteration logic of an algorithm.
- validate_env(): Env validator function for this Algorithm class.
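As an example of the checkpointing methods above, a saved Algorithm can be rebuilt in a fresh process via the from_checkpoint() classmethod; the checkpoint path below is illustrative:

```python
from ray.rllib.algorithms.algorithm import Algorithm

# Recreate an Algorithm (config, modules, and weights) from a directory
# previously produced by save_to_path(); the path is illustrative.
restored_algo = Algorithm.from_checkpoint("/tmp/my_algo_checkpoint")
results = restored_algo.train()  # continue training where it left off
```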
Attributes
- config: The AlgorithmConfig instance of the Algorithm.
- env_runner: The local EnvRunner instance within the algo's EnvRunnerGroup.
- env_runner_group: The EnvRunnerGroup of the Algorithm.
- eval_env_runner: The local EnvRunner instance within the algo's evaluation EnvRunnerGroup.
- eval_env_runner_group: A special EnvRunnerGroup used only for evaluation, not for collecting training samples.
- iteration: Current training iteration.
- learner_group: The LearnerGroup instance of the Algorithm, managing either one local Learner or one or more remote Learner actors.
- logdir: Directory of the results and checkpoints for this Trainable.
- metrics: The MetricsLogger instance of the Algorithm.
- offline_data: An optional OfflineData instance, used for offline RL.
- training_iteration: Current training iteration (same as self.iteration).
- trial_id: Trial ID for the corresponding trial of this Trainable.
- trial_name: Trial name for the corresponding trial of this Trainable.
- trial_resources: Resources currently assigned to the trial of this Trainable.
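These attributes are read directly off a built Algorithm instance; a short sketch, assuming algo was built as in the examples above:

```python
# Inspect a built Algorithm through its public attributes.
print(algo.iteration)            # current training iteration
print(algo.logdir)               # where results and checkpoints land
print(algo.config.lr)            # a field on the AlgorithmConfig
local_runner = algo.env_runner   # local EnvRunner of the EnvRunnerGroup
```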