ray.rllib.core.rl_module.multi_rl_module.MultiRLModule
- class ray.rllib.core.rl_module.multi_rl_module.MultiRLModule(config=-1, *, observation_space: gymnasium.Space | None = None, action_space: gymnasium.Space | None = None, inference_only: bool | None = None, learner_only: bool | None = None, model_config: dict | None = None, rl_module_specs: Dict[str, RLModuleSpec] | None = None, **kwargs)
Bases: RLModule
Base class for an RLModule that contains n sub-RLModules.
This class holds a mapping from ModuleID to an underlying RLModule. It provides a convenient way of accessing each individual module, as well as accessing all of them with a single API call. Whether a given module is trainable is determined by the caller of this class, not by the instance itself.
The extension of this class can include any arbitrary neural networks as part of the MultiRLModule. For example, a MultiRLModule can include a shared encoder network that is used by all the individual (single-agent) RLModules. It is up to the user to decide how to implement this class.
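To make the shared-encoder idea concrete, here is a minimal, plain-Python sketch of a multi-module container whose sub-modules all run one shared encoder before their own head. This is a conceptual illustration only; the class and function names (`SharedEncoderMultiModule`, `PolicyHead`, `shared_encoder`) are hypothetical and are not part of Ray's API.

```python
def shared_encoder(obs):
    # Stand-in "encoder" shared by all sub-modules: scale each feature.
    return [2 * x for x in obs]


class PolicyHead:
    """Hypothetical per-module head: sums the embedding and adds a bias."""

    def __init__(self, bias):
        self.bias = bias

    def forward(self, embedding):
        return sum(embedding) + self.bias


class SharedEncoderMultiModule:
    """Hypothetical multi-module container with one encoder shared by all heads."""

    def __init__(self, heads):
        self._heads = heads  # ModuleID -> PolicyHead

    def forward(self, batch):
        # Encode each sub-module's observation with the shared encoder,
        # then apply that module's own head.
        return {
            module_id: self._heads[module_id].forward(shared_encoder(obs))
            for module_id, obs in batch.items()
        }


multi = SharedEncoderMultiModule(
    {"agent_0": PolicyHead(0.0), "agent_1": PolicyHead(1.0)}
)
out = multi.forward({"agent_0": [1.0, 2.0], "agent_1": [1.0, 2.0]})
# agent_0: sum([2.0, 4.0]) + 0.0 = 6.0; agent_1: 6.0 + 1.0 = 7.0
```

In a real subclass, the shared encoder would be a neural network whose parameters are referenced by every sub-module, which is exactly the kind of cross-module coupling that requires overriding the default implementation.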
The default implementation assumes the data communicated as input and output of the APIs in this class is of type Dict[ModuleID, Dict[str, Any]]. The MultiRLModule by default loops through each module_id and runs the forward pass of the corresponding RLModule object with the associated batch within the input. It also assumes that the underlying RLModules do not share any parameters or communicate with one another; the behavior of modules with such advanced communication is undefined by default. To share parameters or communication between the underlying RLModules, you should implement your own MultiRLModule subclass.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
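The default per-module dispatch described above can be sketched in plain Python. This is a conceptual illustration, not Ray's implementation; `TinyMultiModule` and `EchoModule` are hypothetical stand-ins for a MultiRLModule and its sub-RLModules.

```python
class EchoModule:
    """Hypothetical stand-in for an RLModule; its forward pass tags the batch."""

    def __init__(self, name):
        self.name = name

    def _forward_inference(self, batch):
        return {"module": self.name, "obs": batch["obs"]}


class TinyMultiModule:
    """Hypothetical container that mimics the default MultiRLModule dispatch."""

    def __init__(self, modules):
        self._modules = modules  # Dict[ModuleID, EchoModule]

    def _forward_inference(self, batch):
        # Input and output are both Dict[ModuleID, Dict[str, Any]]:
        # loop over each module_id and run the matching sub-module's
        # forward pass on its associated batch.
        return {
            module_id: self._modules[module_id]._forward_inference(module_batch)
            for module_id, module_batch in batch.items()
        }


multi = TinyMultiModule({"a": EchoModule("a"), "b": EchoModule("b")})
out = multi._forward_inference({"a": {"obs": 1}, "b": {"obs": 2}})
```

Because each sub-module only ever sees its own slice of the batch, no parameters or intermediate results flow between modules, which is why cross-module communication is undefined without a custom subclass.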
Methods
Initializes a MultiRLModule instance.
Adds a module at run time to the multi-agent module.
Returns self in order to match RLModule.as_multi_rl_module() behavior.
Calls the given function with each (module_id, module).
DO NOT OVERRIDE! Forward-pass during exploration, called from the sampler.
DO NOT OVERRIDE! Forward-pass during evaluation, called from the sampler.
DO NOT OVERRIDE! Forward-pass during training called from the learner.
Creates a new Checkpointable instance from the given location and returns it.
Returns the module with the given module ID or default if not found in self.
Returns the action distribution class for this RLModule used for exploration.
Returns the action distribution class for this RLModule used for inference.
Returns JSON writable metadata further describing the implementing class.
Returns the action distribution class for this RLModule used for training.
Returns the input specs of the forward_exploration method.
Returns the input specs of the forward_inference method.
Returns the input specs of the forward_train method.
Returns an ItemsView over the module IDs in this MultiRLModule.
Returns a KeysView over the module IDs in this MultiRLModule.
Removes a module at runtime from the multi-agent module.
Restores the state of the implementing class from the given path.
Saves the state of the implementing class (or state) to path.
Sets the state of the multi-agent module.
Sets up the underlying, individual RLModules.
Returns the underlying module if this module is a wrapper.
Returns a ValuesView over the RLModules held by this MultiRLModule.
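Several of the methods listed above give the container a dict-like surface (get, keys, values, items, plus adding and removing modules at runtime). The following plain-Python sketch illustrates those semantics; `ModuleRegistry` is a hypothetical stand-in and is not Ray's implementation.

```python
class ModuleRegistry:
    """Hypothetical dict-like container mirroring the method names above."""

    def __init__(self):
        self._modules = {}  # ModuleID -> module

    def add_module(self, module_id, module):
        # Add a module at runtime.
        self._modules[module_id] = module

    def remove_module(self, module_id):
        # Remove a module at runtime; ignore unknown IDs.
        self._modules.pop(module_id, None)

    def get(self, module_id, default=None):
        # Return the module with the given ID, or default if not found.
        return self._modules.get(module_id, default)

    def keys(self):
        return self._modules.keys()      # view over module IDs

    def values(self):
        return self._modules.values()    # view over modules

    def items(self):
        return self._modules.items()     # view over (module_id, module) pairs


reg = ModuleRegistry()
reg.add_module("agent_0", "policy_net_0")
reg.add_module("agent_1", "policy_net_1")
reg.remove_module("agent_1")
```

Backing the container with a plain mapping is what makes single-module access and whole-container iteration equally cheap.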
Attributes