Note
Ray 2.10.0 introduces the alpha stage of RLlib’s “new API stack”. The Ray Team plans to transition algorithms, example scripts, and documentation to the new code base, incrementally replacing the “old API stack” (e.g., ModelV2, Policy, RolloutWorker) over the subsequent minor releases leading up to Ray 3.0.
So far, however, only PPO (single- and multi-agent) and SAC (single-agent only) support the “new API stack”, and both continue to run with the old APIs by default. You can continue to use your existing custom (old stack) classes.
See here for more details on how to use the new API stack.
Note
This doc is related to RLlib’s new API stack and therefore experimental.
RLModule API
RL Module specifications and configurations
Single Agent
Utility spec class that makes constructing RLModules (in the single-agent case) easier.
Builds the RLModule from this spec.
Returns the RLModule config for this spec.
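The spec pattern described above (hold the construction arguments, then build on demand) can be illustrated with a minimal, self-contained sketch. `ToyModule`, `ToyModuleSpec`, and all of their fields and methods are hypothetical stand-ins written for this illustration; they are not RLlib classes and do not reflect RLlib's actual signatures.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Type


class ToyModule:
    """Hypothetical stand-in for a module class (not an RLlib class)."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config


@dataclass
class ToyModuleSpec:
    """Hypothetical spec: stores what is needed to construct a module later."""

    module_class: Type[ToyModule]
    obs_dim: int
    act_dim: int
    model_config: Dict[str, Any] = field(default_factory=dict)

    def get_module_config(self) -> Dict[str, Any]:
        # Collect all constructor arguments into a single config mapping.
        return {"obs_dim": self.obs_dim, "act_dim": self.act_dim, **self.model_config}

    def build(self) -> ToyModule:
        # Construction is deferred until build() is called.
        return self.module_class(self.get_module_config())


spec = ToyModuleSpec(ToyModule, obs_dim=4, act_dim=2, model_config={"hidden": 64})
module = spec.build()
```

Deferring construction this way lets the same spec be passed around cheaply and built wherever the module is actually needed.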
RLModule Configuration
A utility config class that makes constructing RLModules easier.
Returns a serialized representation of the config.
Creates a config from a serialized representation.
Returns the catalog for this config.
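As a rough illustration of the serialize and deserialize round trip described above, the following sketch uses a hypothetical `ToyModuleConfig` dataclass. The class, its fields, and the method names `to_dict` and `from_dict` are assumptions made for this example and should not be read as RLlib's API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class ToyModuleConfig:
    """Hypothetical config class (not an RLlib class)."""

    obs_dim: int
    act_dim: int
    model_config: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Serialized representation: a plain, JSON-friendly dict.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyModuleConfig":
        # Inverse of to_dict(): rebuild the config from the plain dict.
        return cls(**data)


config = ToyModuleConfig(obs_dim=4, act_dim=2, model_config={"hidden": 64})
payload = json.dumps(config.to_dict())
restored = ToyModuleConfig.from_dict(json.loads(payload))
```

Because the serialized form is a plain dict, it can be written to JSON and restored on another process without importing anything beyond the config class itself.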
Multi RLModule (multi-agent)
A utility spec class that makes constructing MultiRLModules easier.
Builds either the multi-agent module or the single-agent module.
Returns the MultiRLModuleConfig for this spec.
RL Module API
Constructor
Base class for RLlib modules.
Returns a multi-agent wrapper around this module.
Forward methods
Forward pass during training, called from the learner.
Forward pass during exploration, called from the sampler.
Forward pass during evaluation, called from the sampler.
Forward pass during training.
Forward pass during exploration.
Forward pass during evaluation.
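The split between the forward passes called by the learner or sampler and the plain forward passes can be sketched as public entry points that delegate to overridable implementations. The example below is a toy, pure-Python sketch under that assumption: `ToyPolicyModule`, the underscore-prefixed method names, and the `"obs"`, `"logits"`, and `"action"` keys are all hypothetical and are not taken from RLlib.

```python
import math
import random
from typing import Any, Dict, List


class ToyPolicyModule:
    """Hypothetical linear policy with three forward modes (not an RLlib class)."""

    def __init__(self, weights: List[List[float]], seed: int = 0):
        self.weights = weights  # one weight row per action
        self._rng = random.Random(seed)

    def _logits(self, obs: List[float]) -> List[float]:
        return [sum(w * x for w, x in zip(row, obs)) for row in self.weights]

    # Public entry points: what a learner or sampler would call.
    def forward_train(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        return self._forward_train(batch)

    def forward_exploration(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        return self._forward_exploration(batch)

    def forward_inference(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        return self._forward_inference(batch)

    # Implementations: what a subclass would override.
    def _forward_train(self, batch):
        # Training needs raw logits so that a loss can be computed from them.
        return {"logits": self._logits(batch["obs"])}

    def _forward_exploration(self, batch):
        # Exploration samples an action from the softmax distribution.
        logits = self._logits(batch["obs"])
        top = max(logits)
        probs = [math.exp(value - top) for value in logits]
        action = self._rng.choices(range(len(logits)), weights=probs)[0]
        return {"action": action}

    def _forward_inference(self, batch):
        # Evaluation acts greedily: pick the action with the highest logit.
        logits = self._logits(batch["obs"])
        return {"action": max(range(len(logits)), key=logits.__getitem__)}


toy = ToyPolicyModule(weights=[[1.0, 0.0], [0.0, 1.0]])
batch = {"obs": [0.2, 0.9]}
train_out = toy.forward_train(batch)
explore_out = toy.forward_exploration(batch)
infer_out = toy.forward_inference(batch)
```

Keeping the public methods thin leaves room for shared pre- and post-processing in one place while subclasses only change the underscore-prefixed methods.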
IO specifications
Returns the input specs of the forward_inference method.
Returns the input specs of the forward_exploration method.
Returns the input specs of the forward_train method.
Returns the output specs of the forward_inference method.
Returns the output specs of the forward_exploration method.
Returns the output specs of the forward_train method.
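One way to picture input and output specs is as a list of required keys that is checked before and after a forward pass. The sketch below assumes exactly that simplified form; `check_spec`, `ToySpecModule`, and the method names `input_specs_inference` and `output_specs_inference` are hypothetical and chosen only for illustration, and real specs may carry more than key names.

```python
from typing import Any, Dict, List


def check_spec(data: Dict[str, Any], required_keys: List[str], where: str) -> None:
    """Raise KeyError if any required key is absent from `data`."""
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise KeyError(f"{where} is missing keys: {missing}")


class ToySpecModule:
    """Hypothetical module that validates its inference inputs and outputs."""

    def input_specs_inference(self) -> List[str]:
        return ["obs"]

    def output_specs_inference(self) -> List[str]:
        return ["action"]

    def forward_inference(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        # Validate the incoming batch against the declared input spec.
        check_spec(batch, self.input_specs_inference(), "forward_inference input")
        out = {"action": 0}  # placeholder computation
        # Validate the result against the declared output spec.
        check_spec(out, self.output_specs_inference(), "forward_inference output")
        return out


spec_module = ToySpecModule()
valid_output = spec_module.forward_inference({"obs": [0.0, 1.0]})
```

Declaring the specs separately from the forward pass means a malformed batch fails early with a clear message instead of deep inside the model code.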
Saving and Loading
Returns the state dict of the module.
Sets the implementing class's state to the given state dict.
Saves the state of the implementing class (or …
Restores the state of the implementing class from the given path.
Creates a new Checkpointable instance from the given location and returns it.
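The relationship between a state dict, saving to a path, restoring from a path, and creating a fresh instance from a saved location can be sketched with the standard library alone. `ToyCheckpointable`, its method names, and the `state.json` file name are assumptions made for this example; they do not describe how RLlib lays out its checkpoints.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


class ToyCheckpointable:
    """Hypothetical class with state-dict and path-based persistence."""

    STATE_FILE = "state.json"  # hypothetical file name

    def __init__(self, weights: Optional[List[float]] = None):
        self.weights = list(weights or [])

    def get_state(self) -> Dict[str, Any]:
        return {"weights": list(self.weights)}

    def set_state(self, state: Dict[str, Any]) -> None:
        self.weights = list(state["weights"])

    def save_to_path(self, path: str) -> None:
        # Write the state dict into a directory at `path`.
        directory = Path(path)
        directory.mkdir(parents=True, exist_ok=True)
        (directory / self.STATE_FILE).write_text(json.dumps(self.get_state()))

    def restore_from_path(self, path: str) -> None:
        # Load the state dict from `path` into this existing instance.
        self.set_state(json.loads((Path(path) / self.STATE_FILE).read_text()))

    @classmethod
    def from_checkpoint(cls, path: str) -> "ToyCheckpointable":
        # Create a new instance, then fill it from the saved state.
        instance = cls()
        instance.restore_from_path(path)
        return instance


with tempfile.TemporaryDirectory() as tmp_dir:
    original = ToyCheckpointable([0.1, 0.2])
    original.save_to_path(tmp_dir)
    clone = ToyCheckpointable.from_checkpoint(tmp_dir)
```

Restoring into an existing instance and creating a new instance from a location are kept as separate operations here, mirroring the distinction drawn in the descriptions above.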
Multi Agent RL Module API
Constructor
Base class for an RLModule that contains n sub-RLModules.
Sets up the underlying RLModules.
Returns self in order to match …
Modifying the underlying RL modules
Adds a module at run time to the multi-agent module.
Removes a module at run time from the multi-agent module.
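Adding and removing sub-modules at run time amounts to maintaining a mapping from module IDs to modules. The following is a minimal sketch of that idea; `ToyMultiModule`, the `override` flag, and the `module_ids` helper are hypothetical and not drawn from RLlib.

```python
from typing import Any, Dict, List, Optional


class ToyMultiModule:
    """Hypothetical container mapping module IDs to sub-modules."""

    def __init__(self, modules: Optional[Dict[str, Any]] = None):
        self._modules: Dict[str, Any] = dict(modules or {})

    def add_module(self, module_id: str, module: Any, override: bool = False) -> None:
        # Refuse to silently replace an existing module unless asked to.
        if module_id in self._modules and not override:
            raise ValueError(f"Module ID {module_id!r} already exists.")
        self._modules[module_id] = module

    def remove_module(self, module_id: str) -> Any:
        # dict.pop raises KeyError if the ID is unknown.
        return self._modules.pop(module_id)

    def module_ids(self) -> List[str]:
        return sorted(self._modules)


multi = ToyMultiModule({"policy_a": object()})
multi.add_module("policy_b", object())
ids_after_add = multi.module_ids()
multi.remove_module("policy_a")
ids_after_remove = multi.module_ids()
```

Guarding against accidental overwrites is a design choice of this sketch; a container could equally well replace existing entries by default.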
Saving and Loading
Saves the state of the implementing class (or …
Restores the state of the implementing class from the given path.