Note

Ray 2.10.0 introduces the alpha stage of RLlib’s “new API stack”. The Ray Team plans to transition algorithms, example scripts, and documentation to the new code base, incrementally replacing the “old API stack” (e.g., ModelV2, Policy, RolloutWorker) over the minor releases leading up to Ray 3.0.

Note, however, that so far only PPO (single- and multi-agent) and SAC (single-agent only) support the “new API stack”, and both still run with the old APIs by default. You can continue to use your existing custom (old stack) classes.

See the new API stack documentation for more details on how to use it.

Note

This doc is related to RLlib’s new API stack and therefore experimental.

RLModule API#

RL Module specifications and configurations#

Single Agent#

SingleAgentRLModuleSpec

Utility spec class to make constructing RLModules (in single-agent case) easier.

SingleAgentRLModuleSpec.build

Builds the RLModule from this spec.

SingleAgentRLModuleSpec.get_rl_module_config

Returns the RLModule config for this spec.
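The spec pattern described above (record what to build, then construct on demand) can be sketched in plain Python. This is only an illustration of the idea: `ToyModule`, `ToyModuleSpec`, `get_config`, and the field names are hypothetical stand-ins, not RLlib classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Type


class ToyModule:
    """Hypothetical stand-in for a module; not an RLlib class."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config


@dataclass
class ToyModuleSpec:
    """Hypothetical spec: records what to build and defers construction."""

    module_class: Type[ToyModule]
    observation_dim: int
    action_dim: int
    model_config: Dict[str, Any] = field(default_factory=dict)

    def get_config(self) -> Dict[str, Any]:
        # Collect everything the module constructor needs into one config.
        return {
            "observation_dim": self.observation_dim,
            "action_dim": self.action_dim,
            "model_config": dict(self.model_config),
        }

    def build(self) -> ToyModule:
        # Construction happens only here, so the spec stays cheap to copy.
        return self.module_class(self.get_config())


spec = ToyModuleSpec(
    ToyModule, observation_dim=4, action_dim=2, model_config={"hidden": [64]}
)
module = spec.build()
```

Keeping the spec separate from the built module means the same description can be passed around and built wherever the module is actually needed.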

RLModule Configuration#

RLModuleConfig

A utility config class to make constructing RLModules easier.

RLModuleConfig.to_dict

Returns a serialized representation of the config.

RLModuleConfig.from_dict

Creates a config from a serialized representation.

RLModuleConfig.get_catalog

Returns the catalog for this config.
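As a rough illustration of the `to_dict` / `from_dict` pair, the sketch below round-trips a config through JSON. `ToyModuleConfig` and its fields are hypothetical; the real RLModuleConfig may hold different fields and serialize them differently.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class ToyModuleConfig:
    """Hypothetical config; the field names are illustrative only."""

    observation_dim: int
    action_dim: int
    model_config: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # A plain dict of built-in types, safe to hand to json.dumps.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyModuleConfig":
        # Rebuild the config from its serialized representation.
        return cls(**data)


config = ToyModuleConfig(
    observation_dim=4, action_dim=2, model_config={"hidden": [64, 64]}
)
serialized = json.dumps(config.to_dict())
restored = ToyModuleConfig.from_dict(json.loads(serialized))
```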

Multi Agent#

MultiAgentRLModuleSpec

A utility spec class to make constructing MARL modules easier.

MultiAgentRLModuleSpec.build

Builds either the multi-agent module or the single-agent module.

MultiAgentRLModuleSpec.get_marl_config

Returns the MultiAgentRLModuleConfig for this spec.
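One way a single `build` call can return either the whole multi-agent module or one of its members is to key the choice on an optional module ID. The sketch below assumes that design; `ToyAgentModule`, `ToyMultiAgentSpec`, and the `module_id` parameter are hypothetical names used for illustration, not the RLlib implementation.

```python
from typing import Dict, Optional, Union


class ToyAgentModule:
    """Hypothetical per-agent module; not an RLlib class."""

    def __init__(self, module_id: str, hidden_size: int):
        self.module_id = module_id
        self.hidden_size = hidden_size


class ToyMultiAgentSpec:
    """Hypothetical spec mapping module IDs to per-module settings."""

    def __init__(self, hidden_sizes: Dict[str, int]):
        self.hidden_sizes = hidden_sizes

    def build(
        self, module_id: Optional[str] = None
    ) -> Union[ToyAgentModule, Dict[str, ToyAgentModule]]:
        if module_id is not None:
            # Build only the requested single-agent module.
            return ToyAgentModule(module_id, self.hidden_sizes[module_id])
        # No ID given: build every module and return them keyed by ID.
        return {
            mid: ToyAgentModule(mid, size)
            for mid, size in self.hidden_sizes.items()
        }


spec = ToyMultiAgentSpec({"agent_0": 32, "agent_1": 64})
all_modules = spec.build()
single_module = spec.build(module_id="agent_1")
```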

RL Module API#

Constructor#

RLModule

Base class for RLlib modules.

RLModule.as_multi_agent

Returns a multi-agent wrapper around this module.

Forward methods#

forward_train

Forward-pass during training, called from the learner.

forward_exploration

Forward-pass during exploration, called from the sampler.

forward_inference

Forward-pass during evaluation, called from the sampler.
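To make the split between the three forward passes concrete, here is a minimal plain-Python sketch for a discrete action space. It assumes one common convention: the training pass returns raw logits, the exploration pass samples an action, and the inference pass picks the greedy action. `ToyDiscreteModule` and the batch keys are hypothetical, and RLlib's built-in modules may divide the work differently.

```python
import math
import random
from typing import Dict, List


class ToyDiscreteModule:
    """Hypothetical linear module with one weight row per action."""

    def __init__(self, weights: List[List[float]], seed: int = 0):
        self.weights = weights
        self.rng = random.Random(seed)

    def _logits(self, obs: List[float]) -> List[float]:
        return [sum(w * x for w, x in zip(row, obs)) for row in self.weights]

    def forward_train(self, batch: Dict[str, list]) -> Dict[str, list]:
        # Training needs the raw logits so a loss can be computed on them.
        return {"logits": [self._logits(obs) for obs in batch["obs"]]}

    def forward_exploration(self, batch: Dict[str, list]) -> Dict[str, list]:
        # Exploration samples from the softmax distribution over logits.
        actions = []
        for logits in self.forward_train(batch)["logits"]:
            top = max(logits)
            unnormalized = [math.exp(v - top) for v in logits]
            choice = self.rng.choices(range(len(logits)), weights=unnormalized)
            actions.append(choice[0])
        return {"actions": actions}

    def forward_inference(self, batch: Dict[str, list]) -> Dict[str, list]:
        # Inference is deterministic: take the highest-scoring action.
        logits_batch = self.forward_train(batch)["logits"]
        return {
            "actions": [
                max(range(len(logits)), key=logits.__getitem__)
                for logits in logits_batch
            ]
        }


module = ToyDiscreteModule(weights=[[1.0, 0.0], [0.0, 1.0]])
batch = {"obs": [[2.0, 0.5], [0.1, 3.0]]}
train_out = module.forward_train(batch)
explore_out = module.forward_exploration(batch)
infer_out = module.forward_inference(batch)
```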

IO specifications#

input_specs_inference

Returns the input specs of the forward_inference method.

input_specs_exploration

Returns the input specs of the forward_exploration method.

input_specs_train

Returns the input specs of the forward_train method.

output_specs_inference

Returns the output specs of the forward_inference method.

output_specs_exploration

Returns the output specs of the forward_exploration method.

output_specs_train

Returns the output specs of the forward_train method.
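The idea behind input and output specs can be sketched as a list of required keys that is checked before and after a forward pass. This is a simplified, assumed model: `check_keys` and `ToySpecCheckedModule` are hypothetical helpers, and the real specs may describe more than key names (for example shapes or types).

```python
from typing import Any, Dict, List


def check_keys(data: Dict[str, Any], required: List[str], where: str) -> None:
    """Raise if any required key is absent (hypothetical helper)."""
    missing = [key for key in required if key not in data]
    if missing:
        raise KeyError(f"{where} is missing keys: {missing}")


class ToySpecCheckedModule:
    """Hypothetical module that validates its own inputs and outputs."""

    def input_specs_inference(self) -> List[str]:
        return ["obs"]

    def output_specs_inference(self) -> List[str]:
        return ["actions"]

    def forward_inference(self, batch: Dict[str, Any]) -> Dict[str, Any]:
        check_keys(batch, self.input_specs_inference(), "inference input")
        # Placeholder policy: always choose action 0.
        output = {"actions": [0 for _ in batch["obs"]]}
        check_keys(output, self.output_specs_inference(), "inference output")
        return output


module = ToySpecCheckedModule()
ok_output = module.forward_inference({"obs": [[0.0, 1.0], [1.0, 0.0]]})

# A batch with the wrong key name is rejected before any computation runs.
try:
    module.forward_inference({"observations": [[0.0, 1.0]]})
    bad_batch_rejected = False
except KeyError:
    bad_batch_rejected = True
```

Checking specs at the boundary turns a mismatched batch into an immediate, descriptive error instead of a failure deep inside the model.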

Saving and Loading#

get_state

Returns the state dict of the module.

set_state

Sets the state dict of the module.

save_state

Saves the weights of this RLModule to the directory dir.

load_state

Loads the weights of an RLModule from the directory dir.

save_to_checkpoint

Saves the module to a checkpoint directory.

from_checkpoint

Loads the module from a checkpoint directory.
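The relationship between the in-memory state methods and the directory-based ones can be illustrated with a toy module whose state is written as JSON. Everything here is an assumption made for the sketch: `ToyStatefulModule`, the `module_state.json` file name, and the JSON format are hypothetical and say nothing about how RLlib lays out its files.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


class ToyStatefulModule:
    """Hypothetical module whose entire state is a list of weights."""

    STATE_FILE = "module_state.json"  # hypothetical file name

    def __init__(self, weights: List[float]):
        self.weights = list(weights)

    def get_state(self) -> Dict[str, List[float]]:
        # Return a copy so callers cannot mutate the module by accident.
        return {"weights": list(self.weights)}

    def set_state(self, state: Dict[str, List[float]]) -> None:
        self.weights = list(state["weights"])

    def save_state(self, dir: str) -> None:
        path = Path(dir)
        path.mkdir(parents=True, exist_ok=True)
        (path / self.STATE_FILE).write_text(json.dumps(self.get_state()))

    def load_state(self, dir: str) -> None:
        text = (Path(dir) / self.STATE_FILE).read_text()
        self.set_state(json.loads(text))


trained = ToyStatefulModule([0.1, -0.2, 0.3])
fresh = ToyStatefulModule([0.0, 0.0, 0.0])

with tempfile.TemporaryDirectory() as tmp_dir:
    trained.save_state(tmp_dir)
    fresh.load_state(tmp_dir)
```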

Multi Agent RL Module API#

Constructor#

MultiAgentRLModule

Base class for multi-agent RLModules.

MultiAgentRLModule.setup

Sets up the underlying RLModules.

MultiAgentRLModule.as_multi_agent

Returns a multi-agent wrapper around this module.

Modifying the underlying RL modules#

add_module

Adds a module at run time to the multi-agent module.

remove_module

Removes a module at run time from the multi-agent module.
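Adding and removing modules at run time amounts to managing a mapping from module IDs to modules. The sketch below assumes a dict-backed container; `ToyMultiAgentModule` and the `override` and `raise_if_missing` parameters are hypothetical choices for this illustration rather than confirmed RLlib signatures.

```python
from typing import Any, Dict, List, Optional


class ToyMultiAgentModule:
    """Hypothetical container that maps module IDs to modules."""

    def __init__(self, modules: Optional[Dict[str, Any]] = None):
        self._modules: Dict[str, Any] = dict(modules or {})

    def add_module(
        self, module_id: str, module: Any, override: bool = False
    ) -> None:
        # Refuse to silently replace an existing module unless asked to.
        if module_id in self._modules and not override:
            raise ValueError(f"Module ID {module_id!r} already exists.")
        self._modules[module_id] = module

    def remove_module(self, module_id: str, raise_if_missing: bool = True) -> None:
        if module_id not in self._modules:
            if raise_if_missing:
                raise KeyError(f"Module ID {module_id!r} not found.")
            return
        del self._modules[module_id]

    def module_ids(self) -> List[str]:
        return sorted(self._modules)


marl_module = ToyMultiAgentModule({"agent_0": "module_a"})
marl_module.add_module("agent_1", "module_b")
ids_after_add = marl_module.module_ids()

marl_module.remove_module("agent_0")
ids_after_remove = marl_module.module_ids()

# Adding under an existing ID without override is rejected.
try:
    marl_module.add_module("agent_1", "module_c")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```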

Saving and Loading#

save_state

Saves the weights of this MultiAgentRLModule to dir.

load_state

Loads the weights of a MultiAgentRLModule from dir.