ray.rllib.core.rl_module.default_model_config.DefaultModelConfig
class ray.rllib.core.rl_module.default_model_config.DefaultModelConfig(
    fcnet_hiddens: List[int] = <factory>,
    fcnet_activation: str = 'tanh',
    fcnet_kernel_initializer: str | Callable | None = None,
    fcnet_kernel_initializer_kwargs: dict | None = None,
    fcnet_bias_initializer: str | Callable | None = None,
    fcnet_bias_initializer_kwargs: dict | None = None,
    conv_filters: List[Tuple[int, int | Tuple[int, int], int | Tuple[int, int]]] | None = None,
    conv_activation: str = 'relu',
    conv_kernel_initializer: str | Callable | None = None,
    conv_kernel_initializer_kwargs: dict | None = None,
    conv_bias_initializer: str | Callable | None = None,
    conv_bias_initializer_kwargs: dict | None = None,
    head_fcnet_hiddens: List[int] = <factory>,
    head_fcnet_activation: str = 'relu',
    head_fcnet_kernel_initializer: str | Callable | None = None,
    head_fcnet_kernel_initializer_kwargs: dict | None = None,
    head_fcnet_bias_initializer: str | Callable | None = None,
    head_fcnet_bias_initializer_kwargs: dict | None = None,
    free_log_std: bool = False,
    log_std_clip_param: float = 20.0,
    vf_share_layers: bool = True,
    use_lstm: bool = False,
    max_seq_len: int = 20,
    lstm_cell_size: int = 256,
    lstm_use_prev_action: bool = False,
    lstm_use_prev_reward: bool = False,
    lstm_kernel_initializer: str | Callable | None = None,
    lstm_kernel_initializer_kwargs: dict | None = None,
    lstm_bias_initializer: str | Callable | None = None,
    lstm_bias_initializer_kwargs: dict | None = None,
)
Dataclass to configure all default RLlib RLModules.
Users should NOT use this class to configure their own custom RLModules. Instead, pass a custom model_config dict with arbitrary (str) keys into the RLModuleSpec used to define the custom RLModule. For example:

```python
import gymnasium as gym
import numpy as np
from ray.rllib.core.rl_module.rl_module import RLModuleSpec
from ray.rllib.examples.rl_modules.classes.tiny_atari_cnn_rlm import (
    TinyAtariCNN
)

my_rl_module = RLModuleSpec(
    module_class=TinyAtariCNN,
    observation_space=gym.spaces.Box(-1.0, 1.0, (64, 64, 4), np.float32),
    action_space=gym.spaces.Discrete(7),
    # DreamerV3-style stack working on a 64x64, color or 4x-grayscale-stacked,
    # normalized image.
    model_config={
        "conv_filters": [[16, 4, 2], [32, 4, 2], [64, 4, 2], [128, 4, 2]],
    },
).build()
```
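As a rough illustration of what a conv_filters list like the one above implies, the following sketch computes the feature-map shape after each layer. It assumes scalar kernel and stride values and TensorFlow-style "same" padding (output size = ceil(input size / stride)). The padding actually used by TinyAtariCNN or by RLlib's default encoders may differ, and conv_stack_output_shapes is a hypothetical helper, not part of RLlib.

```python
import math
from typing import List, Sequence, Tuple


def conv_stack_output_shapes(
    input_hw: Tuple[int, int],
    conv_filters: Sequence[Sequence[int]],
) -> List[Tuple[int, int, int]]:
    """Return (height, width, channels) after each conv layer.

    ASSUMPTION: "same" padding, so only the stride changes the spatial size.
    ASSUMPTION: kernel and stride are plain ints (not (h, w) tuples).
    """
    height, width = input_hw
    shapes = []
    for num_out_channels, _kernel, stride in conv_filters:
        # Under "same" padding the kernel size does not affect the output size.
        height = math.ceil(height / stride)
        width = math.ceil(width / stride)
        shapes.append((height, width, num_out_channels))
    return shapes


shapes = conv_stack_output_shapes(
    (64, 64),
    [[16, 4, 2], [32, 4, 2], [64, 4, 2], [128, 4, 2]],
)
print(shapes)  # [(32, 32, 16), (16, 16, 32), (8, 8, 64), (4, 4, 128)]
```

Under these assumptions, each stride-2 layer halves the spatial resolution, so a 64x64 input ends up as a 4x4 map with 128 channels.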
Only RLlib’s default RLModules (defined by the various algorithms) should use this dataclass. Pass an instance of it into your algorithm config like so:
```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.core.rl_module.default_model_config import DefaultModelConfig

config = (
    PPOConfig()
    .rl_module(
        model_config=DefaultModelConfig(fcnet_hiddens=[32, 32]),
    )
)
```
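The `<factory>` defaults in the class signature indicate list-valued fields declared with `dataclasses.field(default_factory=...)`, so every instance gets its own list. The sketch below shows that behavior using only the standard library. `SketchModelConfig` is a hypothetical stand-in that mirrors a few field names from the signature. It is not RLlib's class, and its list default is a placeholder rather than RLlib's actual default.

```python
from dataclasses import asdict, dataclass, field, replace
from typing import List


@dataclass
class SketchModelConfig:
    """Hypothetical stand-in for illustration only (NOT DefaultModelConfig)."""

    # Placeholder default; the real default is hidden behind `<factory>`.
    fcnet_hiddens: List[int] = field(default_factory=lambda: [64, 64])
    fcnet_activation: str = "tanh"
    use_lstm: bool = False
    max_seq_len: int = 20


base = SketchModelConfig()
# Override a single field while keeping all other defaults.
small = replace(base, fcnet_hiddens=[32, 32])

# Each instance owns its list, so mutating one does not affect the other.
other = SketchModelConfig()
other.fcnet_hiddens.append(128)

print(asdict(small))
print(base.fcnet_hiddens, other.fcnet_hiddens)
```

Overriding only the fields you care about, as in the `DefaultModelConfig(fcnet_hiddens=[32, 32])` call above, leaves all remaining settings at their defaults.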
DeveloperAPI: This API may change across minor Ray releases.
Attributes
conv_activation: Activation function descriptor for the stack configured by conv_filters.
conv_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by conv_filters.
conv_bias_initializer_kwargs: Kwargs passed into the initializer function defined through conv_bias_initializer.
conv_filters: List of lists of format [num_out_channels, kernel, stride] defining a Conv2D stack if the input space is 2D.
conv_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by conv_filters.
conv_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through conv_kernel_initializer.
fcnet_activation: Activation function descriptor for the stack configured by fcnet_hiddens.
fcnet_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by fcnet_hiddens.
fcnet_bias_initializer_kwargs: Kwargs passed into the initializer function defined through fcnet_bias_initializer.
fcnet_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by fcnet_hiddens.
fcnet_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through fcnet_kernel_initializer.
free_log_std: If True, for DiagGaussian action distributions (or any other continuous control distribution), make the second half of the policy's outputs a "free" bias parameter, rather than state-/NN-dependent nodes.
head_fcnet_activation: Activation function descriptor for the stack configured by head_fcnet_hiddens.
head_fcnet_bias_initializer: Initializer function or class descriptor for the bias vectors in the stack configured by head_fcnet_hiddens.
head_fcnet_bias_initializer_kwargs: Kwargs passed into the initializer function defined through head_fcnet_bias_initializer.
head_fcnet_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the stack configured by head_fcnet_hiddens.
head_fcnet_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through head_fcnet_kernel_initializer.
log_std_clip_param: Clip value for the log(stddev) when using a DiagGaussian action distribution (or any other continuous control distribution).
lstm_bias_initializer: Initializer function or class descriptor for the bias vectors in the LSTM layer.
lstm_bias_initializer_kwargs: Kwargs passed into the initializer function defined through lstm_bias_initializer.
lstm_cell_size: The size of the LSTM cell.
lstm_kernel_initializer: Initializer function or class descriptor for the weight/kernel matrices in the LSTM layer.
lstm_kernel_initializer_kwargs: Kwargs passed into the initializer function defined through lstm_kernel_initializer.
max_seq_len: The maximum sequence length for building the train batch for an LSTM model.
use_lstm: Whether to wrap the encoder component (defined by fcnet_hiddens or conv_filters) with an LSTM.
vf_share_layers: Whether encoder layers (defined by fcnet_hiddens or conv_filters) should be shared between the policy and the value function.
fcnet_hiddens: List containing the sizes (number of nodes) of a fully connected (MLP) stack.
head_fcnet_hiddens: List containing the sizes (number of nodes) of a fully connected (MLP) head (ex.
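
The interaction between free_log_std and log_std_clip_param can be sketched in plain Python. This is only an illustration under stated assumptions: the policy head emits means followed by log-stddevs, clipping is symmetric to [-log_std_clip_param, log_std_clip_param], and `gaussian_params` with its `free_log_std_values` argument is a hypothetical helper, not an RLlib function.

```python
from typing import List, Optional, Tuple


def gaussian_params(
    head_outputs: List[float],
    action_dim: int,
    log_std_clip_param: float = 20.0,
    free_log_std_values: Optional[List[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Split policy outputs into means and clipped log-stddevs.

    ASSUMPTION: outputs are laid out as [means..., log_stds...].
    ASSUMPTION: clipping is symmetric around zero.
    """
    means = head_outputs[:action_dim]
    if free_log_std_values is not None:
        # free_log_std=True case: log-stddevs come from a state-independent
        # parameter vector instead of the network outputs.
        log_stds = list(free_log_std_values)
    else:
        # free_log_std=False case: second half of the outputs is state-dependent.
        log_stds = head_outputs[action_dim:2 * action_dim]
    clipped = [
        max(-log_std_clip_param, min(log_std_clip_param, value))
        for value in log_stds
    ]
    return means, clipped


# State-dependent log-stddevs, clipped to [-20, 20].
means, log_stds = gaussian_params([0.5, -1.0, 35.0, -42.0], action_dim=2)
print(means, log_stds)  # [0.5, -1.0] [20.0, -20.0]

# "Free" log-stddevs: the network only needs to output the means.
free_means, free_log_stds = gaussian_params(
    [0.5, -1.0], action_dim=2, free_log_std_values=[0.0, 0.1]
)
print(free_means, free_log_stds)  # [0.5, -1.0] [0.0, 0.1]
```

In this sketch, passing `float("inf")` as the clip value leaves the log-stddevs unchanged.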