class ray.rllib.utils.exploration.parameter_noise.ParameterNoise(action_space, *, framework: str, policy_config: dict, model: ModelV2, initial_stddev: float = 1.0, random_timesteps: int = 10000, sub_exploration: dict | None = None, **kwargs)

Bases: Exploration

An exploration strategy that perturbs a Model’s parameters with noise.

Implemented based on:
[1] https://openai.com/research/better-exploration-with-parameter-noise
[2] https://arxiv.org/pdf/1706.01905.pdf

At the beginning of an episode, Gaussian noise is added to all weights of the model. At the end of the episode, the noise is undone and an action diff (pi-delta) is calculated, from which we determine the changes in the noise’s stddev for the next episode.
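
The per-episode cycle described above can be sketched in plain NumPy. This is an illustrative sketch only, not RLlib's implementation: the linear softmax policy, the KL-based action diff, the `target_delta` threshold, the `1.01` adaptation factor, and the helper names `adapt_stddev` and `run_noisy_episode` are all assumptions made for the example.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def adapt_stddev(stddev, delta, target_delta, factor=1.01):
    # Hypothetical rule: grow the noise when the policy barely changed,
    # shrink it when the policy changed more than the target.
    return stddev * factor if delta <= target_delta else stddev / factor


def run_noisy_episode(weights, observations, stddev, target_delta, rng):
    # Episode start: add Gaussian noise to all weights.
    noise = rng.normal(0.0, stddev, size=weights.shape)
    noisy_weights = weights + noise
    noisy_probs = softmax(observations @ noisy_weights)

    # Episode end: undo the noise and recompute the clean action distribution.
    clean_weights = noisy_weights - noise
    clean_probs = softmax(observations @ clean_weights)

    # Action diff (pi-delta), taken here as the mean KL(clean || noisy).
    eps = 1e-12
    kl = np.sum(
        clean_probs * (np.log(clean_probs + eps) - np.log(noisy_probs + eps)),
        axis=-1,
    )
    delta = float(np.mean(kl))

    # The stddev used for the next episode depends on the action diff.
    return clean_weights, delta, adapt_stddev(stddev, delta, target_delta)


rng = np.random.default_rng(0)
weights = np.zeros((4, 3))   # 4 observation features, 3 discrete actions
observations = np.eye(4)     # one toy observation per feature
weights, delta, next_stddev = run_noisy_episode(
    weights, observations, stddev=1.0, target_delta=0.1, rng=rng
)
```

The multiplicative update keeps the stddev positive and changes it slowly, so a single unusual episode does not swing the noise scale.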



Initializes a ParameterNoise Exploration object.
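
As a rough illustration of how the constructor arguments might be supplied, the fragment below builds a plain config dict. It assumes the `exploration_config` convention of RLlib's older API stack, where `type` names the exploration class and the remaining keys are passed to its constructor. The values shown are the defaults from the signature above, and the dict itself is hypothetical.

```python
# Hypothetical policy config fragment; nothing here calls into Ray.
config = {
    "explore": True,
    "exploration_config": {
        # Assumed to select the ParameterNoise class by name.
        "type": "ParameterNoise",
        # Keyword arguments mirroring the constructor defaults.
        "initial_stddev": 1.0,
        "random_timesteps": 10000,
    },
}
```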


May add optimizer(s) to the Policy's own optimizers.