ray.rllib.utils.exploration.parameter_noise.ParameterNoise

class ray.rllib.utils.exploration.parameter_noise.ParameterNoise(action_space, *, framework: str, policy_config: dict, model: ModelV2, initial_stddev: float = 1.0, random_timesteps: int = 10000, sub_exploration: dict | None = None, **kwargs)

Bases: Exploration

An exploration strategy that perturbs a model’s parameters.

Implemented based on:
[1] https://openai.com/research/better-exploration-with-parameter-noise
[2] https://arxiv.org/pdf/1706.01905.pdf

At the beginning of an episode, Gaussian noise is added to all weights of the model. At the end of the episode, the noise is undone and an action diff (pi-delta) is calculated, from which we determine the changes in the noise’s stddev for the next episode.
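The per-episode cycle described above can be sketched in plain NumPy. This is a minimal illustration, not RLlib's implementation: the toy linear `policy`, the helpers `perturb`, `remove_noise`, and `adapt_stddev`, the root-mean-square distance used as pi-delta, the `target_delta` threshold, and the 1.01 scaling factor are all assumptions made for the example.

```python
import numpy as np


def perturb(weights, stddev, rng):
    """Return a noisy copy of `weights` together with the noise that was added."""
    noise = {k: rng.normal(0.0, stddev, size=w.shape) for k, w in weights.items()}
    noisy = {k: w + noise[k] for k, w in weights.items()}
    return noisy, noise


def remove_noise(noisy, noise):
    """Undo a perturbation by subtracting the stored noise."""
    return {k: w - noise[k] for k, w in noisy.items()}


def policy(weights, obs):
    """Hypothetical linear policy mapping observations to action values."""
    return obs @ weights["W"] + weights["b"]


def adapt_stddev(stddev, pi_delta, target_delta, factor=1.01):
    """Grow the noise if actions barely changed, shrink it otherwise."""
    return stddev * factor if pi_delta <= target_delta else stddev / factor


rng = np.random.default_rng(0)
weights = {"W": rng.normal(size=(4, 2)), "b": np.zeros(2)}
obs_batch = rng.normal(size=(32, 4))
stddev = 1.0

# Episode start: add Gaussian noise to all weights.
noisy, noise = perturb(weights, stddev, rng)

# ... the agent would act with `noisy` for the whole episode ...

# Episode end: undo the noise and measure how much it changed the actions.
restored = remove_noise(noisy, noise)
diff = policy(restored, obs_batch) - policy(noisy, obs_batch)
pi_delta = float(np.sqrt(np.mean(diff ** 2)))

# Use the action diff to set the stddev for the next episode.
stddev = adapt_stddev(stddev, pi_delta, target_delta=0.1)
```

Storing the noise and subtracting it, rather than keeping a second copy of the weights, mirrors the "noise is undone" wording above; either approach would work for a sketch like this.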

Methods

__init__

Initializes a ParameterNoise Exploration object.

get_exploration_optimizer

May add optimizer(s) to the Policy's own optimizers.
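As a hedged illustration of the constructor arguments, the dict below assumes the usual RLlib convention of selecting an exploration class through an `exploration_config` dict with a `type` key. The `initial_stddev` and `random_timesteps` keys and their values are taken from the signature defaults above; how the dict is passed to an algorithm depends on the RLlib version and is not shown.

```python
# Assumed convention: the "type" key names the exploration class, and the
# remaining keys are forwarded to its constructor as keyword arguments.
exploration_config = {
    "type": "ParameterNoise",
    "initial_stddev": 1.0,       # default from the signature above
    "random_timesteps": 10000,   # default from the signature above
}
```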