ray.rllib.utils.exploration.epsilon_greedy.EpsilonGreedy.__init__

EpsilonGreedy.__init__(action_space: gymnasium.spaces.Space, *, framework: str, initial_epsilon: float = 1.0, final_epsilon: float = 0.05, warmup_timesteps: int = 0, epsilon_timesteps: int = 100000, epsilon_schedule: Schedule | None = None, **kwargs)

Create an EpsilonGreedy exploration instance.

Parameters:
  • action_space – The action space the exploration should occur in.

  • framework – The deep learning framework specifier (e.g. "tf" or "torch").

  • initial_epsilon – The initial epsilon value to use.

  • final_epsilon – The final epsilon value to use.

  • warmup_timesteps – The number of initial timesteps during which epsilon is held at initial_epsilon.

  • epsilon_timesteps – The number of timesteps (in addition to warmup_timesteps) after which epsilon always equals final_epsilon. For example, with warmup_timesteps=20k and epsilon_timesteps=50k, epsilon reaches its final value after 70k timesteps.

  • epsilon_schedule – An optional Schedule object to use (instead of constructing one from the given parameters).
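
The interaction between warmup_timesteps and epsilon_timesteps can be sketched in plain Python. This is an illustrative sketch only, not RLlib's implementation: the helper name epsilon_at is hypothetical, and the linear decay between initial_epsilon and final_epsilon is an assumption made here for illustration (the parameter descriptions above only fix the start and end of the decay window).

```python
def epsilon_at(
    timestep: int,
    initial_epsilon: float = 1.0,
    final_epsilon: float = 0.05,
    warmup_timesteps: int = 0,
    epsilon_timesteps: int = 100000,
) -> float:
    """Hypothetical helper: epsilon value at a given timestep.

    Assumes a linear decay after the warmup phase (an assumption,
    not taken from the parameter descriptions).
    """
    # Warmup phase: epsilon is not changed.
    if timestep <= warmup_timesteps:
        return initial_epsilon
    # After warmup_timesteps + epsilon_timesteps, epsilon stays at its final value.
    if timestep >= warmup_timesteps + epsilon_timesteps:
        return final_epsilon
    # In between: interpolate linearly from initial_epsilon to final_epsilon.
    progress = (timestep - warmup_timesteps) / epsilon_timesteps
    return initial_epsilon + progress * (final_epsilon - initial_epsilon)


# The example from the parameter description: 20k warmup + 50k decay.
settings = dict(warmup_timesteps=20_000, epsilon_timesteps=50_000)
for t in (0, 20_000, 45_000, 70_000, 100_000):
    print(t, epsilon_at(t, **settings))
```

With these settings, epsilon stays at initial_epsilon through timestep 20k and equals final_epsilon from timestep 70k onward. Passing an epsilon_schedule to the constructor replaces a schedule built from these parameters.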