ray.rllib.utils.exploration.epsilon_greedy.EpsilonGreedy.__init__

EpsilonGreedy.__init__(action_space: gymnasium.spaces.Space, *, framework: str, initial_epsilon: float = 1.0, final_epsilon: float = 0.05, warmup_timesteps: int = 0, epsilon_timesteps: int = 100000, epsilon_schedule: Schedule | None = None, **kwargs)

Create an EpsilonGreedy exploration class.

Parameters:

action_space – The action space the exploration should occur in.
framework – The framework specifier.
initial_epsilon – The initial epsilon value to use.
final_epsilon – The final epsilon value to use.
warmup_timesteps – The number of initial timesteps during which epsilon is held at initial_epsilon.
epsilon_timesteps – The number of timesteps (in addition to warmup_timesteps) after which epsilon always equals final_epsilon. For example, with warmup_timesteps=20k and epsilon_timesteps=50k, epsilon reaches its final value after 70k timesteps.
epsilon_schedule – An optional Schedule object to use (instead of constructing one from the given parameters).
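
The interaction between warmup_timesteps and epsilon_timesteps can be illustrated with a small standalone sketch. It does not import or call RLlib; the helper name epsilon_at is hypothetical, and the linear decay between the end of warmup and the final value is an assumption, since the parameter descriptions above only fix the start and end points.

```python
def epsilon_at(
    timestep: int,
    initial_epsilon: float = 1.0,
    final_epsilon: float = 0.05,
    warmup_timesteps: int = 0,
    epsilon_timesteps: int = 100_000,
) -> float:
    """Hypothetical helper: epsilon value at a given timestep.

    Assumes a linear decay from initial_epsilon to final_epsilon
    once the warmup phase has ended (an assumption, not taken from
    the parameter descriptions).
    """
    # During warmup, epsilon is held at its initial value.
    if timestep <= warmup_timesteps:
        return initial_epsilon
    # After warmup_timesteps + epsilon_timesteps, epsilon stays final.
    if epsilon_timesteps <= 0 or timestep >= warmup_timesteps + epsilon_timesteps:
        return final_epsilon
    # In between: interpolate linearly (assumed decay shape).
    fraction = (timestep - warmup_timesteps) / epsilon_timesteps
    return initial_epsilon + fraction * (final_epsilon - initial_epsilon)


# The documented example: 20k warmup + 50k decay -> final value at 70k.
params = dict(warmup_timesteps=20_000, epsilon_timesteps=50_000)
for t in (0, 20_000, 45_000, 70_000, 100_000):
    print(t, epsilon_at(t, **params))
```

If a different decay shape is needed, passing a custom epsilon_schedule replaces the schedule that would otherwise be built from these parameters.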
