ray.rllib.utils.exploration.exploration.Exploration

class ray.rllib.utils.exploration.exploration.Exploration(action_space: gymnasium.spaces.Space, *, framework: str, policy_config: dict, model: ModelV2, num_workers: int, worker_index: int)

Implements an exploration strategy for Policies.

An Exploration takes model outputs, a distribution, and a timestep from the agent and computes an action to apply to the environment using an implemented exploration schema.
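
In practice, a concrete Exploration subclass is usually not constructed by hand; it is selected through the algorithm/policy configuration. The following is a minimal configuration sketch, assuming the old RLlib API stack; DQNConfig, the .exploration() setter, the built-in "EpsilonGreedy" type, and the epsilon parameter names shown here are illustrative assumptions and are not documented on this page.

    # Minimal configuration sketch (assumed old API stack): the exploration_config
    # dict selects the Exploration subclass via its "type" key.
    from ray.rllib.algorithms.dqn import DQNConfig

    config = (
        DQNConfig()
        .environment("CartPole-v1")
        .exploration(
            explore=True,
            exploration_config={
                "type": "EpsilonGreedy",     # built-in Exploration subclass
                "initial_epsilon": 1.0,      # start fully exploratory
                "final_epsilon": 0.02,       # anneal down to 2% random actions
                "epsilon_timesteps": 10000,  # timesteps over which to anneal epsilon
            },
        )
    )
    algo = config.build()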

Methods

__init__

Initializes an Exploration instance, given the action space in which to explore.

before_compute_actions

Hook for preparations before policy.compute_actions() is called.

get_exploration_action

Returns a (possibly) exploratory action and its log-likelihood (see the sketch after this list).

get_exploration_optimizer

May add optimizer(s) to the Policy's own optimizers.

get_state

Returns the current exploration state.

on_episode_end

Handles necessary exploration logic at the end of an episode.

on_episode_start

Handles necessary exploration logic at the beginning of an episode.

postprocess_trajectory

Handles post-processing of done episode trajectories.

set_state

Sets the Exploration object's state to the given values.
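
For orientation, the sketch below shows how the central hook, get_exploration_action, might be overridden in a custom subclass. The keyword-only arguments (action_distribution, timestep, explore) and the ActionDistribution methods sample(), deterministic_sample(), and sampled_action_logp() are assumptions drawn from the broader RLlib API, not from this page.

    # Rough sketch of a custom Exploration subclass (not the library's own
    # implementation): sample stochastically while exploring, act greedily otherwise.
    from ray.rllib.utils.exploration.exploration import Exploration


    class SampleWhenExploring(Exploration):
        """Samples from the action distribution when exploring, else acts greedily."""

        def get_exploration_action(self, *, action_distribution, timestep, explore=True):
            if explore:
                # Stochastic action drawn from the model's action distribution.
                action = action_distribution.sample()
            else:
                # Deterministic ("greedy") action.
                action = action_distribution.deterministic_sample()
            # Log-likelihood of the action that was just produced.
            logp = action_distribution.sampled_action_logp()
            return action, logp

RLlib's built-in StochasticSampling strategy behaves along these lines; stateful exploration schemas would additionally override get_state and set_state (and possibly get_exploration_optimizer) from the table above.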