ray.rllib.utils.exploration.curiosity.Curiosity.__init__
- Curiosity.__init__(action_space: gymnasium.spaces.Space, *, framework: str, model: ModelV2, feature_dim: int = 288, feature_net_config: dict | None = None, inverse_net_hiddens: Tuple[int] = (256,), inverse_net_activation: str = 'relu', forward_net_hiddens: Tuple[int] = (256,), forward_net_activation: str = 'relu', beta: float = 0.2, eta: float = 1.0, lr: float = 0.001, sub_exploration: Dict[str, Any] | type | str | None = None, **kwargs)
Initializes a Curiosity object.
The default hyperparameters are those described in [1].
- Parameters:
feature_dim – The dimensionality of the feature (phi) vectors.
feature_net_config – Optional model configuration for the feature network, producing feature vectors (phi) from observations. This can be used to configure fcnet- or conv_net setups to properly process any observation space.
inverse_net_hiddens – Tuple of the layer sizes of the inverse (action predicting) NN head (on top of the feature outputs for phi and phi’).
inverse_net_activation – Activation specifier for the inverse net.
forward_net_hiddens – Tuple of the layer sizes of the forward (phi’ predicting) NN head.
forward_net_activation – Activation specifier for the forward net.
beta – Weight of the forward loss in the combined loss term. The inverse loss is weighted by 1.0 - beta.
eta – Scaling factor applied to the intrinsic rewards before they are added to the extrinsic ones.
lr – The learning rate for the curiosity-specific optimizer, optimizing feature-, inverse-, and forward nets.
sub_exploration – The config dict for the underlying Exploration to use (e.g. epsilon-greedy for DQN). If None, uses the FromSpecDict provided in the Policy’s default config.
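As a minimal sketch of how these parameters fit together, the block below builds an exploration config dict whose keys mirror the constructor's keyword arguments, using the defaults from the signature above. The "type" key, the contents of feature_net_config, and the "StochasticSampling" sub-exploration are assumptions chosen for illustration. The helper functions combined_loss and total_reward are hypothetical names, not part of RLlib; they only restate the weighting that the beta and eta descriptions imply.

```python
# Illustrative exploration config. Keys mirror the keyword arguments of
# Curiosity.__init__; values are the defaults shown in the signature.
curiosity_config = {
    "type": "Curiosity",  # assumption: exploration class selected by name
    "feature_dim": 288,
    # Assumption: an empty fcnet that maps observations straight to phi.
    "feature_net_config": {"fcnet_hiddens": [], "fcnet_activation": "relu"},
    "inverse_net_hiddens": (256,),
    "inverse_net_activation": "relu",
    "forward_net_hiddens": (256,),
    "forward_net_activation": "relu",
    "beta": 0.2,
    "eta": 1.0,
    "lr": 0.001,
    # Assumption: stochastic sampling as the underlying exploration.
    "sub_exploration": {"type": "StochasticSampling"},
}


def combined_loss(forward_loss: float, inverse_loss: float, beta: float) -> float:
    """Hypothetical helper: forward loss weighted by beta, inverse by 1 - beta."""
    return beta * forward_loss + (1.0 - beta) * inverse_loss


def total_reward(extrinsic: float, intrinsic: float, eta: float) -> float:
    """Hypothetical helper: intrinsic reward scaled by eta, then added."""
    return extrinsic + eta * intrinsic


loss = combined_loss(
    forward_loss=1.0, inverse_loss=2.0, beta=curiosity_config["beta"]
)
reward = total_reward(extrinsic=1.0, intrinsic=0.5, eta=curiosity_config["eta"])
```

A dict of this shape would typically be supplied as the algorithm's exploration config. The exact mechanism depends on the RLlib version in use, so check it against the version you have installed.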