ray.rllib.policy.torch_policy_v2.TorchPolicyV2.extra_action_out
- TorchPolicyV2.extra_action_out(input_dict: Dict[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], state_batches: List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], model: ray.rllib.models.torch.torch_modelv2.TorchModelV2, action_dist: ray.rllib.models.torch.torch_action_dist.TorchDistributionWrapper) → Dict[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]
Returns dict of extra info to include in experience batch.
- Parameters
input_dict – Dict of model input tensors.
state_batches – List of state tensors.
model – Reference to the model object.
action_dist – Torch action distribution object used to compute log-probs (e.g. for already sampled actions).
- Returns
Extra outputs to return in a compute_actions_from_input_dict() call (3rd return value).
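A common use of this hook is to attach additional per-step tensors (such as value-function estimates) to the experience batch. The following is a minimal sketch of the merge pattern only, not RLlib's actual implementation: `DummyModel` and the `"vf_preds"` key are illustrative assumptions, and NumPy arrays stand in for the model's tensors.

```python
import numpy as np

class DummyModel:
    """Stand-in for a TorchModelV2 whose value_function() returns one
    value estimate per item in the input batch (illustrative only)."""
    def value_function(self):
        return np.zeros(4, dtype=np.float32)

def extra_action_out(base_extras, model):
    # Copy the extras produced so far (e.g. action log-probs) and add the
    # model's value-function output; everything returned here is stored
    # in the collected experience batch.
    extras = dict(base_extras)
    extras["vf_preds"] = model.value_function()
    return extras

out = extra_action_out({"action_logp": np.zeros(4, dtype=np.float32)},
                       DummyModel())
```

In an actual override, one would typically call `super().extra_action_out(...)` to obtain the base extras and return the merged dict from the subclass method.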