ray.rllib.policy.torch_policy_v2.TorchPolicyV2.extra_action_out

TorchPolicyV2.extra_action_out(input_dict: Dict[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], state_batches: List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], model: ray.rllib.models.torch.torch_modelv2.TorchModelV2, action_dist: ray.rllib.models.torch.torch_action_dist.TorchDistributionWrapper) → Dict[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]

Returns dict of extra info to include in experience batch.

Parameters
  • input_dict – Dict of model input tensors.

  • state_batches – List of state tensors.

  • model – Reference to the model object.

  • action_dist – Torch action distribution object, e.g. for computing the log-probs of already sampled actions.

Returns

Extra outputs to return as the third return value of a compute_actions_from_input_dict() call.
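
A minimal sketch of a typical override, assuming a custom policy subclass (MyTorchPolicy is hypothetical): it adds the model's value-function estimate to the experience batch under the standard SampleBatch.VF_PREDS key. Note that TorchModelV2.value_function() is only meaningful after a forward pass, which has already happened by the time this hook is called.

    from ray.rllib.policy.sample_batch import SampleBatch
    from ray.rllib.policy.torch_policy_v2 import TorchPolicyV2


    class MyTorchPolicy(TorchPolicyV2):
        def extra_action_out(self, input_dict, state_batches, model, action_dist):
            # Keep whatever extras the parent implementation provides.
            extra = super().extra_action_out(
                input_dict, state_batches, model, action_dist
            )
            # Store the model's value-function estimate so it lands in the
            # sample batch next to the computed actions; value_function() is
            # valid here because a forward pass has already been run.
            extra[SampleBatch.VF_PREDS] = model.value_function()
            return extra

Each key in the returned dict becomes a column in the collected SampleBatch, so the extra values stay aligned with the actions, observations, and rewards of the same timesteps.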