ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2.compute_gradients_fn
- EagerTFPolicyV2.compute_gradients_fn(policy: Policy, optimizer: torch.optim.Optimizer | tf.keras.optimizers.Optimizer, loss: numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor) → List[Tuple[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]] | List[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]
Computes gradients from the loss tensor, using the local optimizer.
- Parameters:
policy – The Policy object that generated the loss tensor and that holds the given local optimizer.
optimizer – The local TensorFlow optimizer object used to calculate the gradients.
loss – The loss tensor for which gradients should be calculated.
- Returns:
- List of (gradient, variable) tuples, where the gradients may have been clipped.
- Return type:
ModelGradients
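The return value described above can be illustrated with a minimal, framework-free sketch. It is not RLlib's implementation: `compute_gradients_sketch`, `clip_by_global_norm`, the quadratic loss, and the `clip_norm` parameter are all hypothetical names chosen for illustration, and plain Python floats stand in for tensors and variables. It only shows the shape of the result, a list of (gradient, variable) tuples whose gradients are optionally clipped by their global norm.

```python
import math
from typing import List, Optional, Tuple


def clip_by_global_norm(grads: List[float], clip_norm: float) -> List[float]:
    """Rescale all gradients so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= clip_norm:
        # Already within the limit: return the gradients unchanged.
        return list(grads)
    scale = clip_norm / global_norm
    return [g * scale for g in grads]


def compute_gradients_sketch(
    variables: List[float],
    targets: List[float],
    clip_norm: Optional[float] = None,
) -> List[Tuple[float, float]]:
    """Hypothetical stand-in for a gradient-computing function.

    Assumes the loss is sum((w - t) ** 2), so d(loss)/dw = 2 * (w - t).
    Returns a list of (gradient, variable) tuples.
    """
    grads = [2.0 * (w - t) for w, t in zip(variables, targets)]
    if clip_norm is not None:
        grads = clip_by_global_norm(grads, clip_norm)
    return list(zip(grads, variables))


# Unclipped gradients are 6.0 and -8.0 (global norm 10.0);
# with clip_norm=1.0 they are scaled down by a factor of 10.
grads_and_vars = compute_gradients_sketch([3.0, 0.0], [0.0, 4.0], clip_norm=1.0)
for grad, var in grads_and_vars:
    print(f"variable={var}, gradient={grad:.3f}")
```

With TensorFlow tensors, the clipping step would typically use `tf.clip_by_global_norm`, which returns the clipped list together with the global norm.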