ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2.compute_gradients_fn

EagerTFPolicyV2.compute_gradients_fn(policy: ray.rllib.policy.policy.Policy, optimizer: Union[torch.optim.Optimizer, tf.keras.optimizers.Optimizer], loss: Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]) → Union[List[Tuple[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]], List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]]

Computes gradients from the loss tensor, using the local optimizer.

Parameters
  • policy – The Policy object that generated the loss tensor and that holds the given local optimizer.

  • optimizer – The local TensorFlow optimizer object used to calculate the gradients.

  • loss – The loss tensor for which gradients should be calculated.

Returns

A list of (gradient, variable) tuples, in which the gradients may have been clipped.

Return type

ModelGradients
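The shape of the return value described above can be illustrated with a minimal, framework-free sketch. It does not use RLlib or TensorFlow: plain floats stand in for tensors, the quadratic loss is chosen only so the gradient can be written by hand, and the names `clip_by_global_norm` and `grads_and_vars_sketch` are hypothetical helpers introduced for this illustration. Clipping by global norm is used here as one common choice; the source does not say which clipping scheme a given policy applies.

```python
import math
from typing import List, Optional, Tuple


def clip_by_global_norm(grads: List[float], clip_norm: float) -> List[float]:
    """Rescale all gradients so that their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= clip_norm:
        # Already within the limit: return the gradients unchanged.
        return list(grads)
    scale = clip_norm / global_norm
    return [g * scale for g in grads]


def grads_and_vars_sketch(
    variables: List[float],
    targets: List[float],
    clip_norm: Optional[float] = None,
) -> List[Tuple[float, float]]:
    """Return (gradient, variable) tuples for loss = sum((v - t) ** 2).

    Hypothetical stand-in for a gradient function: a real implementation
    would obtain the gradients from the optimizer and the loss tensor.
    """
    # Analytic gradient of the quadratic loss: d loss / d v = 2 * (v - t).
    grads = [2.0 * (v - t) for v, t in zip(variables, targets)]
    if clip_norm is not None:
        grads = clip_by_global_norm(grads, clip_norm)
    # Pair each (possibly clipped) gradient with the variable it belongs to.
    return list(zip(grads, variables))


if __name__ == "__main__":
    print(grads_and_vars_sketch([3.0, 4.0], [0.0, 0.0]))
    print(grads_and_vars_sketch([3.0, 4.0], [0.0, 0.0], clip_norm=5.0))
```

Each tuple keeps a gradient next to the variable it updates, which is the pairing an optimizer needs when the gradients are applied later.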