ray.rllib.utils.torch_utils.apply_grad_clipping(policy: TorchPolicy, optimizer: torch.optim.Optimizer | tf.keras.optimizers.Optimizer, loss: numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor) → Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]

Applies gradient clipping to the already computed gradients inside the optimizer.

Note: This function does NOT perform an operation analogous to tf.clip_by_global_norm. It merely clips by norm (per gradient tensor) and then computes the global norm across all given tensors (but without clipping by that global norm).
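The distinction drawn in the note can be illustrated with a minimal pure-Python sketch. This is not RLlib's implementation: plain lists of floats stand in for gradient tensors, and the helper names (l2_norm, clip_each_by_norm, global_norm, clip_by_global_norm) are hypothetical, introduced only for this illustration.

```python
import math

# Hypothetical helpers for illustration only; lists of floats stand in
# for gradient tensors. This is not how RLlib implements the function.

def l2_norm(tensor):
    """L2 norm of one 'tensor' (a flat list of floats)."""
    return math.sqrt(sum(x * x for x in tensor))

def clip_each_by_norm(grads, clip_value):
    """Rescale each tensor independently so its own norm is at most clip_value."""
    clipped = []
    for g in grads:
        norm = l2_norm(g)
        scale = min(1.0, clip_value / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    return clipped

def global_norm(grads):
    """L2 norm taken over all tensors together."""
    return math.sqrt(sum(l2_norm(g) ** 2 for g in grads))

def clip_by_global_norm(grads, clip_value):
    """Rescale all tensors by one shared factor so the global norm is at most clip_value."""
    norm = global_norm(grads)
    scale = min(1.0, clip_value / norm) if norm > 0 else 1.0
    return [[x * scale for x in g] for g in grads]

grads = [[3.0, 4.0], [6.0, 8.0]]  # per-tensor norms: 5 and 10
clip_value = 5.0

# Per-tensor clipping: each tensor ends up with norm <= clip_value,
# but the global norm across both tensors can still exceed clip_value.
per_tensor = clip_each_by_norm(grads, clip_value)
per_tensor_global = global_norm(per_tensor)

# Global-norm clipping: the norm across all tensors is capped instead.
global_clipped = clip_by_global_norm(grads, clip_value)
```

With per-tensor clipping, the reported global norm is a diagnostic value computed after the fact and may be larger than the clip value, whereas clipping by global norm bounds that quantity directly.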

Parameters:

  • policy – The TorchPolicy that calculated the loss.

  • optimizer – A local torch optimizer object.

  • loss – The torch loss tensor.

Returns:

An info dict containing the “grad_norm” key and the resulting clipped gradients.