ray.rllib.policy.Policy.apply_gradients
- Policy.apply_gradients(gradients: List[Tuple[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]] | List[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]) → None
Applies the (previously) computed gradients.
Either this method in combination with compute_gradients(), or learn_on_batch(), must be implemented by subclasses.
- Parameters:
gradients – The already calculated gradients to apply to this Policy.
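
The contract described above (implement either compute_gradients() together with apply_gradients(), or learn_on_batch()) can be illustrated with a minimal sketch. ToyLinearPolicy below is a hypothetical stand-in written in pure Python; it is not an RLlib class, does not subclass ray.rllib.policy.Policy, and uses plain floats where a real policy would use framework tensors. The batch format, the plain SGD update, and the (gradients, info) return shape of compute_gradients() are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Batch = List[Tuple[float, float]]  # hypothetical batch format: (x, y) pairs


class ToyLinearPolicy:
    """Hypothetical stand-in (NOT an RLlib class): fits y = w * x."""

    def __init__(self, lr: float = 0.1) -> None:
        self.w = 0.0  # single trainable weight
        self.lr = lr  # learning rate for the SGD step

    def compute_gradients(self, batch: Batch) -> Tuple[List[float], Dict[str, float]]:
        # Mean squared error and its derivative with respect to w.
        n = len(batch)
        loss = sum((self.w * x - y) ** 2 for x, y in batch) / n
        grad = sum(2.0 * (self.w * x - y) * x for x, y in batch) / n
        # Assumed return shape for this sketch: (gradients, info dict).
        return [grad], {"loss": loss}

    def apply_gradients(self, gradients: List[float]) -> None:
        # Apply the previously computed gradients with a plain SGD step.
        self.w -= self.lr * gradients[0]

    def learn_on_batch(self, batch: Batch) -> Dict[str, float]:
        # One-call alternative: compute the gradients, then apply them.
        grads, info = self.compute_gradients(batch)
        self.apply_gradients(grads)
        return info


policy = ToyLinearPolicy(lr=0.1)
batch = [(1.0, 2.0), (2.0, 4.0)]

# Two-step path: gradients are computed first and applied afterwards.
grads, info = policy.compute_gradients(batch)
policy.apply_gradients(grads)
```

Keeping the two steps separate lets the caller inspect, aggregate, or modify the gradients between computing and applying them, while learn_on_batch() covers the case where both happen in a single call.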