ray.rllib.evaluation.rollout_worker.RolloutWorker.apply_gradients

RolloutWorker.apply_gradients(grads: Union[List[Tuple[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]], List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], Dict[str, Union[List[Tuple[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]], List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]]]]) → None

Applies the given gradients to this worker’s models.

Delegates the update to the apply_gradients method of the worker’s Policy (single-agent case) or Policies (multi-agent case).

Parameters

grads – A single ModelGradients struct (single-agent case) or a dict mapping PolicyIDs to the respective ModelGradients structs (multi-agent case).

Examples

>>> import gymnasium as gym
>>> from ray.rllib.evaluation.rollout_worker import RolloutWorker
>>> from ray.rllib.algorithms.pg.pg_tf_policy import PGTF1Policy
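>>> # Build a worker with a single default policy on CartPole.
>>> worker = RolloutWorker(
...   env_creator=lambda _: gym.make("CartPole-v1"),
...   default_policy_class=PGTF1Policy)
>>> # Collect a batch, then compute gradients on it.
>>> samples = worker.sample()
>>> grads, info = worker.compute_gradients(samples)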
>>> worker.apply_gradients(grads)
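
The two accepted shapes of grads (a single gradients struct, or a dict keyed by PolicyID) can be pictured with a small, Ray-free sketch. ToyPolicy and ToyWorker below are hypothetical stand-ins, not RLlib classes. The "default_policy" key and the plain SGD update are assumptions made for illustration only, and the sketch is not the library's implementation.

```python
from typing import Dict, List, Union

# Hypothetical stand-in for a ModelGradients struct: one float per weight.
Grads = List[float]


class ToyPolicy:
    """Hypothetical policy that holds weights and applies plain SGD."""

    def __init__(self, weights: List[float], lr: float = 0.5) -> None:
        self.weights = list(weights)
        self.lr = lr

    def apply_gradients(self, grads: Grads) -> None:
        # One SGD step: w <- w - lr * g for every weight.
        self.weights = [w - self.lr * g for w, g in zip(self.weights, grads)]


class ToyWorker:
    """Hypothetical worker that forwards gradients to its policies."""

    def __init__(self, policy_map: Dict[str, ToyPolicy]) -> None:
        self.policy_map = policy_map

    def apply_gradients(self, grads: Union[Grads, Dict[str, Grads]]) -> None:
        if isinstance(grads, dict):
            # Dict case: each entry goes to the policy with the matching ID.
            for policy_id, policy_grads in grads.items():
                self.policy_map[policy_id].apply_gradients(policy_grads)
        else:
            # Single-struct case: apply to the one default policy
            # (the key name "default_policy" is an assumption of this sketch).
            self.policy_map["default_policy"].apply_gradients(grads)


# Single-agent usage: pass the gradients struct directly.
single = ToyWorker({"default_policy": ToyPolicy([1.0, 2.0])})
single.apply_gradients([2.0, 4.0])

# Multi-agent usage: pass a dict; policies without an entry are left unchanged.
multi = ToyWorker({"a": ToyPolicy([1.0]), "b": ToyPolicy([3.0])})
multi.apply_gradients({"a": [1.0]})
```

In the dict case only the policies named in the dict are updated, which is why policy "b" keeps its weights in the usage lines above.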