ray.rllib.core.learner.learner_group.LearnerGroup.additional_update

LearnerGroup.additional_update(*, reduce_fn: Callable[[dict | NestedDict], dict | NestedDict] = <function _reduce_mean_results>, **kwargs) → Dict[str, Any] | List[Dict[str, Any]]

Apply additional non-gradient-based updates to the Learners.

For example, this could be used to perform a Polyak averaging update of a target network in off-policy algorithms such as SAC or DQN.
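To illustrate the kind of non-gradient update meant here, the following is a minimal, library-independent sketch of Polyak averaging. It is not RLlib's implementation and does not call any RLlib API. The names `polyak_update`, `Params`, and `tau` are hypothetical and chosen only for this example, and parameters are modeled as plain Python lists rather than tensors.

```python
from typing import Dict, List

# Hypothetical parameter container for this sketch: name -> flat list of weights.
Params = Dict[str, List[float]]


def polyak_update(target: Params, online: Params, tau: float) -> Params:
    """Return new target parameters: tau * online + (1 - tau) * target."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    return {
        name: [tau * o + (1.0 - tau) * t for o, t in zip(online[name], target[name])]
        for name in target
    }


# Toy weights for an "online" network and its slowly moving "target" copy.
online = {"w": [1.0, 2.0], "b": [0.5]}
target = {"w": [0.0, 0.0], "b": [0.0]}

# One soft update; no gradients are involved.
target = polyak_update(target, online, tau=0.5)
```

A small `tau` moves the target network only slightly toward the online network on each call, while `tau=1.0` copies the online weights outright.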