LearnerGroup.additional_update(*, reduce_fn=-1, **kwargs) → Dict[str, Any] | List[Dict[str, Any]]

Apply additional non-gradient-based updates to the Learners.

For example, this could be used to perform a Polyak averaging update of a target network in off-policy algorithms such as SAC or DQN.
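As a rough illustration of the kind of non-gradient update meant here, the sketch below performs a Polyak (soft) update, target = tau * online + (1 - tau) * target, on plain Python lists. The function name `polyak_update`, the `tau` parameter, and the dict-of-lists parameter layout are assumptions made for this example only; they are not part of the RLlib API.

```python
from typing import Dict, List

# Hypothetical parameter container for this sketch: name -> flat list of floats.
Params = Dict[str, List[float]]


def polyak_update(target: Params, online: Params, tau: float) -> Params:
    """Return new target parameters: tau * online + (1 - tau) * target."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return {
        name: [
            tau * o + (1.0 - tau) * t
            for o, t in zip(online[name], target[name])
        ]
        for name in target
    }


# Toy weights for an online network and its target network.
online = {"w": [1.0, 2.0], "b": [0.5]}
target = {"w": [0.0, 0.0], "b": [0.0]}

# No gradients are involved: the target simply moves part of the way
# toward the online weights.
target = polyak_update(target, online, tau=0.5)
```

A small `tau` makes the target network trail the online network slowly, while `tau=1.0` reduces to a hard copy.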

By default, this is a pass-through that calls the additional_update(**kwargs) method of every Learner worker.


Returns: A list of dictionaries of results returned by the Learner.additional_update() calls.
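The pass-through behavior and the list-shaped result can be pictured with the minimal sketch below. `ToyLearner` and `ToyLearnerGroup` are hypothetical stand-ins written for this example, not RLlib classes; the sketch ignores remote workers and does not model `reduce_fn`.

```python
from typing import Any, Dict, List


class ToyLearner:
    """Hypothetical stand-in for a Learner (not the RLlib class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.num_updates = 0

    def additional_update(self, **kwargs: Any) -> Dict[str, Any]:
        # A real Learner would do its non-gradient work here,
        # e.g. a target-network sync.
        self.num_updates += 1
        return {
            "learner": self.name,
            "num_updates": self.num_updates,
            "kwargs": kwargs,
        }


class ToyLearnerGroup:
    """Hypothetical group that forwards the call to every learner."""

    def __init__(self, learners: List[ToyLearner]) -> None:
        self._learners = learners

    def additional_update(self, **kwargs: Any) -> List[Dict[str, Any]]:
        # Pass-through: same keyword arguments go to each learner,
        # and one result dict is collected per learner.
        return [
            learner.additional_update(**kwargs)
            for learner in self._learners
        ]


group = ToyLearnerGroup([ToyLearner("a"), ToyLearner("b")])
results = group.additional_update(timestep=100)
```

Here `results` holds one dictionary per learner, in the order the learners were given to the group.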