LearnerGroup.additional_update(*, reduce_fn: Callable[[dict], dict] = _reduce_mean_results, **kwargs) → Dict[str, Any] | List[Dict[str, Any]]

Apply additional non-gradient based updates to the Learners.

For example, this could be used to perform a Polyak averaging update of a target network in off-policy algorithms such as SAC or DQN.
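As a rough illustration of what a Polyak averaging update computes, the sketch below blends "online" parameters into "target" parameters using plain Python lists. The function name `polyak_update`, the parameter dictionaries, and the `tau` value are all hypothetical and are not part of the RLlib API; a real Learner would apply the same rule to framework tensors.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def polyak_update(target: Params, online: Params, tau: float) -> Params:
    """Return new target parameters: tau * online + (1 - tau) * target.

    Hypothetical helper for illustration only; not an RLlib function.
    """
    return {
        name: [
            tau * o + (1.0 - tau) * t
            for o, t in zip(online[name], target[name])
        ]
        for name in target
    }


# Toy parameters standing in for network weights (assumed values).
online = {"w": [1.0, 2.0], "b": [0.5]}
target = {"w": [0.0, 0.0], "b": [0.0]}

# With tau = 0.5, the target moves halfway toward the online parameters.
target = polyak_update(target, online, tau=0.5)
```

A small `tau` makes the target network track the online network slowly, which is the usual reason for running this step outside the gradient update.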

By default, this is a pass-through that calls Learner.additional_update on each Learner.

Parameters:

  • reduce_fn – See the update() documentation for more details.

  • **kwargs – Keyword arguments to pass to each Learner.


Returns:

A list of dictionaries containing the update results from each worker.
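To show how a list of per-worker result dictionaries might be collapsed into a single dictionary, here is a minimal sketch of a mean reducer. It is not RLlib's `_reduce_mean_results`; the name `mean_reduce`, the metric keys, and the assumption that every result is a flat dictionary of numbers with identical keys are made up for illustration.

```python
from typing import Dict, List


def mean_reduce(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each key across a list of flat result dictionaries.

    Hypothetical helper; assumes all dictionaries share the same keys.
    """
    if not results:
        return {}
    keys = results[0].keys()
    return {k: sum(r[k] for r in results) / len(results) for k in keys}


# Assumed per-worker results, one dictionary per worker.
per_worker = [
    {"loss": 1.0, "td_error": 0.5},
    {"loss": 3.0, "td_error": 1.5},
]

reduced = mean_reduce(per_worker)
```

Results that contain nested dictionaries would need a tree-aware reduction rather than this flat version.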