ray.rllib.core.learner.learner_group.LearnerGroup.additional_update

LearnerGroup.additional_update(*, reduce_fn: Callable[[dict | NestedDict], dict | NestedDict] = <function _reduce_mean_results>, **kwargs) -> Dict[str, Any] | List[Dict[str, Any]]

Apply additional non-gradient-based updates to the Learners.

For example, this could be used to apply a Polyak averaging update to a target network in off-policy algorithms like SAC or DQN.

By default, this is a pass-through that calls Learner.additional_update on each Learner.
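The following is a minimal sketch of the idea, not RLlib's implementation. `ToyLearner`, `ToyLearnerGroup`, the `tau` keyword, and the `target_gap` result key are all hypothetical names chosen for illustration. It shows a Polyak averaging step applied per learner, with the group simply forwarding its keyword arguments.

```python
from typing import Any, Dict, List


class ToyLearner:
    """Hypothetical stand-in for a Learner; not the RLlib class."""

    def __init__(self, online: List[float], target: List[float]) -> None:
        self.online = online  # weights of the trained (online) network
        self.target = target  # weights of the slowly moving target network

    def additional_update(self, *, tau: float) -> Dict[str, Any]:
        # Polyak averaging: target <- tau * online + (1 - tau) * target.
        self.target = [
            tau * w + (1.0 - tau) * t for w, t in zip(self.online, self.target)
        ]
        # Report how far the target still is from the online weights.
        gap = sum(abs(w - t) for w, t in zip(self.online, self.target))
        return {"target_gap": gap}


class ToyLearnerGroup:
    """Hypothetical pass-through group that forwards kwargs to each learner."""

    def __init__(self, learners: List[ToyLearner]) -> None:
        self.learners = learners

    def additional_update(self, **kwargs: Any) -> List[Dict[str, Any]]:
        # No gradients involved: each learner runs its own update in turn.
        return [learner.additional_update(**kwargs) for learner in self.learners]


group = ToyLearnerGroup(
    [
        ToyLearner(online=[1.0, 2.0], target=[0.0, 0.0]),
        ToyLearner(online=[4.0], target=[2.0]),
    ]
)
results = group.additional_update(tau=0.5)
```

In this toy setup each call moves every target halfway toward its online weights, and `results` holds one dictionary per learner.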

Parameters:
  • reduce_fn – See update() documentation for more details.

  • **kwargs – Keyword arguments to pass to each Learner.

Returns:

A list of result dictionaries, one from each worker's update, or a single result dictionary, as the return annotation allows.
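As a rough illustration of how a reducing function could turn per-worker results into a single dictionary, here is a sketch under stated assumptions. `reduce_mean` is a hypothetical helper, not the library's `_reduce_mean_results`, and the flat `Dict[str, float]` result shape and the `target_gap` key are assumed for simplicity.

```python
from typing import Dict, List


def reduce_mean(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Hypothetical reducer: average each key across per-worker results."""
    # Assumes every result dictionary has the same keys as the first one.
    keys = results[0].keys()
    return {k: sum(r[k] for r in results) / len(results) for k in keys}


# Hypothetical per-worker results, one dictionary per worker.
per_worker = [{"target_gap": 1.5}, {"target_gap": 1.0}]
reduced = reduce_mean(per_worker)
```

A callable of this shape is the kind of thing one might pass as `reduce_fn`; see the update() documentation for the exact contract.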