ray.rllib.core.learner.learner.Learner.before_gradient_based_update
- Learner.before_gradient_based_update(*, timesteps: Dict[str, Any]) -> None
Called before gradient-based updates are performed.
Override this method to implement custom preparation, logging, or non-gradient-based Learner/RLModule update logic that must run before the gradient-based updates.
- Parameters:
timesteps – Timesteps dict, which must have the key NUM_ENV_STEPS_SAMPLED_LIFETIME. TODO (sven): Make this a more formal structure with its own type.
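The override pattern described above can be sketched as follows. This is a minimal illustration, not RLlib's own implementation: StandInLearner is a hypothetical stand-in for Learner so that the snippet runs without Ray installed, the string value of NUM_ENV_STEPS_SAMPLED_LIFETIME is assumed to match the constant in ray.rllib.utils.metrics, and the entropy-coefficient schedule is invented purely for illustration.

```python
from typing import Any, Dict

# Assumption: mirrors the string constant defined in ray.rllib.utils.metrics.
NUM_ENV_STEPS_SAMPLED_LIFETIME = "num_env_steps_sampled_lifetime"


class StandInLearner:
    """Hypothetical stand-in for ray.rllib.core.learner.learner.Learner."""

    def before_gradient_based_update(self, *, timesteps: Dict[str, Any]) -> None:
        # Default hook: nothing to prepare.
        pass

    def update(self, *, timesteps: Dict[str, Any]) -> None:
        # Sketch of the call order only: the hook runs first.
        self.before_gradient_based_update(timesteps=timesteps)
        # ... gradient-based updates would run here ...


class DecayingCoeffLearner(StandInLearner):
    """Linearly decays a hypothetical entropy coefficient over env steps."""

    def __init__(self, start: float, end: float, decay_steps: int) -> None:
        self.start = start
        self.end = end
        self.decay_steps = decay_steps
        self.entropy_coeff = start

    def before_gradient_based_update(self, *, timesteps: Dict[str, Any]) -> None:
        super().before_gradient_based_update(timesteps=timesteps)
        # The required key; a missing key raises KeyError.
        ts = timesteps[NUM_ENV_STEPS_SAMPLED_LIFETIME]
        # Non-gradient-based update: interpolate from start to end.
        frac = min(ts / self.decay_steps, 1.0)
        self.entropy_coeff = self.start + frac * (self.end - self.start)


learner = DecayingCoeffLearner(start=1.0, end=0.0, decay_steps=1000)
learner.update(timesteps={NUM_ENV_STEPS_SAMPLED_LIFETIME: 250})
print(learner.entropy_coeff)
```

In a real subclass, the same override would go on a concrete Learner class, and calling super() first keeps any base-class preparation intact.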