ray.rllib.utils.replay_buffers.utils.update_priorities_in_replay_buffer
- ray.rllib.utils.replay_buffers.utils.update_priorities_in_replay_buffer(replay_buffer: ReplayBuffer, config: dict, train_batch: SampleBatch | MultiAgentBatch | Dict[str, Any], train_results: Dict) -> None
Updates the priorities in a prioritized replay buffer, given training results.
The abs(TD-error) from the loss (inside train_results) is used as the new priorities for the row indices that were sampled for the train batch. Does nothing if the given buffer does not support prioritized replay. A usage sketch follows the parameter list below.
- Parameters:
replay_buffer – The replay buffer, whose priority values to update. This may also be a buffer that does not support priorities.
config – The Algorithm’s config dict.
train_batch – The batch used for the training update.
train_results – A train results dict, generated by, e.g., the train_one_step() utility.
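
A minimal sketch of how this utility is typically called from a custom training step. It assumes an Algorithm instance (algo) and its config dict already exist in the surrounding scope; the buffer capacity, alpha, and sampling parameters are illustrative values, not recommendations.

```python
from ray.rllib.execution.train_ops import train_one_step
from ray.rllib.utils.replay_buffers import PrioritizedReplayBuffer
from ray.rllib.utils.replay_buffers.utils import (
    update_priorities_in_replay_buffer,
)

# A buffer that supports priorities; with a plain ReplayBuffer,
# update_priorities_in_replay_buffer would simply be a no-op.
buffer = PrioritizedReplayBuffer(capacity=50_000, alpha=0.6)

# ... fill the buffer with experiences via buffer.add(batch) ...

# Sample a train batch and run one training update on it.
train_batch = buffer.sample(num_items=32, beta=0.4)
train_results = train_one_step(algo, train_batch)

# Write abs(TD-error) from train_results back as new priorities
# for the rows that were sampled into train_batch.
update_priorities_in_replay_buffer(
    replay_buffer=buffer,
    config=config,  # the Algorithm's config dict
    train_batch=train_batch,
    train_results=train_results,
)
```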