- ray.rllib.utils.replay_buffers.utils.update_priorities_in_replay_buffer(replay_buffer: ray.rllib.utils.replay_buffers.replay_buffer.ReplayBuffer, config: dict, train_batch: Union[ray.rllib.policy.sample_batch.SampleBatch, ray.rllib.policy.sample_batch.MultiAgentBatch], train_results: dict) → None
Updates the priorities in a prioritized replay buffer, given training results.
The abs(TD-error) values from the loss (inside train_results) are used as the new priorities for the row indices that were sampled for the train batch.
Does nothing if the given buffer does not support prioritized replay.
replay_buffer – The replay buffer whose priority values should be updated. May also be a buffer that does not support priorities.
config – The Algorithm’s config dict.
train_batch – The batch used for the training update.
train_results – A train results dict, generated by e.g. the
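The update this helper performs can be sketched in isolation. The snippet below is a minimal, hypothetical illustration (the class and function names are invented for the example and are not RLlib's implementation): each sampled row's new priority is abs(TD-error) plus a small epsilon, written back at the row indices that were sampled for the train batch.

```python
class TinyPrioritizedBuffer:
    """Hypothetical minimal stand-in for a prioritized replay buffer,
    used only to illustrate the priority-update pattern."""

    def __init__(self, size):
        # All rows start with a uniform priority of 1.0.
        self.priorities = [1.0] * size

    def update_priorities(self, indices, new_priorities):
        # Overwrite the priority of each sampled row index.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p


def update_priorities_from_td_errors(buffer, td_errors, sampled_indices, eps=1e-6):
    # abs(TD-error) (plus a small epsilon so no priority is exactly zero)
    # becomes the new priority for each row sampled into the train batch.
    new_prios = [abs(e) + eps for e in td_errors]
    buffer.update_priorities(sampled_indices, new_prios)
    return new_prios


buf = TinyPrioritizedBuffer(8)
update_priorities_from_td_errors(buf, td_errors=[0.5, -2.0], sampled_indices=[1, 3])
```

After the call, rows 1 and 3 carry priorities of roughly 0.5 and 2.0, while all other rows keep their initial priority; a buffer without priority support would simply skip this step, as the description above notes.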