Replay Buffer API

The following classes do not separate experiences by policy. Multi-agent replay buffers are explained further below.

Replay Buffer Base Classes


Specifies how batches are structured in a ReplayBuffer.

ReplayBuffer([capacity, storage_unit])

The lowest-level replay buffer interface used by RLlib.

PrioritizedReplayBuffer([capacity, ...])

This buffer implements Prioritized Experience Replay.

ReservoirReplayBuffer([capacity, storage_unit])

This buffer implements reservoir sampling.
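To make the idea of reservoir sampling concrete, here is a minimal pure-Python sketch of the classic Algorithm R. It is an illustration only, not RLlib's implementation: the class name `ReservoirSketch` and its attributes are hypothetical, and it stores plain Python objects rather than sample batches.

```python
import random


class ReservoirSketch:
    """Hypothetical fixed-capacity buffer using reservoir sampling (Algorithm R)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0  # total number of items ever offered to the buffer
        self._rng = random.Random(seed)

    def add(self, item):
        self.num_seen += 1
        if len(self.items) < self.capacity:
            # Buffer not full yet: always keep the item.
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / num_seen,
            # overwriting a uniformly chosen slot.
            slot = self._rng.randrange(self.num_seen)
            if slot < self.capacity:
                self.items[slot] = item

    def sample(self, num_items):
        # Uniform sampling with replacement from the retained items.
        return [self._rng.choice(self.items) for _ in range(num_items)]


buffer = ReservoirSketch(capacity=5, seed=0)
for step in range(1000):
    buffer.add(step)
```

With this scheme every item ever added has the same chance of being in the buffer, which is what distinguishes it from a first-in, first-out buffer that only retains the most recent experiences.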

Public Methods

sample(num_items, **kwargs)

Samples num_items items from this buffer.

add(batch, **kwargs)

Adds a batch of experiences to this buffer.


Returns all local state.


Restores all local state to the provided state.
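The four operations listed above (adding, sampling, returning state, restoring state) can be sketched with a small standalone ring buffer. This is a hedged illustration under simplifying assumptions, not RLlib's `ReplayBuffer`: the class `RingBufferSketch`, the `get_state`/`set_state` method names, and the state dictionary layout are hypothetical, and each "batch" is just a Python dict.

```python
import random


class RingBufferSketch:
    """Hypothetical FIFO replay buffer; not RLlib's ReplayBuffer."""

    def __init__(self, capacity=4, seed=None):
        self.capacity = capacity
        self._storage = []
        self._next_idx = 0  # slot that the next add() will write to
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._storage)

    def add(self, batch):
        # Append until full, then overwrite the oldest entry.
        if len(self._storage) < self.capacity:
            self._storage.append(batch)
        else:
            self._storage[self._next_idx] = batch
        self._next_idx = (self._next_idx + 1) % self.capacity

    def sample(self, num_items):
        # Uniform sampling with replacement.
        return [self._rng.choice(self._storage) for _ in range(num_items)]

    def get_state(self):
        # Return a plain dict that can be pickled or checkpointed.
        return {"storage": list(self._storage), "next_idx": self._next_idx}

    def set_state(self, state):
        # Restore the contents and write position from a saved state.
        self._storage = list(state["storage"])
        self._next_idx = state["next_idx"]


buffer = RingBufferSketch(capacity=4, seed=0)
for t in range(6):
    buffer.add({"obs": t, "reward": float(t)})

# Round-trip the state into a fresh buffer, as one would when restoring
# from a checkpoint.
restored = RingBufferSketch(capacity=4)
restored.set_state(buffer.get_state())
```

After six additions to a buffer of capacity four, the two oldest entries have been overwritten, and the restored buffer holds the same contents and write position as the original.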

Multi Agent Buffers

The following classes use the “single-agent” buffers above as underlying buffers to split experiences between the different agents’ policies. In multi-agent RL, more than one agent exists in the environment, and these agents do not necessarily share a policy (M agents map to N policies, where N <= M). MultiAgentReplayBuffers therefore store the experiences of different policies separately.

MultiAgentReplayBuffer([capacity, ...])

A replay buffer shard for multi-agent setups.


A prioritized replay buffer shard for multi-agent setups.
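The per-policy separation described in this section can be sketched as a container that keeps one underlying buffer per policy ID. This is a simplified illustration, not RLlib's `MultiAgentReplayBuffer`: the class `PerPolicyBufferSketch`, the agent and policy IDs, and the dict-based batch layout are all assumptions made for the example.

```python
import random
from collections import defaultdict, deque


class PerPolicyBufferSketch:
    """Hypothetical container holding one FIFO buffer per policy ID."""

    def __init__(self, capacity_per_policy=100, seed=None):
        self._rng = random.Random(seed)
        # Each policy gets its own bounded buffer, created on first use.
        self.buffers = defaultdict(lambda: deque(maxlen=capacity_per_policy))

    def add(self, multi_agent_batch):
        # multi_agent_batch maps policy ID -> list of experiences.
        for policy_id, experiences in multi_agent_batch.items():
            self.buffers[policy_id].extend(experiences)

    def sample(self, num_items):
        # Sample independently (with replacement) from each policy's buffer.
        result = {}
        for policy_id, buf in self.buffers.items():
            pool = list(buf)
            if pool:
                result[policy_id] = [self._rng.choice(pool) for _ in range(num_items)]
        return result


# Three agents mapped to two policies (hypothetical IDs).
agent_to_policy = {"agent_0": "shared", "agent_1": "shared", "agent_2": "solo"}

buffer = PerPolicyBufferSketch(capacity_per_policy=10, seed=0)
for t in range(5):
    batch = defaultdict(list)
    for agent_id, policy_id in agent_to_policy.items():
        batch[policy_id].append({"agent": agent_id, "t": t})
    buffer.add(batch)

samples = buffer.sample(2)
```

Because each policy has its own buffer, a training step for the "solo" policy never receives experiences collected by agents that act under the "shared" policy.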

Utility Methods


Updates the priorities in a prioritized replay buffer, given training results.


Samples a minimum of n timesteps from a given replay buffer.
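The two utilities above can be illustrated with small standalone helpers. This is a sketch of the general ideas only, not RLlib's functions: the names `update_priorities` and `sample_min_n_steps`, their signatures, and the choice of |TD error| plus a small constant as the new priority are assumptions made for the example.

```python
import random


def update_priorities(priorities, indices, td_errors, eps=1e-6):
    """Set each sampled item's priority to |TD error| + eps (hypothetical helper)."""
    for idx, err in zip(indices, td_errors):
        priorities[idx] = abs(err) + eps
    return priorities


def sample_min_n_steps(storage, min_steps, rng):
    """Draw stored batches until they cover at least min_steps timesteps."""
    drawn, steps = [], 0
    while steps < min_steps:
        batch = rng.choice(storage)
        drawn.append(batch)
        steps += len(batch)
    return drawn


rng = random.Random(0)

# Each stored batch is a list of timesteps; batch lengths vary.
storage = [[0, 1], [2, 3, 4], [5]]
drawn = sample_min_n_steps(storage, min_steps=7, rng=rng)

# After a training step, only the items that were sampled (indices 0 and 2)
# receive new priorities; item 1 keeps its old value.
priorities = [1.0, 1.0, 1.0]
update_priorities(priorities, indices=[0, 2], td_errors=[-0.5, 2.0])
```

Sampling whole batches until a timestep threshold is reached means the result can contain more than `min_steps` timesteps, since the last batch drawn is not truncated.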