Replay Buffer API#

The following classes do not take into account the separation of experiences from different policies; multi-agent replay buffers are explained further below.

Replay Buffer Base Classes#

StorageUnit(value)

Specifies how batches are structured in a ReplayBuffer.

ReplayBuffer([capacity, storage_unit])

The lowest-level replay buffer interface used by RLlib.

PrioritizedReplayBuffer([capacity, ...])

This buffer implements Prioritized Experience Replay.

ReservoirReplayBuffer([capacity, storage_unit])

This buffer implements reservoir sampling.
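
A minimal construction sketch for these classes follows; the capacity values, the StorageUnit choices, and the alpha value passed to PrioritizedReplayBuffer are illustrative assumptions, not recommended settings.

```python
from ray.rllib.utils.replay_buffers import (
    PrioritizedReplayBuffer,
    ReplayBuffer,
    ReservoirReplayBuffer,
    StorageUnit,
)

# Plain FIFO buffer that stores individual timesteps (capacity is illustrative).
buffer = ReplayBuffer(capacity=10_000, storage_unit=StorageUnit.TIMESTEPS)

# The same class can store whole episodes instead of single timesteps.
episode_buffer = ReplayBuffer(capacity=1_000, storage_unit=StorageUnit.EPISODES)

# Prioritized variant; alpha scales how strongly priorities skew sampling
# (the value here is only an example).
prio_buffer = PrioritizedReplayBuffer(capacity=10_000, alpha=0.6)

# Reservoir sampling keeps a uniform random subset of everything ever added.
reservoir_buffer = ReservoirReplayBuffer(capacity=1_000)
```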

Public Methods#

sample([num_items])

Samples num_items items from this buffer.

add(batch, **kwargs)

Adds a batch of experiences or other data to this buffer.

get_state()

Returns all local state in a dict.

set_state(state)

Restores all local state to the provided state.
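
A minimal sketch of these four methods on a plain ReplayBuffer is shown below; the hand-built SampleBatch contents and the capacity are placeholder assumptions.

```python
import numpy as np

from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.utils.replay_buffers import ReplayBuffer, StorageUnit

buffer = ReplayBuffer(capacity=1_000, storage_unit=StorageUnit.TIMESTEPS)

# A tiny hand-built batch of two timesteps (all field values are placeholders).
batch = SampleBatch({
    SampleBatch.OBS: np.array([[0.0], [1.0]]),
    SampleBatch.ACTIONS: np.array([0, 1]),
    SampleBatch.REWARDS: np.array([0.0, 1.0]),
})

# add() stores the batch; with StorageUnit.TIMESTEPS it is split into
# single-timestep items.
buffer.add(batch)

# sample() draws the requested number of stored items (with replacement).
sampled = buffer.sample(num_items=2)

# get_state()/set_state() support checkpointing and restoring a buffer.
state = buffer.get_state()
restored = ReplayBuffer(capacity=1_000, storage_unit=StorageUnit.TIMESTEPS)
restored.set_state(state)
```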

Multi-Agent Buffers#

The following classes use the above “single-agent” buffers as underlying buffers to facilitate splitting up experiences between the different agents’ policies. In multi-agent RL, more than one agent exists in the environment, and not all of these agents necessarily use the same policy (M agents map onto N policies, where N <= M). This leads to the need for MultiAgentReplayBuffers that store the experiences of different policies separately.

MultiAgentReplayBuffer([capacity, ...])

A replay buffer shard for multiagent setups.

MultiAgentPrioritizedReplayBuffer([...])

A prioritized replay buffer shard for multiagent setups.
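
The sketch below shows how a MultiAgentReplayBuffer keeps per-policy experiences apart; the policy IDs ("policy_1", "policy_2") and the batch contents are placeholder assumptions.

```python
import numpy as np

from ray.rllib.policy.sample_batch import MultiAgentBatch, SampleBatch
from ray.rllib.utils.replay_buffers import MultiAgentReplayBuffer

buffer = MultiAgentReplayBuffer(capacity=1_000)

# One small batch per policy; "policy_1" and "policy_2" are placeholder IDs.
batch_p1 = SampleBatch({SampleBatch.OBS: np.array([[0.0]]),
                        SampleBatch.ACTIONS: np.array([0]),
                        SampleBatch.REWARDS: np.array([1.0])})
batch_p2 = SampleBatch({SampleBatch.OBS: np.array([[1.0]]),
                        SampleBatch.ACTIONS: np.array([1]),
                        SampleBatch.REWARDS: np.array([0.0])})

# A MultiAgentBatch groups per-policy SampleBatches; env_steps is the number
# of environment steps it covers.
ma_batch = MultiAgentBatch(
    {"policy_1": batch_p1, "policy_2": batch_p2}, env_steps=1
)

# The multi-agent buffer routes each policy's experiences into its own
# underlying single-agent buffer.
buffer.add(ma_batch)

# Sampling returns a MultiAgentBatch with one SampleBatch per policy.
replay = buffer.sample(num_items=1)
```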

Utility Methods#

update_priorities_in_replay_buffer(...)

Updates the priorities in a prioritized replay buffer, given training results.

sample_min_n_steps_from_buffer(...)

Samples a minimum of n timesteps from a given replay buffer.
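
A usage sketch for sample_min_n_steps_from_buffer follows; the keyword argument names reflect an assumed signature and may differ between RLlib versions, and the buffer contents are placeholders. update_priorities_in_replay_buffer is normally called inside a training step with the learner's TD-errors and is not shown here.

```python
import numpy as np

from ray.rllib.policy.sample_batch import SampleBatch
from ray.rllib.utils.replay_buffers import ReplayBuffer
from ray.rllib.utils.replay_buffers.utils import sample_min_n_steps_from_buffer

buffer = ReplayBuffer(capacity=1_000)
buffer.add(SampleBatch({
    SampleBatch.OBS: np.arange(8, dtype=np.float32).reshape(8, 1),
    SampleBatch.ACTIONS: np.zeros(8, dtype=np.int64),
    SampleBatch.REWARDS: np.ones(8, dtype=np.float32),
}))

# Samples repeatedly from the buffer and concatenates the results until the
# returned batch contains at least 32 timesteps (argument names are assumptions).
train_batch = sample_min_n_steps_from_buffer(
    replay_buffer=buffer,
    min_steps=32,
    count_by_agent_steps=False,
)
```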