ray.rllib.utils.replay_buffers.multi_agent_replay_buffer.MultiAgentReplayBuffer.__init__
- MultiAgentReplayBuffer.__init__(capacity: int = 10000, storage_unit: str = 'timesteps', num_shards: int = 1, replay_mode: str = 'independent', replay_sequence_override: bool = True, replay_sequence_length: int = 1, replay_burn_in: int = 0, replay_zero_init_states: bool = True, underlying_buffer_config: dict = None, **kwargs)
Initializes a MultiAgentReplayBuffer instance.
- Parameters:
capacity – The capacity of the buffer, measured in storage_unit.
storage_unit – Either ‘timesteps’, ‘sequences’ or ‘episodes’. Specifies how experiences are stored. If they are stored in episodes, replay_sequence_length is ignored.
num_shards – The number of buffer shards that exist in total (including this one).
replay_mode – One of “independent” or “lockstep”. Determines whether batches are sampled independently or to an equal amount.
replay_sequence_override – If True, ignore sequences found in incoming batches, slicing them into sequences as specified by replay_sequence_length and replay_burn_in. This only has an effect if storage_unit is ‘sequences’.
replay_sequence_length – The sequence length (T) of a single sample. If > 1, we will sample B x T from this buffer. This only has an effect if storage_unit is ‘timesteps’.
replay_burn_in – The number of timesteps each sequence overlaps with the previous one to generate a better internal state (=state after the burn-in), instead of starting each RNN rollout from 0.0. This only has an effect if storage_unit is ‘sequences’.
replay_zero_init_states – Whether the initial states in the buffer (if replay_sequence_length > 0) are always 0.0 or should be updated with the previous train_batch state outputs.
underlying_buffer_config – A config that contains all necessary constructor arguments and arguments for methods to call on the underlying buffers.
**kwargs – Forward compatibility kwargs.
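- Example:
The following is a minimal sketch of how the parameters above might be assembled and passed to the constructor. The helper names `build_buffer_kwargs` and `create_buffer` and the specific values are hypothetical and chosen for illustration only. The allowed string values for storage_unit and replay_mode are taken from the parameter descriptions above. The import is deferred into `create_buffer` so the validation part runs even where Ray RLlib is not installed.

```python
# Allowed values as listed in the parameter descriptions above.
ALLOWED_STORAGE_UNITS = {"timesteps", "sequences", "episodes"}
ALLOWED_REPLAY_MODES = {"independent", "lockstep"}


def build_buffer_kwargs(
    capacity=10000,
    storage_unit="timesteps",
    replay_mode="independent",
    replay_sequence_length=1,
    replay_burn_in=0,
):
    """Hypothetical helper: check string options before building kwargs."""
    if storage_unit not in ALLOWED_STORAGE_UNITS:
        raise ValueError(f"Unknown storage_unit: {storage_unit!r}")
    if replay_mode not in ALLOWED_REPLAY_MODES:
        raise ValueError(f"Unknown replay_mode: {replay_mode!r}")
    return {
        "capacity": capacity,
        "storage_unit": storage_unit,
        "replay_mode": replay_mode,
        "replay_sequence_length": replay_sequence_length,
        "replay_burn_in": replay_burn_in,
    }


def create_buffer(**kwargs):
    """Hypothetical helper: build the buffer; requires Ray RLlib."""
    # Imported lazily so the rest of this sketch works without Ray.
    from ray.rllib.utils.replay_buffers.multi_agent_replay_buffer import (
        MultiAgentReplayBuffer,
    )

    return MultiAgentReplayBuffer(**kwargs)


# Illustrative values: capacity is measured in storage_unit, so this
# requests room for 50000 sequences.
kwargs = build_buffer_kwargs(
    capacity=50000,
    storage_unit="sequences",
    replay_sequence_length=20,
    replay_burn_in=5,
)
```

With Ray RLlib installed, `buffer = create_buffer(**kwargs)` would then construct the buffer. Any argument left out of `kwargs` falls back to the default shown in the signature.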