ray.rllib.utils.replay_buffers.prioritized_replay_buffer.PrioritizedReplayBuffer.sample

PrioritizedReplayBuffer.sample(num_items: int, beta: float, **kwargs) → SampleBatch | MultiAgentBatch | Dict[str, Any] | None

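Samples num_items items from this buffer and returns them together with their priority (importance-sampling) weights.

Items are drawn with replacement, so the same item may appear more than once in the result.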

Examples for storage of SampleBatches:
  • If storage unit ‘timesteps’ has been chosen and batches of size 5 have been added, each timestep is stored as its own item, so sample(5) will yield a concatenated batch of 5 timesteps.

  • If storage unit ‘sequences’ has been chosen and sequences of different lengths have been added, sample(5) will yield a concatenated batch with a number of timesteps equal to the sum of timesteps in the 5 sampled sequences.

  • If storage unit ‘episodes’ has been chosen and episodes of different lengths have been added, sample(5) will yield a concatenated batch with a number of timesteps equal to the sum of timesteps in the 5 sampled episodes.
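The relationship between num_items and the size of the returned batch can be sketched in plain Python. This is an illustration only, not RLlib code: the helper sampled_timesteps and the item lengths are hypothetical, and items are drawn uniformly here for simplicity, whereas the real buffer draws them in proportion to their priorities.

```python
import random


def sampled_timesteps(item_lengths, num_items, rng):
    """Draw num_items item indices with replacement and count their timesteps.

    item_lengths[i] is the number of timesteps stored in item i.
    Hypothetical helper; uniform sampling stands in for prioritized sampling.
    """
    idxes = [rng.randrange(len(item_lengths)) for _ in range(num_items)]
    return idxes, sum(item_lengths[i] for i in idxes)


rng = random.Random(0)

# Storage unit 'timesteps': every stored item holds exactly one timestep.
timestep_items = [1] * 10
_, n_timesteps = sampled_timesteps(timestep_items, 5, rng)

# Storage unit 'sequences' or 'episodes': items have different lengths, so the
# batch size depends on which items happen to be drawn.
sequence_items = [3, 7, 2, 5]
seq_idxes, n_seq_timesteps = sampled_timesteps(sequence_items, 5, rng)
```

In the first case the result always has as many timesteps as items requested; in the second it varies from call to call with the lengths of the sampled items.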

Parameters:
  • num_items – Number of items to sample from this buffer.

  • beta – To what degree to use importance weights (0 - no corrections, 1 - full correction).

  • **kwargs – Forward compatibility kwargs.
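The effect of beta can be illustrated with the usual proportional prioritized-replay weighting, w_i = (N * P(i)) ** (-beta), normalized by the largest possible weight. The sketch below is a standalone approximation under that assumption, not the library's implementation; importance_weights and the priority values are hypothetical, and the priorities are assumed to already include any alpha exponent.

```python
def importance_weights(priorities, idxes, beta):
    """Normalized importance-sampling weights for the sampled indices.

    priorities: one positive priority per stored item (hypothetical values).
    idxes: indices of the items that were sampled.
    beta: 0.0 means no correction, 1.0 means full correction.
    """
    total = sum(priorities)
    n = len(priorities)
    # The least likely item receives the largest weight; dividing by it keeps
    # every returned weight in the range (0, 1].
    p_min = min(priorities) / total
    max_weight = (p_min * n) ** (-beta)
    weights = []
    for i in idxes:
        p_sample = priorities[i] / total
        weights.append((p_sample * n) ** (-beta) / max_weight)
    return weights


priorities = [1.0, 2.0, 4.0, 1.0]
no_correction = importance_weights(priorities, [0, 1, 2], beta=0.0)
full_correction = importance_weights(priorities, [0, 1, 2], beta=1.0)
```

With beta=0.0 every weight is 1.0, so the sampling bias is left uncorrected. With beta=1.0 items with higher priority, which are sampled more often, receive proportionally smaller weights.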

Returns:

Concatenated SampleBatch of the sampled items, including a “weights” field holding the importance-sampling (IS) weight of each sampled transition and a “batch_indexes” field holding the original buffer index of each sampled experience.
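As a rough picture of how these two fields line up with the concatenated data, the sketch below assumes that each sampled item contributes one entry per timestep it contains, so both fields have the same length as the returned batch. That layout is an assumption made for illustration, and expand_per_timestep is a hypothetical helper rather than part of RLlib.

```python
def expand_per_timestep(idxes, item_lengths, item_weights):
    """Repeat each sampled item's weight and buffer index once per timestep.

    idxes: buffer indices of the sampled items, in sampling order.
    item_lengths: number of timesteps stored in each buffer item.
    item_weights: one importance-sampling weight per sampled item.
    """
    weights, batch_indexes = [], []
    for idx, weight in zip(idxes, item_weights):
        n = item_lengths[idx]
        weights.extend([weight] * n)
        batch_indexes.extend([idx] * n)
    return {"weights": weights, "batch_indexes": batch_indexes}


# Three stored items with 2, 3 and 1 timesteps; item 1 is sampled twice.
fields = expand_per_timestep(
    idxes=[1, 2, 1],
    item_lengths=[2, 3, 1],
    item_weights=[0.5, 1.0, 0.5],
)
```

The “batch_indexes” values are what a caller would typically pass back to the buffer when updating priorities for the sampled items.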

DeveloperAPI: This API may change across minor Ray releases.