ray.rllib.policy.sample_batch.SampleBatch.right_zero_pad

SampleBatch.right_zero_pad(max_seq_len: int, exclude_states: bool = True)

Right-zero-pads this SampleBatch in place, appending zeros to the end of each sequence.

Sets self.zero_padded to True and self.max_seq_len to the given max_seq_len value.

Parameters:
  • max_seq_len – The maximum (total) length to zero-pad each sequence to.

  • exclude_states – If False, also right-zero-pad all state_in_x data. If True, leave state_in_x keys as-is.

Returns:

This same SampleBatch, now right-zero-padded.

Raises:

ValueError – If self[SampleBatch.SEQ_LENS] is None (not defined).
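
The padding behavior described above can be sketched in plain Python. This is an illustration only, not RLlib's implementation: the function name right_zero_pad_sketch is hypothetical, it works on an ordinary dict of lists rather than a SampleBatch, it returns a new dict instead of modifying the input in place, and it only models the default exclude_states=True case.

```python
from typing import Dict


def right_zero_pad_sketch(batch: Dict[str, list], max_seq_len: int) -> Dict[str, list]:
    """Hypothetical stand-in for the default (exclude_states=True) behavior."""
    seq_lens = batch.get("seq_lens")
    if seq_lens is None:
        # Mirrors the documented ValueError when no sequence lengths exist.
        raise ValueError("batch has no 'seq_lens'; cannot zero-pad")

    padded = {}
    for key, values in batch.items():
        # seq_lens and state_in_x columns are copied through unchanged.
        if key == "seq_lens" or key.startswith("state_in_"):
            padded[key] = list(values)
            continue

        out = []
        start = 0
        for n in seq_lens:
            # Copy the n real timesteps of this sequence, then fill the
            # remaining (max_seq_len - n) slots with zeros.
            out.extend(list(values[start:start + n]) + [0] * (max_seq_len - n))
            start += n
        padded[key] = out
    return padded


print(right_zero_pad_sketch({"a": [1, 2, 3], "seq_lens": [1, 2]}, max_seq_len=4))
# {'a': [1, 0, 0, 0, 2, 3, 0, 0], 'seq_lens': [1, 2]}
```

Each sequence occupies exactly max_seq_len slots after padding, so the padded column has len(seq_lens) * max_seq_len entries.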

Example:

from ray.rllib.policy.sample_batch import SampleBatch
batch = SampleBatch(
    {"a": [1, 2, 3], "seq_lens": [1, 2]})
print(batch.right_zero_pad(max_seq_len=4))

batch = SampleBatch({"a": [1, 2, 3],
                     "state_in_0": [1.0, 3.0],
                     "seq_lens": [1, 2]})
print(batch.right_zero_pad(max_seq_len=5))

Output:

{"a": [1, 0, 0, 0, 2, 3, 0, 0], "seq_lens": [1, 2]}
{"a": [1, 0, 0, 0, 0, 2, 3, 0, 0, 0],
 "state_in_0": [1.0, 3.0],  # <- all state-ins remain as-is
 "seq_lens": [1, 2]}