ray.rllib.policy.sample_batch.SampleBatch
class ray.rllib.policy.sample_batch.SampleBatch(*args, **kwargs)
Bases: dict
Wrapper around a dictionary with string keys and array-like values.
For example, {"obs": [1, 2, 3], "reward": [0, -1, 1]} is a batch of three samples, each with an "obs" and a "reward" attribute.
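The column-per-key layout described above can be sketched with a plain dict. This is only an illustration of the data layout, not RLlib code: it does not import or construct a real SampleBatch, and the variable names `num_samples` and `rows` are chosen here for the example.

```python
# Plain-dict sketch of the columnar layout; no RLlib import is needed here.
batch = {"obs": [1, 2, 3], "reward": [0, -1, 1]}

# Every column holds one value per sample, so all columns share one length.
num_samples = len(batch["obs"])
assert all(len(col) == num_samples for col in batch.values())

# Row view: one dict per sample, pairing each key with that sample's value.
rows = [dict(zip(batch.keys(), values)) for values in zip(*batch.values())]

for row in rows:
    print(row)
```

Note that `len()` on a plain dict counts keys (two here), whereas the method summary for this class states that `len(self)` is the number of steps in the batch.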
Methods
Constructs a sample batch (same params as dict constructor).
Returns the same as len(self) (number of steps in this batch).
Returns the respective MultiAgentBatch
Returns a list of the batch-data in the specified columns.
Compresses the data buffers (by column) in place.
Concatenates other to this one and returns a new SampleBatch.
Creates a deep or shallow copy of this SampleBatch and returns it.
Decompresses data buffers (per column if not compressed) in place.
Returns the same as len(self) (number of steps in this batch).
Create a new dictionary with keys from iterable and values set to value.
Returns one column (by key) from the data or a default value.
Creates a single-timestep SampleBatch at the given index from self.
Returns True if this SampleBatch only contains one trajectory.
Returns True if self is either terminated or truncated at idx -1.
If the key is not found, return the default if given; otherwise, raise a KeyError.
Remove and return a (key, value) pair as a 2-tuple.
Right (adding zeros at end) zero-pads this SampleBatch in-place.
Returns an iterator over data rows, i.e. dicts with column values.
Sets a function to be called on every getitem.
Sets the is_training flag for this SampleBatch.
Insert key with a value of default if key is not in the dictionary.
Shuffles the rows of this batch in-place.
Returns sum over number of bytes of all data buffers.
Returns a slice of the row data of this batch (w/o copying).
Splits by eps_id column and returns list of new batches.
Returns SampleBatches, each one representing a k-slice of this one.
TODO: transfer batch to given device as framework tensor.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].
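The dict-style behaviours summarized above (updating from E and F, and falling back to a default or raising KeyError on a missing key) are standard Python dict semantics. The sketch below shows them on a plain dict used as a stand-in. It assumes nothing about extra bookkeeping that SampleBatch may layer on top, and the column name "t" is hypothetical.

```python
# Plain dict used as a stand-in; D, E, and F mirror the names in the text.
D = {"obs": [1, 2, 3]}

# Case 1: E has a .keys() method, so each key is copied: D[k] = E[k].
E_mapping = {"reward": [0, -1, 1]}
D.update(E_mapping)

# Case 2: E lacks .keys(), so it is read as (key, value) pairs.
E_pairs = [("eps_id", [0, 0, 1])]
D.update(E_pairs)

# In either case, keyword arguments F are applied last and win on conflicts.
# "t" is a hypothetical column name used only for this illustration.
D.update({"t": [0, 1, 0]}, t=[0, 1, 2])

# Missing-key handling: return the default if given, else raise KeyError.
missing = D.pop("no_such_key", None)
try:
    D.pop("no_such_key")
    raised = False
except KeyError:
    raised = True

print(list(D), missing, raised)
```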
Attributes