- Policy.learn_on_loaded_batch(offset: int = 0, buffer_index: int = 0)
Runs a single step of SGD on data that has already been loaded into a buffer.
Runs an SGD step over a slice of the pre-loaded batch, offset by the
offset argument (useful for performing n minibatch SGD updates repeatedly on the same pre-loaded data).
Updates the model weights based on the averaged per-device gradients.
offset – Offset into the preloaded data. Used for pre-loading a train-batch once to a device, then iterating over (subsampling through) this batch n times doing minibatch SGD.
buffer_index – The index of the buffer (a MultiGPUTowerStack) to take the already pre-loaded data from. The number of buffers on each device depends on the value of the
Returns the outputs of extra_ops evaluated over the batch.
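
The load-once, step-many-times pattern described above can be sketched with a toy stand-in. This is only an illustration under stated assumptions: ToyPolicy is a hypothetical class, not the RLlib Policy, and its minibatch_size attribute, scalar weight, loss, and returned dict are invented for the example. Only the offset and buffer_index arguments mirror the signature documented here.

```python
from typing import Dict, List


class ToyPolicy:
    """Hypothetical stand-in for illustration; NOT the RLlib Policy class."""

    def __init__(self, minibatch_size: int, lr: float = 0.1) -> None:
        self.minibatch_size = minibatch_size  # assumed attribute
        self.lr = lr
        self.weight = 0.0  # a single scalar "model" parameter
        self._buffers: Dict[int, List[float]] = {}

    def load_batch_into_buffer(self, batch: List[float], buffer_index: int = 0) -> int:
        # Pre-load the whole train batch once; return the number of samples.
        self._buffers[buffer_index] = list(batch)
        return len(batch)

    def learn_on_loaded_batch(self, offset: int = 0, buffer_index: int = 0) -> dict:
        # Take one minibatch-sized slice of the pre-loaded data, starting at `offset`.
        data = self._buffers[buffer_index]
        minibatch = data[offset:offset + self.minibatch_size]
        # Toy loss mean((w - x)^2), whose gradient is 2 * mean(w - x).
        grad = 2.0 * sum(self.weight - x for x in minibatch) / len(minibatch)
        self.weight -= self.lr * grad
        return {"offset": offset, "num_samples": len(minibatch)}


def run_minibatch_sgd(policy: ToyPolicy, batch: List[float], num_sgd_iter: int) -> List[dict]:
    # Load once, then sweep over the same buffer num_sgd_iter times.
    num_loaded = policy.load_batch_into_buffer(batch, buffer_index=0)
    results = []
    for _ in range(num_sgd_iter):
        for offset in range(0, num_loaded, policy.minibatch_size):
            results.append(policy.learn_on_loaded_batch(offset=offset, buffer_index=0))
    return results


if __name__ == "__main__":
    toy = ToyPolicy(minibatch_size=2)
    steps = run_minibatch_sgd(toy, [1.0, 1.0, 1.0, 1.0], num_sgd_iter=3)
    print(len(steps), toy.weight)
```

In this sketch the caller advances offset in steps of the minibatch size, so each call touches a different slice while the data itself is transferred only once.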