ray.rllib.evaluation.rollout_worker.RolloutWorker.sample_and_learn

RolloutWorker.sample_and_learn(expected_batch_size: int, num_sgd_iter: int, sgd_minibatch_size: str, standardize_fields: List[str]) → Tuple[dict, int]

Sample a batch and learn on it.

This is typically used in combination with distributed allreduce.

Parameters:
  • expected_batch_size – Expected number of samples to learn on.

  • num_sgd_iter – Number of SGD iterations.

  • sgd_minibatch_size – SGD minibatch size.

  • standardize_fields – List of sample fields to normalize.

Returns:

A tuple consisting of a dictionary of extra metadata returned from the policies’ learn_on_batch() and the number of samples learned on.
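Example:

The following is a minimal, standard-library-only sketch of the flow the parameters above describe: sample a batch, normalize the requested fields, run several SGD passes over minibatches, and return the metadata together with the sample count. It is not RLlib's implementation. The names sample_and_learn_sketch, standardize, fake_sample, and fake_learn are hypothetical, and the batch-size check, the per-pass shuffling, the 1e-4 floor on the standard deviation, and returning only the last minibatch's metadata are all assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a sample batch: field name -> list of values.
Batch = Dict[str, List[float]]


def standardize(values: List[float]) -> List[float]:
    """Shift values to zero mean and scale them to unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Assumption: floor the divisor so constant fields do not divide by zero.
    return [(v - mean) / max(std, 1e-4) for v in values]


def sample_and_learn_sketch(
    sample_fn: Callable[[], Batch],
    learn_fn: Callable[[Batch], dict],
    expected_batch_size: int,
    num_sgd_iter: int,
    sgd_minibatch_size: int,
    standardize_fields: List[str],
    seed: int = 0,
) -> Tuple[dict, int]:
    """Illustrative only: mirrors the documented parameters, not RLlib code."""
    batch = dict(sample_fn())  # shallow copy so the caller's dict is untouched
    count = len(next(iter(batch.values())))
    # Assumption: a size mismatch is treated as an error.
    if count != expected_batch_size:
        raise ValueError(f"expected {expected_batch_size} samples, got {count}")

    # Normalize each requested field over the whole batch.
    for field in standardize_fields:
        batch[field] = standardize(batch[field])

    rng = random.Random(seed)
    info: dict = {}
    for _ in range(num_sgd_iter):
        # Assumption: reshuffle the sample order on every SGD pass.
        order = list(range(count))
        rng.shuffle(order)
        for start in range(0, count, sgd_minibatch_size):
            idx = order[start:start + sgd_minibatch_size]
            minibatch = {k: [v[i] for i in idx] for k, v in batch.items()}
            # Keep the metadata from the most recent update.
            info = learn_fn(minibatch)
    return info, count


# Hypothetical sampler and learner used to exercise the sketch.
calls: List[int] = []


def fake_sample() -> Batch:
    return {
        "obs": [float(i) for i in range(8)],
        "advantages": [float(i) for i in range(8)],
    }


def fake_learn(minibatch: Batch) -> dict:
    calls.append(len(minibatch["obs"]))
    return {"num_updates": len(calls)}


info, count = sample_and_learn_sketch(
    fake_sample,
    fake_learn,
    expected_batch_size=8,
    num_sgd_iter=3,
    sgd_minibatch_size=4,
    standardize_fields=["advantages"],
)
print(info, count)
```

With 8 samples, 3 SGD iterations, and a minibatch size of 4, the hypothetical learner is called 6 times (2 minibatches per pass), and the returned count equals the number of sampled rows, matching the (metadata, samples learned on) shape described under Returns.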