ray.rllib.algorithms.algorithm.Algorithm.train_buffered

Algorithm.train_buffered(buffer_time_s: float, max_buffer_length: int = 1000)

Runs multiple iterations of training.

Calls train() internally, collecting and combining the results of multiple iterations. This function runs self.train() repeatedly until one of the following conditions is met: 1) the maximum buffer length is reached, 2) the maximum buffer time has elapsed, or 3) a checkpoint was created. The call always blocks until at least one result has been received, even if the maximum time has already elapsed.

Parameters:
  • buffer_time_s – Maximum time to buffer, in seconds. The first result received after this amount of time has passed causes the whole buffer to be returned.

  • max_buffer_length – Maximum number of results to buffer.
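
The stopping conditions described above can be illustrated with a small standalone sketch. This is not RLlib's implementation: `train_buffered_sketch`, its `train_fn` and `clock` arguments, and the `"should_checkpoint"` result key are hypothetical names chosen for illustration, and returning the buffer as a plain list is an assumption.

```python
import time
from typing import Callable, Dict, List


def train_buffered_sketch(
    train_fn: Callable[[], Dict],
    buffer_time_s: float,
    max_buffer_length: int = 1000,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict]:
    """Call train_fn repeatedly and buffer its results (illustrative only)."""
    results: List[Dict] = []
    deadline = clock() + buffer_time_s

    while True:
        # Always run at least one iteration, even if buffer_time_s is 0.
        result = train_fn()
        results.append(result)

        # Condition 3: a checkpoint was created (hypothetical result key).
        if result.get("should_checkpoint", False):
            break
        # Condition 1: the maximum buffer length is reached.
        if len(results) >= max_buffer_length:
            break
        # Condition 2: the maximum buffer time has elapsed.
        if clock() >= deadline:
            break

    return results


if __name__ == "__main__":
    iteration = {"n": 0}

    def fake_train() -> Dict:
        # Stand-in for Algorithm.train(); returns a minimal result dict.
        iteration["n"] += 1
        return {"training_iteration": iteration["n"]}

    buffered = train_buffered_sketch(fake_train, buffer_time_s=60.0, max_buffer_length=3)
    print(len(buffered))
```

Because the time check happens only after a result has been appended, the sketch blocks for at least one full iteration regardless of `buffer_time_s`, which mirrors the behavior described for the real method.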