Algorithm.evaluate(duration_fn: Optional[Callable[[int], int]] = None) → dict

Evaluates the current policy under the evaluation_config settings.

Note that this default implementation does not do anything beyond merging evaluation_config with the normal trainer config.


duration_fn – An optional callable that takes the number of episodes already run as its only argument and returns the number of episodes left to run. It is used to determine whether evaluation should continue.
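As a minimal sketch of a callable matching the duration_fn contract described above, the example below counts down from a fixed episode budget. The names TARGET_EPISODES, remaining_episodes, and simulate_evaluation are hypothetical. The loop is a toy stand-in written for illustration, not RLlib's actual evaluation logic.

```python
from typing import Callable

# Hypothetical evaluation budget chosen for this example; not an RLlib default.
TARGET_EPISODES = 10


def remaining_episodes(num_episodes_done: int) -> int:
    """Return how many evaluation episodes are still left to run."""
    return max(0, TARGET_EPISODES - num_episodes_done)


def simulate_evaluation(duration_fn: Callable[[int], int], batch_size: int = 4) -> int:
    """Toy stand-in for an evaluation loop (an assumption, not RLlib's code).

    Repeatedly asks duration_fn how many episodes remain and stops once it
    reports that none are left. Returns the total number of episodes "run".
    """
    done = 0
    while True:
        left = duration_fn(done)
        if left <= 0:
            break
        # Pretend to run up to batch_size episodes in this round.
        done += min(batch_size, left)
    return done


total = simulate_evaluation(remaining_episodes)
```

Assuming an already built Algorithm instance named algo, the same callable could presumably be passed as algo.evaluate(duration_fn=remaining_episodes).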