ray.rllib.utils.exploration.curiosity.Curiosity.postprocess_trajectory

Curiosity.postprocess_trajectory(policy, sample_batch, tf_sess=None)

Calculates the phi values (the feature-space encodings of obs, obs’, and the predicted obs’) and the intrinsic reward ri.

Also calculates the forward and inverse losses and updates the curiosity module on the provided batch using the module's own optimizer.
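The quantities named above can be illustrated with a minimal NumPy sketch. It is not RLlib's implementation: the function name `intrinsic_reward_and_losses`, its arguments, and the default values of `eta` and `beta` are hypothetical, and the formulas assume the usual intrinsic curiosity module formulation (forward error as half the squared L2 distance in feature space, inverse loss as cross-entropy over discrete actions).

```python
import numpy as np


def intrinsic_reward_and_losses(next_phi, predicted_next_phi,
                                action_logits, actions,
                                eta=1.0, beta=0.2):
    """Hypothetical helper, not part of RLlib.

    next_phi:           [B, D] encoding of the next observation (obs').
    predicted_next_phi: [B, D] forward-model prediction of that encoding.
    action_logits:      [B, A] inverse-model logits over discrete actions.
    actions:            [B] integer actions that were actually taken.
    eta, beta:          assumed reward scale and forward-loss weight.
    """
    # Forward error per transition: 0.5 * ||predicted phi' - phi'||^2.
    forward_err = 0.5 * np.sum((predicted_next_phi - next_phi) ** 2, axis=-1)
    forward_loss = forward_err.mean()

    # Intrinsic reward: the scaled forward prediction error.
    intrinsic_rewards = eta * forward_err

    # Inverse loss: cross-entropy of the taken actions under the
    # inverse-model logits (numerically stable log-softmax).
    shifted = action_logits - action_logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    inverse_loss = -log_probs[np.arange(len(actions)), actions].mean()

    # Weighted sum that a curiosity-module optimizer would minimize.
    total_loss = (1.0 - beta) * inverse_loss + beta * forward_loss
    return intrinsic_rewards, forward_loss, inverse_loss, total_loss
```

In this sketch, a transition whose next-state encoding is predicted perfectly receives zero intrinsic reward, so the bonus concentrates on transitions the forward model cannot yet predict.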