ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2.postprocess_trajectory

EagerTFPolicyV2.postprocess_trajectory(sample_batch: SampleBatch, other_agent_batches: SampleBatch | None = None, episode=None)

Postprocess a trajectory given as a SampleBatch.

Parameters:
  • sample_batch – Batch of experiences for the policy, containing at most one episode trajectory.

  • other_agent_batches – In a multi-agent env, a mapping from agent IDs to (policy, agent_batch) tuples holding the policy and experiences of each of the other agents.

  • episode – An optional multi-agent episode object to provide access to all of the internal episode state, which may be useful for model-based or multi-agent algorithms.

Returns:

The postprocessed sample batch.
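
Example:

The following is a minimal sketch of the kind of work an override of this method might do, not RLlib's own implementation. To stay runnable without Ray or TensorFlow installed, it uses a plain dict as a stand-in for SampleBatch and a free function in place of the method. The "returns" column, the discounted_returns helper, and the gamma value are assumptions made for illustration only.

```python
def discounted_returns(rewards, gamma):
    """Compute the discounted return-to-go for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each step can reuse the next step's return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def postprocess_trajectory(sample_batch, other_agent_batches=None, episode=None):
    """Hypothetical stand-in for an override of the method.

    `sample_batch` is a plain dict here (an assumption; the real method
    receives a SampleBatch). The extra arguments are accepted but unused.
    """
    # Add a new, hypothetical "returns" column derived from the rewards.
    sample_batch["returns"] = discounted_returns(sample_batch["rewards"], gamma=0.5)
    return sample_batch


# One short episode with a reward of 1.0 at every step.
batch = {"rewards": [1.0, 1.0, 1.0]}
out = postprocess_trajectory(batch)
```

Because sample_batch holds at most one episode trajectory, the backward pass never has to reset the running return at an episode boundary.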