ray.rllib.callbacks.callbacks.RLlibCallback.on_postprocess_trajectory

RLlibCallback.on_postprocess_trajectory(*, worker: EnvRunner, episode, agent_id: Any, policy_id: str, policies: Dict[str, Policy], postprocessed_batch: SampleBatch, original_batches: Dict[Any, Tuple[Policy, SampleBatch]], **kwargs) → None

Called immediately after a policy’s postprocess_fn has run.

You can use this callback to do additional postprocessing for a policy, including looking at the trajectory data of other agents in multi-agent settings.

Parameters:
  • worker – Reference to the current rollout worker.

  • episode – Episode object.

  • agent_id – Id of the current agent.

  • policy_id – Id of the current policy for the agent.

  • policies – Dict mapping policy IDs to policy objects. In single-agent mode, the dict contains only one entry, “default_policy”.

  • postprocessed_batch – The postprocessed sample batch for this agent. You can mutate this object to apply your own trajectory postprocessing.

  • original_batches – Dict mapping agent IDs to their unpostprocessed trajectory data. You should not mutate this object.

  • kwargs – Forward compatibility placeholder.
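
As a rough illustration of the parameters above, the following sketch overrides the callback to subtract the mean reward of all other agents from the current agent's rewards. The class name TeamBaselineCallback and the shaping rule are hypothetical. The sketch assumes that each batch exposes a "rewards" column and that the batch is the last element of each original_batches tuple. The fallback to object exists only so that the snippet runs without Ray installed.

```python
try:
    from ray.rllib.callbacks.callbacks import RLlibCallback
except ImportError:
    # Assumption: Ray is not installed, so use a stand-in base class
    # to keep this sketch importable and runnable on its own.
    RLlibCallback = object


class TeamBaselineCallback(RLlibCallback):
    """Hypothetical example: subtract the other agents' mean reward."""

    def on_postprocess_trajectory(
        self,
        *,
        worker,
        episode,
        agent_id,
        policy_id,
        policies,
        postprocessed_batch,
        original_batches,
        **kwargs,
    ):
        # Read (never mutate) the unpostprocessed data of the other agents.
        other_rewards = []
        for other_id, entry in original_batches.items():
            if other_id == agent_id:
                continue
            batch = entry[-1]  # assumed: the batch is the tuple's last item
            other_rewards.extend(float(r) for r in batch["rewards"])

        if not other_rewards:
            return  # single-agent case: nothing to compare against

        baseline = sum(other_rewards) / len(other_rewards)

        # Mutating postprocessed_batch in place is explicitly allowed.
        rewards = postprocessed_batch["rewards"]
        for i in range(len(rewards)):
            rewards[i] = float(rewards[i]) - baseline
```

The method body only indexes and assigns by position, so it should behave the same on plain dicts of lists (as in a quick local check) and on batches whose columns are arrays. How the callback class is registered with an algorithm config depends on the RLlib version and is not shown here.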