ray.rllib.callbacks.callbacks.RLlibCallback.on_postprocess_trajectory

RLlibCallback.on_postprocess_trajectory(*, worker: EnvRunner, episode, agent_id: Any, policy_id: str, policies: Dict[str, Policy], postprocessed_batch: SampleBatch, original_batches: Dict[Any, Tuple[Policy, SampleBatch]], **kwargs) → None

Called immediately after a policy’s postprocess_fn has run.

You can use this callback to do additional postprocessing for a policy, including looking at the trajectory data of other agents in multi-agent settings.

Parameters:

worker – Reference to the current rollout worker.
episode – Episode object.
agent_id – Id of the current agent.
policy_id – Id of the current policy for the agent.
policies – Dict mapping policy IDs to policy objects. In single-agent mode, this dict contains only one entry, “default_policy”.
postprocessed_batch – The postprocessed sample batch for this agent. You can mutate this object to apply your own trajectory postprocessing.
original_batches – Dict mapping agent IDs to their unpostprocessed trajectory data. You should not mutate this object.
kwargs – Forward compatibility placeholder.
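
The parameters above can be put together in a small override. The following is a minimal sketch, not an official RLlib recipe: the class name TeamBonusCallback, its BONUS_WEIGHT attribute, and the reward-sharing rule are hypothetical and exist only for illustration. It assumes each batch exposes a "rewards" column and that the last element of every original_batches tuple is that agent's sample batch. The try/except fallback is there only so the sketch can be run without Ray installed.

```python
try:
    from ray.rllib.callbacks.callbacks import RLlibCallback
except ImportError:
    # Stand-in base class so this sketch runs without Ray installed.
    class RLlibCallback:
        pass


class TeamBonusCallback(RLlibCallback):
    """Hypothetical example: add a share of the other agents' mean reward."""

    # Illustrative weight, not an RLlib setting.
    BONUS_WEIGHT = 0.1

    def on_postprocess_trajectory(
        self,
        *,
        worker,
        episode,
        agent_id,
        policy_id,
        policies,
        postprocessed_batch,
        original_batches,
        **kwargs,
    ):
        # Read (never mutate) the unpostprocessed data of the other agents.
        other_rewards = []
        for other_id, entry in original_batches.items():
            if other_id == agent_id:
                continue
            other_batch = entry[-1]  # assumed: last tuple element is the batch
            other_rewards.extend(float(r) for r in other_batch["rewards"])

        # Single-agent case: nothing to share, leave the batch untouched.
        if not other_rewards:
            return

        bonus = self.BONUS_WEIGHT * sum(other_rewards) / len(other_rewards)

        # Mutate this agent's postprocessed batch in place. Element-wise
        # assignment works for both NumPy arrays and plain lists.
        rewards = postprocessed_batch["rewards"]
        for t in range(len(rewards)):
            rewards[t] = rewards[t] + bonus
```

Because the override only reads original_batches and writes to postprocessed_batch, it follows the mutation rules stated in the parameter descriptions. Plain dicts can stand in for sample batches when trying the logic outside of a training run.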