ray.rllib.core.learner.learner.Learner.compute_loss_for_module#

abstract Learner.compute_loss_for_module(*, module_id: str, config: AlgorithmConfig | None = None, batch: NestedDict, fwd_out: Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]) → numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor [source]#

Computes the loss for a single module.

Think of this as computing the loss for a single agent. For multi-agent use cases that require a more complex loss computation, consider overriding the compute_loss method instead.

Parameters:
  • module_id – The id of the module.

  • config – The AlgorithmConfig specific to the given module_id.

  • batch – The sample batch for this particular module.

  • fwd_out – The output of the forward pass for this particular module.

Returns:

A single total loss tensor. If you have more than one optimizer for the given module_id and want to compute gradients separately with each of them, add up the individual loss terms for each optimizer and return the sum. To record or log any individual loss terms, use the Learner.metrics.log_value(key=..., value=...) or Learner.metrics.log_dict() APIs. See MetricsLogger for more information.
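
For illustration, below is a minimal sketch of overriding this method in a custom TorchLearner subclass. The behavior-cloning-style negative log-likelihood loss, the class name, and the particular batch/forward-output columns used are assumptions made for this example, not requirements of the API; the key points are returning a single total loss tensor and logging individual terms through Learner.metrics.

import torch

from ray.rllib.core.columns import Columns
from ray.rllib.core.learner.torch.torch_learner import TorchLearner


class BCTorchLearner(TorchLearner):  # hypothetical example Learner
    def compute_loss_for_module(self, *, module_id, config=None, batch, fwd_out):
        # Build the action distribution from this module's forward-pass output.
        action_dist_cls = self.module[module_id].get_train_action_dist_cls()
        action_dist = action_dist_cls.from_logits(fwd_out[Columns.ACTION_DIST_INPUTS])

        # Example loss: negative log-likelihood of the actions taken in the batch.
        policy_loss = -torch.mean(action_dist.logp(batch[Columns.ACTIONS]))

        # Log the individual loss term for monitoring; only the total is returned.
        self.metrics.log_value(
            key=(module_id, "policy_loss"), value=policy_loss.item(), window=1
        )

        # If this module_id had several optimizers, you would sum the per-optimizer
        # loss terms here and return that single total tensor instead.
        return policy_loss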