ray.rllib.core.learner.learner.Learner.compute_loss_for_module
- abstract Learner.compute_loss_for_module(*, module_id: str, config: AlgorithmConfig, batch: Dict[str, Any], fwd_out: Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]) → numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor
Computes the loss for a single module.
Think of this as computing loss for a single agent. For multi-agent use cases that require more complicated loss computation, consider overriding the compute_losses method instead.
- Parameters:
module_id – The id of the module.
config – The AlgorithmConfig specific to the given module_id.
batch – The train batch for this particular module.
fwd_out – The output of the forward pass for this particular module.
- Returns:
A single total loss tensor. If you have more than one optimizer on the provided module_id and would like to compute gradients separately using these different optimizers, simply add up the individual loss terms for each optimizer and return the sum. Also, for recording/logging any individual loss terms, you can use the Learner.metrics.log_value(key=..., value=...) or Learner.metrics.log_dict() APIs. See MetricsLogger for more information.
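For illustration, a minimal override sketch on top of TorchLearner might look like the following. The loss itself (a cross-entropy term plus an L2 penalty), the batch and fwd_out keys ("actions", "action_dist_inputs"), and the metric names are assumptions of this sketch, not prescribed by the API:

```python
from typing import Any, Dict

import torch

from ray.rllib.algorithms.algorithm_config import AlgorithmConfig
from ray.rllib.core.learner.torch.torch_learner import TorchLearner
from ray.rllib.utils.typing import ModuleID, TensorType


class MyCustomLearner(TorchLearner):
    """Sketch of a Learner with a hypothetical per-module loss."""

    def compute_loss_for_module(
        self,
        *,
        module_id: ModuleID,
        config: AlgorithmConfig,
        batch: Dict[str, Any],
        fwd_out: Dict[str, TensorType],
    ) -> TensorType:
        # Hypothetical main loss term: cross-entropy between the module's
        # action logits and the actions in the train batch. The dict keys
        # ("action_dist_inputs", "actions") are assumptions for this sketch.
        main_loss = torch.nn.functional.cross_entropy(
            fwd_out["action_dist_inputs"], batch["actions"]
        )

        # Hypothetical auxiliary term (an L2 penalty). Per the Returns note
        # above, even with multiple optimizers on this module you add up the
        # individual per-optimizer loss terms and return only the sum.
        l2_loss = 1e-4 * sum(
            torch.sum(p ** 2) for p in self.module[module_id].parameters()
        )

        # Record individual loss terms via the MetricsLogger APIs mentioned
        # above; only the summed scalar is returned.
        self.metrics.log_value(key=(module_id, "main_loss"), value=main_loss.item())
        self.metrics.log_value(key=(module_id, "l2_loss"), value=l2_loss.item())

        return main_loss + l2_loss
```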