ray.rllib.models.modelv2.ModelV2.custom_loss
- ModelV2.custom_loss(policy_loss: numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor, loss_inputs: Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]) → List[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor] | numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor
Override to customize the loss function used to optimize this model.
This can be used to incorporate self-supervised losses (by defining a loss over existing input and output tensors of this model), and supervised losses (by defining losses over a variable-sharing copy of this model’s layers).
You can find a runnable example in examples/custom_loss.py.
- Parameters:
policy_loss – A single policy loss, or a list of policy losses, from the policy.
loss_inputs – Map of input placeholders for rollout data.
- Returns:
A scalar tensor, or a list of scalar tensors, for the customized loss(es) of this model.
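The list-or-single contract described above can be sketched without any deep learning framework. The class below is a hypothetical stand-in, not a ModelV2 subclass: `AuxLossSketch`, `aux_weight`, `auxiliary_loss`, and the use of an "obs" key are all assumptions made for illustration, and plain Python floats stand in for tensors. In a real override the same branching would presumably live in the `custom_loss` method of your own ModelV2 subclass.

```python
from typing import Dict, List, Union

# Plain floats stand in for framework tensors in this sketch.
Loss = Union[float, List[float]]


class AuxLossSketch:
    """Hypothetical stand-in for a ModelV2 subclass (not the RLlib class)."""

    # Assumed weighting of the auxiliary term relative to the policy loss.
    aux_weight = 0.5

    def auxiliary_loss(self, loss_inputs: Dict[str, List[float]]) -> float:
        # Assumed self-supervised term: mean squared value of the entries
        # stored under an "obs" key in the rollout data.
        obs = loss_inputs["obs"]
        return sum(x * x for x in obs) / len(obs)

    def custom_loss(self, policy_loss: Loss, loss_inputs: Dict[str, List[float]]) -> Loss:
        aux = self.aux_weight * self.auxiliary_loss(loss_inputs)
        # Mirror the input form: a list of losses in gives a list out,
        # a single loss in gives a single loss out.
        if isinstance(policy_loss, list):
            return [loss + aux for loss in policy_loss]
        return policy_loss + aux


model = AuxLossSketch()
rollout = {"obs": [1.0, 3.0]}
single = model.custom_loss(1.0, rollout)
several = model.custom_loss([1.0, 2.0], rollout)
```

Adding the same weighted auxiliary term to every entry is only one possible design; the method could just as well append a separate loss or weight each entry differently, as long as the return value is a scalar loss or a list of them.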