ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2.loss
- EagerTFPolicyV2.loss(model: ModelV2 | tf.keras.Model, dist_class: Type[TFActionDistribution], train_batch: SampleBatch) → numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor | List[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]
Compute the loss for this policy using the model, dist_class, and train_batch.
- Parameters:
model – The Model to calculate the loss for.
dist_class – The action distribution class.
train_batch – The training data.
- Returns:
A single loss tensor or a list of loss tensors.
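The contract above (take a model, a distribution class, and a training batch; return one loss value) can be illustrated with a minimal NumPy sketch. It does not use RLlib or TensorFlow: `ToyCategorical`, `toy_model`, the plain-dict batch, and its keys `"obs"`, `"actions"`, and `"advantages"` are hypothetical stand-ins chosen for illustration, and the policy-gradient formula is only one possible loss, not the one any particular RLlib policy implements.

```python
import numpy as np


class ToyCategorical:
    """Hypothetical stand-in for an action distribution class (not an RLlib class)."""

    def __init__(self, logits, model=None):
        self.logits = np.asarray(logits, dtype=np.float64)

    def logp(self, actions):
        # Numerically stable log-softmax over the last axis,
        # then select the entry of the action that was taken.
        shifted = self.logits - self.logits.max(axis=-1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
        return log_probs[np.arange(len(actions)), actions]


def toy_model(train_batch):
    """Hypothetical model: a fixed linear map from observations to logits.

    Returns (logits, state) so the call site resembles a model forward pass.
    """
    weights = np.array([[1.0, -1.0], [0.5, 0.5]])
    return train_batch["obs"] @ weights, []


def loss(model, dist_class, train_batch):
    """Policy-gradient style loss with the (model, dist_class, train_batch) signature."""
    logits, _state = model(train_batch)
    action_dist = dist_class(logits, model)
    logp = action_dist.logp(train_batch["actions"])
    # Single scalar loss: negative advantage-weighted log-likelihood.
    return -np.mean(logp * train_batch["advantages"])


# Assumed batch layout; a real SampleBatch has its own column names.
train_batch = {
    "obs": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "actions": np.array([0, 1]),
    "advantages": np.array([1.0, 2.0]),
}
policy_loss = loss(toy_model, ToyCategorical, train_batch)
```

In an actual subclass of EagerTFPolicyV2, the same three arguments would be the policy's model, a TFActionDistribution subclass, and a SampleBatch, and the computation would use TensorFlow ops so that gradients can flow. A method returning several losses would return them as a list instead of a single value.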