ray.rllib.utils.policy.compute_log_likelihoods_from_input_dict

ray.rllib.utils.policy.compute_log_likelihoods_from_input_dict(policy: Policy, batch: SampleBatch | Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor | dict | tuple])

Returns the log-likelihoods of the actions in the given batch under the given policy.

Computes the log-likelihoods by passing the batch's observations and actions to the policy's compute_log_likelihoods() method.

Parameters:

policy – The Policy whose compute_log_likelihoods() method is called.

batch – The SampleBatch or MultiAgentBatch from which to compute action log-likelihoods. The batch (or each batch) must contain the OBS and ACTIONS keys.

Returns:

The log-likelihoods of the actions in the batch, given the observations and the policy.
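
Example:

The following is a minimal sketch of the behavior described above, not RLlib's actual implementation. It assumes a hypothetical stand-in class, ToyPolicy, in place of a real Policy, and a plain dict in place of a SampleBatch, so that it runs without Ray installed. The helper log_likelihoods_from_batch is likewise hypothetical, and the string keys "obs" and "actions" are assumed to correspond to the OBS and ACTIONS keys. Going by the function's name and summary, the returned values are treated here as log-likelihoods, so exponentiating them is what yields probabilities.

```python
import math


class ToyPolicy:
    """Hypothetical stand-in for an RLlib Policy (not the real class).

    Models a fixed categorical distribution over discrete actions that
    ignores the observation.
    """

    def __init__(self, action_probs):
        self.action_probs = action_probs

    def compute_log_likelihoods(self, actions, obs_batch, **kwargs):
        # One log-probability per (observation, action) pair.
        return [math.log(self.action_probs[a]) for a in actions]


def log_likelihoods_from_batch(policy, batch):
    """Simplified illustration of the documented behavior (hypothetical helper)."""
    # The batch must contain both keys, as the parameter description requires.
    for key in ("obs", "actions"):
        if key not in batch:
            raise KeyError(f"batch is missing required key: {key!r}")
    return policy.compute_log_likelihoods(
        actions=batch["actions"], obs_batch=batch["obs"]
    )


policy = ToyPolicy(action_probs=[0.5, 0.25, 0.25])
batch = {"obs": [[0.0, 1.0], [1.0, 0.0]], "actions": [0, 2]}

log_likelihoods = log_likelihoods_from_batch(policy, batch)

# Exponentiate to recover per-action probabilities from log-likelihoods.
probabilities = [math.exp(ll) for ll in log_likelihoods]
```

With a real Policy and SampleBatch, the call would be compute_log_likelihoods_from_input_dict(policy, batch) with the same two arguments.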