ray.rllib.policy.torch_policy_v2.TorchPolicyV2.action_distribution_fn

TorchPolicyV2.action_distribution_fn(model: ray.rllib.models.modelv2.ModelV2, *, obs_batch: Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], state_batches: Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], **kwargs) → Tuple[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor], type, List[Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]]

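Action distribution function for this Policy. Given the model, a batch of observations, and the current state batches, it returns the inputs for the action distribution, the action distribution class, and the output states.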

Parameters
  • model – Underlying model.

  • obs_batch – Observation tensor batch.

  • state_batches – Batch of state tensors used for action sampling.

Returns

A tuple of (distribution inputs, ActionDistribution class, state outs).
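
The three-part return contract described above can be illustrated with a minimal sketch. It deliberately avoids importing RLlib or PyTorch: `ToyModel` and `ToyCategorical` are hypothetical stand-ins (not RLlib classes), plain Python lists stand in for tensors, and the function is written as a free function rather than a method. In a real policy, this logic would presumably live in a method of the same name on a `TorchPolicyV2` subclass and operate on `torch.Tensor` batches.

```python
from typing import Any, List, Tuple


class ToyCategorical:
    """Hypothetical stand-in for an ActionDistribution class (not part of RLlib)."""

    def __init__(self, inputs: List[List[float]]):
        self.inputs = inputs  # one row of logits per batch item

    def deterministic_sample(self) -> List[int]:
        # Pick the index of the highest logit in each row.
        return [row.index(max(row)) for row in self.inputs]


class ToyModel:
    """Hypothetical stand-in for a ModelV2: maps observations to logits."""

    def __call__(self, obs_batch, state_batches):
        # Two actions per observation: logits are (+sum, -sum) of the features.
        logits = [[sum(obs), -sum(obs)] for obs in obs_batch]
        # Stateless toy model: states pass through unchanged.
        return logits, state_batches


def action_distribution_fn(
    model: Any, *, obs_batch, state_batches, **kwargs
) -> Tuple[Any, type, List[Any]]:
    """Return (distribution inputs, distribution class, state outs)."""
    dist_inputs, state_out = model(obs_batch, state_batches)
    # Note: the second element is the class itself, not an instance.
    return dist_inputs, ToyCategorical, state_out


dist_inputs, dist_class, state_out = action_distribution_fn(
    ToyModel(), obs_batch=[[1.0, 2.0], [-3.0, 0.5]], state_batches=[]
)
# The caller instantiates the returned class with the returned inputs.
actions = dist_class(dist_inputs).deterministic_sample()
print(actions)
```

The second element of the tuple is a class rather than an instance, matching the `type` entry in the return annotation; the caller is expected to build the distribution from the returned inputs.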