Building Custom Policy Classes


As of Ray 1.9, using the build_policy_class() or build_tf_policy() utility functions to create custom Policy sub-classes is no longer recommended. Instead, follow the guidelines here and sub-class one of the built-in types directly: DynamicTFPolicy or TorchPolicy.

To create a custom Policy, sub-class Policy (for a generic, framework-agnostic policy), TorchPolicy (for a PyTorch-specific policy), or DynamicTFPolicy (for a TensorFlow-specific policy) and override one or more of its methods, in particular:

  • compute_actions_from_input_dict()

  • postprocess_trajectory()

  • loss()

See here for an example of how to override TorchPolicy.
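
The override pattern described above can be sketched without Ray installed. In the snippet below, StandInPolicy is a hypothetical stand-in written for this illustration, not RLlib's Policy class; it only mirrors the three method names listed above. The argument names, the plain-dict batches, and the toy ConstantActionPolicy are likewise assumptions. The real classes work on sample batches and framework tensors, and their exact signatures should be checked against the installed Ray version.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Batch = Dict[str, List[Any]]  # plain-dict stand-in for a sample batch (assumption)


class StandInPolicy:
    """Hypothetical stand-in for a Policy base class; NOT RLlib's real class."""

    def compute_actions_from_input_dict(
        self, input_dict: Batch, explore: Optional[bool] = None, **kwargs: Any
    ) -> Tuple[List[int], List[Any], Dict[str, Any]]:
        raise NotImplementedError

    def postprocess_trajectory(
        self, sample_batch: Batch, other_agent_batches: Any = None, episode: Any = None
    ) -> Batch:
        return sample_batch  # default: leave the trajectory unchanged

    def loss(
        self, model: Callable[[Any], float], dist_class: Any, train_batch: Batch
    ) -> float:
        raise NotImplementedError


class ConstantActionPolicy(StandInPolicy):
    """Toy policy that overrides all three methods."""

    def __init__(self, action: int = 0, gamma: float = 0.5) -> None:
        self.action = action
        self.gamma = gamma

    def compute_actions_from_input_dict(self, input_dict, explore=None, **kwargs):
        # One action per observation; no recurrent state, no extra outputs.
        actions = [self.action for _ in input_dict["obs"]]
        return actions, [], {}

    def postprocess_trajectory(self, sample_batch, other_agent_batches=None, episode=None):
        # Add discounted returns, accumulated backwards over the rewards.
        returns, running = [], 0.0
        for reward in reversed(sample_batch["rewards"]):
            running = reward + self.gamma * running
            returns.append(running)
        return {**sample_batch, "returns": returns[::-1]}

    def loss(self, model, dist_class, train_batch):
        # Mean squared error between the model's value estimates and the returns.
        errors = [
            (model(obs) - ret) ** 2
            for obs, ret in zip(train_batch["obs"], train_batch["returns"])
        ]
        return sum(errors) / len(errors)


policy = ConstantActionPolicy(action=1, gamma=0.5)
batch = {"obs": [0, 1, 2], "rewards": [1.0, 1.0, 1.0]}
actions, state_outs, extra = policy.compute_actions_from_input_dict(batch)
processed = policy.postprocess_trajectory(batch)
loss_value = policy.loss(lambda obs: 0.0, None, processed)
```

Each override keeps the parent's call signature, so code that only knows the base class can call the sub-class unchanged. With an actual TorchPolicy or DynamicTFPolicy sub-class the same three hooks would be overridden, but loss() would return a framework tensor rather than a Python float.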