ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2
class ray.rllib.policy.eager_tf_policy_v2.EagerTFPolicyV2(observation_space: gymnasium.spaces.Space, action_space: gymnasium.spaces.Space, config: dict, **kwargs)
Bases: Policy
A TF-eager / TF2-based TensorFlow policy.
This class is intended to be used and extended by sub-classing.
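The sub-classing pattern can be sketched roughly as follows. This is a minimal illustration only: to stay runnable without Ray or TensorFlow, it uses a hypothetical stand-in base class (`PolicyBase`) instead of importing `EagerTFPolicyV2`. The constructor arguments and the `loss(model, dist_class, train_batch)` hook mirror the signature and method summary on this page. The plain-Python loss, the `"targets"` batch key, and the lambda model are assumptions made for the example.

```python
# Stand-in sketch: no RLlib import, so it runs without Ray or TensorFlow.
# `PolicyBase` is a hypothetical placeholder, NOT the real EagerTFPolicyV2.


class PolicyBase:
    """Hypothetical stand-in mirroring the constructor signature above."""

    def __init__(self, observation_space, action_space, config, **kwargs):
        self.observation_space = observation_space
        self.action_space = action_space
        self.config = config

    def loss(self, model, dist_class, train_batch):
        # In this sketch, the base class leaves the loss to subclasses.
        raise NotImplementedError


class MeanSquaredErrorPolicy(PolicyBase):
    """Subclass that overrides the `loss` hook."""

    def loss(self, model, dist_class, train_batch):
        # `model` is any callable mapping one observation to a prediction.
        # `dist_class` is unused here; a real policy would build an action
        # distribution from the model outputs.
        predictions = [model(obs) for obs in train_batch["obs"]]
        targets = train_batch["targets"]  # hypothetical batch key
        squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
        return sum(squared) / len(squared)


policy = MeanSquaredErrorPolicy(
    observation_space=None, action_space=None, config={}
)
batch = {"obs": [1.0, 2.0, 3.0], "targets": [2.0, 4.0, 7.0]}
loss_value = policy.loss(
    model=lambda x: 2.0 * x, dist_class=None, train_batch=batch
)
print(loss_value)
```

With the real class, the subclass would derive from `EagerTFPolicyV2` and return a TensorFlow tensor from `loss`; the other hooks summarized under Methods are overridden in the same way.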
Methods
Action distribution function for this Policy.
Custom function for sampling new actions given policy.
Calls the given function with this Policy instance.
Gradients computing function (from loss tensor, using local optimizer).
Gradients computing function (from loss tensor, using local optimizer).
Computes and returns a single (B=1) action value.
Exports Policy checkpoint to a local directory and returns an AIR Checkpoint.
Extra values to fetch and return from compute_actions().
Extra stats to be reported after gradient computation.
Creates new Policy instance(s) from a given Policy or Algorithm checkpoint.
Recovers a Policy from a state object.
Get batch divisibility request.
Get metrics on timing from connectors.
Returns the computer's network name.
Returns the number of currently loaded samples in the given buffer.
Returns tf.Session object to use for computing actions or None.
Gradient stats function.
Imports Policy from local file.
Maximal view requirements dict for learn_on_batch() and compute_actions calls.
Samples a batch from given replay actor and performs an update.
Runs a single step of SGD on data already loaded into a buffer.
Bulk-loads the given SampleBatch into the devices' memories.
Compute loss for this policy using model, dist_class and a train_batch.
Build underlying model for this Policy.
Removes a time dimension for recurrent RLModules.
Called on an update to global vars.
TF optimizer to use for policy optimization.
Postprocesses a trajectory in the format of a SampleBatch.
Reset action- and agent-connectors for this policy.
Restore agent and action connectors if configs available.
Stats function.
Return the list of all savable variables for this policy.