ray.rllib.evaluation.worker_set.WorkerSet.add_policy

WorkerSet.add_policy(policy_id: str, policy_cls: Optional[Type[ray.rllib.policy.policy.Policy]] = None, policy: Optional[ray.rllib.policy.policy.Policy] = None, *, observation_space: Optional[gymnasium.spaces.Space] = None, action_space: Optional[gymnasium.spaces.Space] = None, config: Optional[Union[AlgorithmConfig, dict]] = None, policy_state: Optional[Dict[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor, dict, tuple]]] = None, policy_mapping_fn: Optional[Callable[[Any, int], str]] = None, policies_to_train: Optional[Union[Container[str], Callable[[str, Optional[Union[SampleBatch, MultiAgentBatch]]], bool]]] = None, module_spec: Optional[ray.rllib.core.rl_module.rl_module.SingleAgentRLModuleSpec] = None, workers: Optional[List[Union[ray.rllib.evaluation.rollout_worker.RolloutWorker, ray.actor.ActorHandle]]] = -1) → None

Adds a policy to this WorkerSet’s workers or a specific list of workers.

Parameters
  • policy_id – ID of the policy to add.

  • policy_cls – The Policy class to use for constructing the new Policy. Note: Exactly one of policy_cls or policy must be provided.

  • policy – The Policy instance to add to this WorkerSet. If not None, the given Policy object is inserted directly into the local worker, and clones of it are created on all remote workers. Note: Exactly one of policy_cls or policy must be provided.

  • observation_space – The observation space of the policy to add. If None, try to infer this space from the environment.

  • action_space – The action space of the policy to add. If None, try to infer this space from the environment.

  • config – The config object or overrides for the policy to add.

  • policy_state – Optional state dict to apply to the new policy instance, right after its construction.

  • policy_mapping_fn – An optional (updated) policy mapping function to use from here on. Note that episodes that are already ongoing will not change their mapping; they keep using the old mapping until they end.

  • policies_to_train – An optional list of policy IDs to be trained, or a callable taking a PolicyID and a SampleBatchType and returning a bool (trainable or not?). If None, the existing setup is kept in place. Policies whose IDs are not in the list (or for which the callable returns False) will not be updated.

  • module_spec – With the new RLModule API, the module spec for the module to be added must be passed in as well; knowing the policy spec alone is not sufficient.

  • workers – A list of RolloutWorkers or ActorHandles (remote RolloutWorkers) to add this policy to. If given, the policy is added only to these workers.
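The policy_mapping_fn and policies_to_train arguments above are plain Python callables, so they can be sketched without Ray itself. A minimal illustration (the policy IDs "new_policy" and "learned_policy", and the even/odd routing rule, are made up for this example and are not part of RLlib):

```python
# Sketch of the two callable arguments accepted by add_policy().
# Policy IDs here are illustrative only.

def policy_mapping_fn(agent_id, episode, **kwargs):
    # Route even-numbered agents to the newly added policy;
    # all other agents keep the pre-existing policy.
    # Per the docstring, ongoing episodes keep their old mapping
    # until they finish.
    return "new_policy" if int(agent_id) % 2 == 0 else "learned_policy"

def policies_to_train(policy_id, batch=None):
    # Return True only for policies that the learner should update;
    # here, only the newly added policy is trained.
    return policy_id == "new_policy"
```

These could then be passed as the policy_mapping_fn and policies_to_train keyword arguments when calling add_policy.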

Raises

KeyError – If the given policy_id already exists in this WorkerSet.