AlgorithmConfig.get_marl_module_spec(*, policy_dict: Dict[str, ray.rllib.policy.policy.PolicySpec], single_agent_rl_module_spec: Optional[ray.rllib.core.rl_module.rl_module.SingleAgentRLModuleSpec] = None) → ray.rllib.core.rl_module.marl_module.MultiAgentRLModuleSpec

Returns the MultiAgentRLModule spec based on the given policy spec dict.

policy_dict may be a partial dict that contains only the policies to be converted into an equivalent multi-agent RLModule spec.

  • policy_dict – The policy spec dict. From this dict, the inferred values of observation_space, action_space, and config are determined for each policy. If the module spec does not specify these values, they are auto-filled with the values obtained from the policy spec dict. This relies on the policy’s logic for inferring these values from other sources of information (e.g., the environment).

  • single_agent_rl_module_spec – The SingleAgentRLModuleSpec to use for constructing a MultiAgentRLModuleSpec. If None, the already configured spec (self.rl_module_spec) or the default ModuleSpec for this algorithm (self.get_default_rl_module_spec()) will be used.
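
The auto-fill behavior described for policy_dict can be illustrated with a minimal, self-contained sketch. It does not import or call RLlib: ToyPolicySpec, ToyModuleSpec, and fill_module_specs are hypothetical stand-ins invented for this example, and the spaces are placeholder strings rather than real space objects. It only shows the idea that values already set on the module spec are kept, while unset ones are taken from each policy's spec. The actual method may differ in its details.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ToyPolicySpec:
    """Hypothetical stand-in for PolicySpec (not the RLlib class)."""
    observation_space: Any
    action_space: Any
    config: Dict[str, Any]


@dataclass(frozen=True)
class ToyModuleSpec:
    """Hypothetical stand-in for SingleAgentRLModuleSpec (not the RLlib class)."""
    module_class: Optional[str] = None
    observation_space: Any = None
    action_space: Any = None
    config: Optional[Dict[str, Any]] = None


def fill_module_specs(
    policy_dict: Dict[str, ToyPolicySpec], base_spec: ToyModuleSpec
) -> Dict[str, ToyModuleSpec]:
    """Build one module spec per policy, filling only the unset fields."""
    filled = {}
    for policy_id, policy_spec in policy_dict.items():
        spec = base_spec
        # Each field is filled from the policy spec only if it is unset.
        if spec.observation_space is None:
            spec = replace(spec, observation_space=policy_spec.observation_space)
        if spec.action_space is None:
            spec = replace(spec, action_space=policy_spec.action_space)
        if spec.config is None:
            spec = replace(spec, config=dict(policy_spec.config))
        filled[policy_id] = spec
    return filled


# Placeholder strings stand in for real observation/action spaces.
policy_dict = {
    "pursuer": ToyPolicySpec("Box(4,)", "Discrete(2)", {"lr": 1e-3}),
    "evader": ToyPolicySpec("Box(6,)", "Discrete(3)", {"lr": 5e-4}),
}

# The base spec sets action_space explicitly, so that value is preserved.
base = ToyModuleSpec(module_class="MyModule", action_space="Discrete(5)")
marl_spec = fill_module_specs(policy_dict, base)
```

In this sketch, each entry of marl_spec receives its observation_space and config from the matching policy spec, while the explicitly set action_space on the base spec is left unchanged. Passing a policy_dict with only some of the policies yields specs for just those policies, which mirrors the "partial dict" case mentioned above.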