EnvRunnerGroup.sync_weights(policies: List[str] | None = None, from_worker_or_learner_group: EnvRunner | LearnerGroup | None = None, to_worker_indices: List[int] | None = None, global_vars: Dict[str, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor] | None = None, timeout_seconds: float | None = 0.0, inference_only: bool | None = False) → None

Syncs model weights from the given weight source to all remote workers.

The weight source can be either a (local) rollout worker or a learner_group. It only needs to implement a get_weights method.

  • policies – Optional list of PolicyIDs to sync weights for. If None (default), sync weights to/from all policies.

  • from_worker_or_learner_group – Optional (local) EnvRunner instance or LearnerGroup instance to sync from. If None (default), sync from this EnvRunnerGroup’s local worker.

  • to_worker_indices – Optional list of worker indices to sync the weights to. If None (default), sync to all remote workers.

  • global_vars – An optional global vars dict to set on each worker that is synced. If None (default), the global_vars are not updated.

  • timeout_seconds – Timeout in seconds to wait for the sync weights calls to complete. Default is 0 (sync-and-forget: do not wait for any sync calls to finish). Not waiting for the calls to finish significantly improves algorithm performance.

  • inference_only – Sync weights with workers that keep inference-only modules. This is needed for algorithms in the new stack that use inference-only modules. In this case, only a subset of the parameters is synced to the workers. Default is False.
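The sketch below illustrates how the policies and to_worker_indices filters described above combine with a weight source that only exposes get_weights. It is a toy model in plain Python, not RLlib's implementation: ToyWeightSource, ToyWorker, toy_sync_weights, and the set_weights method on the toy worker are hypothetical names introduced for illustration, and the sketch ignores global_vars, timeout_seconds, and inference_only.

```python
from typing import Dict, List, Optional

Weights = Dict[str, List[float]]


class ToyWeightSource:
    """Hypothetical weight source: any object exposing get_weights()."""

    def __init__(self, weights: Weights) -> None:
        self._weights = weights

    def get_weights(self, policies: Optional[List[str]] = None) -> Weights:
        # Return weights for the requested policy IDs, or for all if None.
        if policies is None:
            return dict(self._weights)
        return {pid: w for pid, w in self._weights.items() if pid in policies}


class ToyWorker:
    """Hypothetical stand-in for a remote worker; it stores what it receives."""

    def __init__(self) -> None:
        self.weights: Weights = {}

    def set_weights(self, weights: Weights) -> None:
        self.weights.update(weights)


def toy_sync_weights(
    source: ToyWeightSource,
    workers: Dict[int, ToyWorker],
    policies: Optional[List[str]] = None,
    to_worker_indices: Optional[List[int]] = None,
) -> None:
    """Toy analogue of the documented behavior (assumption, not RLlib code)."""
    # Fetch the weights once from the source, restricted to `policies`.
    weights = source.get_weights(policies)
    for index, worker in workers.items():
        # None means "all workers"; otherwise only the listed indices.
        if to_worker_indices is None or index in to_worker_indices:
            worker.set_weights(weights)


source = ToyWeightSource({"p0": [1.0, 2.0], "p1": [3.0]})
workers = {1: ToyWorker(), 2: ToyWorker(), 3: ToyWorker()}

# Sync only policy "p0", and only to workers 1 and 3.
toy_sync_weights(source, workers, policies=["p0"], to_worker_indices=[1, 3])
```

In this toy run, workers 1 and 3 end up holding the weights for "p0" only, while worker 2 is left untouched. Leaving both filters at None would copy every policy's weights to every worker.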

DeveloperAPI: This API may change across minor Ray releases.