ray.rllib.algorithms.algorithm_config.AlgorithmConfig

class ray.rllib.algorithms.algorithm_config.AlgorithmConfig(algo_class: type | None = None)

Bases: _Config

An RLlib AlgorithmConfig builds an RLlib Algorithm from a given configuration.

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.callbacks import MemoryTrackingCallbacks
# Construct a generic config object, specifying values within different
# sub-categories, e.g. "training".
config = (
    PPOConfig()
    .training(gamma=0.9, lr=0.01)
    .environment(env="CartPole-v1")
    .resources(num_gpus=0)
    .env_runners(num_env_runners=0)
    .callbacks(MemoryTrackingCallbacks)
)
# A config object can be used to construct the respective Algorithm.
rllib_algo = config.build()

from ray.rllib.algorithms.ppo import PPOConfig
from ray import tune
# In combination with a tune.grid_search:
config = PPOConfig()
config.training(lr=tune.grid_search([0.01, 0.001]))
# Use `to_dict()` method to get the legacy plain python config dict
# for usage with `tune.Tuner().fit()`.
tune.Tuner("PPO", param_space=config.to_dict())
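
The chained calls in the examples above work because each setter returns the config object itself. The following is a minimal, library-free sketch of that builder pattern. `SketchConfig` and its fields are hypothetical stand-ins for illustration, not RLlib's implementation, and the sketch assumes that arguments left unspecified keep their current values.

```python
from typing import Any, Dict, Optional


class SketchConfig:
    """Toy stand-in for a builder-style config (hypothetical, not RLlib code)."""

    def __init__(self) -> None:
        # Default values for this sketch only.
        self.env: Optional[str] = None
        self.gamma: float = 0.99
        self.lr: float = 0.001

    def environment(self, env: Optional[str] = None) -> "SketchConfig":
        # Only overwrite settings the caller actually passed.
        if env is not None:
            self.env = env
        return self  # Returning self is what makes chaining possible.

    def training(
        self, gamma: Optional[float] = None, lr: Optional[float] = None
    ) -> "SketchConfig":
        if gamma is not None:
            self.gamma = gamma
        if lr is not None:
            self.lr = lr
        return self

    def to_dict(self) -> Dict[str, Any]:
        # Flatten all settings into a plain dict.
        return dict(vars(self))


# Each call mutates and returns the same object, so calls can be chained.
config = SketchConfig().training(gamma=0.9).environment(env="CartPole-v1")
print(config.to_dict())
```

Because `lr` was not passed to `training()` in this sketch, it keeps its default while `gamma` and `env` are overridden.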

Methods

__init__

Initializes an AlgorithmConfig instance.

api_stack

Sets the config's API stack settings.

build

Builds an Algorithm from this AlgorithmConfig (or a copy thereof).

build_learner

Builds and returns a new Learner object based on settings in self.

build_learner_group

Builds and returns a new LearnerGroup object based on settings in self.

callbacks

Sets the callbacks configuration.

checkpointing

Sets the config's checkpointing settings.

copy

Creates a deep copy of this config and (un)freezes if necessary.

debugging

Sets the config's debugging settings.

env_runners

Sets the EnvRunner (rollout worker) configuration.

environment

Sets the config's RL-environment settings.

evaluation

Sets the config's evaluation settings.

experimental

Sets the config's experimental settings.

fault_tolerance

Sets the config's fault tolerance settings.

framework

Sets the config's DL framework settings.

freeze

Freezes this config object, such that no attributes can be set anymore.

from_dict

Creates an AlgorithmConfig from a legacy python config dict.

get

Shim method to help pretend we are a dict.

get_config_for_module

Returns an AlgorithmConfig object, specific to the given module ID.

get_default_learner_class

Returns the Learner class to use for this algorithm.

get_default_rl_module_spec

Returns the RLModule spec to use for this algorithm.

get_evaluation_config_object

Creates a full AlgorithmConfig object from self.evaluation_config.

get_multi_agent_setup

Compiles complete multi-agent config (dict) from the information in self.

get_multi_rl_module_spec

Returns the MultiRLModuleSpec based on the given env/spaces.

get_rl_module_spec

Returns the RLModuleSpec based on the given env/spaces.

get_rollout_fragment_length

Automatically infers a proper rollout_fragment_length setting if "auto".

get_torch_compile_worker_config

Returns the TorchCompileConfig to use on workers.

is_multi_agent

Returns whether this config specifies a multi-agent setup.

items

Shim method to help pretend we are a dict.

keys

Shim method to help pretend we are a dict.

learners

Sets LearnerGroup and Learner worker related configurations.

multi_agent

Sets the config's multi-agent settings.

offline_data

Sets the config's offline data settings.

overrides

Generates and validates a set of config key/value pairs (passed via kwargs).

pop

Shim method to help pretend we are a dict.

python_environment

Sets the config's python environment settings.

reporting

Sets the config's reporting settings.

resources

Specifies resources allocated for an Algorithm and its ray actors/workers.

rl_module

Sets the config's RLModule settings.

serialize

Returns a mapping from str to JSON'able values representing this config.

to_dict

Converts all settings into a legacy config dict for backward compatibility.

training

Sets the training related configuration.

update_from_dict

Modifies this AlgorithmConfig via the provided python config dict.

validate

Validates all values in this config.

validate_train_batch_size_vs_rollout_fragment_length

Detects mismatches for train_batch_size vs rollout_fragment_length.

values

Shim method to help pretend we are a dict.
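
The `freeze`, `copy`, and dict-style shim methods listed above can be pictured with a small, library-free model. This is a sketch under stated assumptions, not RLlib's implementation: `FreezableConfig` is a hypothetical class, and the behavior of `copy_frozen` here (`None` keeps the current frozen state, `True`/`False` forces it) is an assumption modeled on the "(un)freezes if necessary" summary.

```python
from copy import deepcopy
from typing import Any, List, Optional, Tuple


class FreezableConfig:
    """Toy model of freeze/copy/dict-shim behavior (hypothetical, not RLlib code)."""

    def __init__(self, **settings: Any) -> None:
        # Bypass our own __setattr__ so the flag exists before any check.
        object.__setattr__(self, "_is_frozen", False)
        for key, value in settings.items():
            setattr(self, key, value)

    def __setattr__(self, key: str, value: Any) -> None:
        # Once frozen, reject every attribute assignment.
        if self._is_frozen:
            raise AttributeError(f"Cannot set {key!r} on a frozen config")
        object.__setattr__(self, key, value)

    def freeze(self) -> None:
        object.__setattr__(self, "_is_frozen", True)

    def copy(self, copy_frozen: Optional[bool] = None) -> "FreezableConfig":
        # Deep copy; optionally force the frozen state of the copy.
        new = deepcopy(self)
        if copy_frozen is not None:
            object.__setattr__(new, "_is_frozen", copy_frozen)
        return new

    # Dict-style shims: expose public settings through a dict-like interface.
    def get(self, key: str, default: Any = None) -> Any:
        return getattr(self, key, default)

    def keys(self) -> List[str]:
        return [k for k in vars(self) if not k.startswith("_")]

    def items(self) -> List[Tuple[str, Any]]:
        return [(k, getattr(self, k)) for k in self.keys()]


config = FreezableConfig(lr=0.001, gamma=0.99)
config.freeze()

try:
    config.lr = 0.1  # Rejected: the config is frozen.
except AttributeError as err:
    print("blocked:", err)

# An unfrozen deep copy can be edited without touching the original.
editable = config.copy(copy_frozen=False)
editable.lr = 0.1
print(config.get("lr"), editable.get("lr"), editable.keys())
```

In this sketch the frozen original keeps `lr=0.001` while the unfrozen copy takes the new value, which is the usual reason to pair a freeze step with a copy step.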

Attributes

custom_resources_per_worker

delay_between_worker_restarts_s

evaluation_num_workers

ignore_worker_failures

is_atari

True if the specified env is an Atari env.

learner_class

Returns the Learner sub-class used by this Algorithm.

max_num_worker_restarts

model_config

Defines the model configuration used.

num_consecutive_worker_failures_tolerance

num_cpus_for_local_worker

num_cpus_per_learner_worker

num_cpus_per_worker

num_envs_per_worker

num_gpus_per_learner_worker

num_gpus_per_worker

num_learner_workers

num_rollout_workers

recreate_failed_workers

rl_module_spec

total_train_batch_size

uses_new_env_runners

validate_workers_after_construction

worker_health_probe_timeout_s

worker_restore_timeout_s