Note
Ray 2.40 uses RLlib’s new API stack by default. The Ray team has mostly completed transitioning algorithms, example scripts, and documentation to the new code base.
If you’re still using the old API stack, see the New API stack migration guide for details on how to migrate.
SingleAgentEnvRunner API#
rllib.env.single_agent_env_runner.SingleAgentEnvRunner#
- class ray.rllib.env.single_agent_env_runner.SingleAgentEnvRunner(*, config: AlgorithmConfig, **kwargs)[source]#
The generic environment runner for the single-agent case.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- __init__(*, config: AlgorithmConfig, **kwargs)[source]#
Initializes a SingleAgentEnvRunner instance.
- Parameters:
config – An AlgorithmConfig object containing all settings needed to build this EnvRunner class.
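A minimal construction sketch, assuming the standard PPOConfig builder and a Gymnasium environment registered as "CartPole-v1"; the specific config options shown are illustrative, not required:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner

# Build an AlgorithmConfig carrying the env and rollout settings.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .env_runners(num_envs_per_env_runner=2)
)

# The EnvRunner only needs the config; it builds its (vector) env and
# RLModule internally from these settings.
env_runner = SingleAgentEnvRunner(config=config)
```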
- sample(*, num_timesteps: int = None, num_episodes: int = None, explore: bool = None, random_actions: bool = False, force_reset: bool = False) List[SingleAgentEpisode] [source]#
Runs and returns a sample (n timesteps or m episodes) on the env(s).
- Parameters:
num_timesteps – The number of timesteps to sample during this call. Note that only one of
num_timesteps
ornum_episodes
may be provided.num_episodes – The number of episodes to sample during this call. Note that only one of
num_timesteps
ornum_episodes
may be provided.explore – If True, uses the RLModule’s
forward_exploration()
method to compute actions. If False, uses the RLModule’sforward_inference()
method. If None (default), uses theexplore
boolean setting fromself.config
passed into this EnvRunner’s constructor. You can change this setting in your config viaconfig.env_runners(explore=True|False)
.random_actions – If True, actions are sampled randomly (from the action space of the environment). If False (default), actions or action distribution parameters are computed by the RLModule.
force_reset – Whether to force-reset all (vector) environments before sampling. Useful if you would like to collect a clean slate of new episodes via this call. Note that when sampling n episodes (
num_episodes != None
), this setting is fixed to True.
- Returns:
A list of SingleAgentEpisode instances carrying the sampled data.
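A short usage sketch of sample(), continuing with the env_runner constructed above; the episode helpers accessed at the end (len() and get_return()) are assumed SingleAgentEpisode conveniences:

```python
# Sample exactly 50 timesteps; pass only one of num_timesteps/num_episodes.
episodes = env_runner.sample(num_timesteps=50)

# Or sample two complete episodes with exploration disabled;
# force_reset is implicitly True when num_episodes is given.
episodes = env_runner.sample(num_episodes=2, explore=False)

for episode in episodes:
    # len() and get_return() are assumed SingleAgentEpisode helpers.
    print(f"length={len(episode)} return={episode.get_return()}")
```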
- get_metrics() Dict [source]#
Returns metrics (in any form) for the episodes collected and completed so far.
- Returns:
Metrics of any form.
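A brief sketch of pulling metrics after sampling; treat the returned structure as opaque, since the docstring only promises metrics of any form:

```python
# Collect some complete episodes first so there is something to report.
env_runner.sample(num_episodes=5)

# Returns a dict of episode statistics accumulated since the last call.
metrics = env_runner.get_metrics()
print(metrics)
```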