ray.rllib.core.rl_module.default_model_config.DefaultModelConfig.log_std_clip_param

DefaultModelConfig.log_std_clip_param: float = 20.0

Clip bound for the log(stddev) when using a DiagGaussian action distribution (or any other continuous control distribution). Clipping can stabilize training by preventing very small or very large log(stddev) values, which cause numerical instabilities and can turn outputs into NaN. With the default of 20.0, the log(stddev) is clamped to the range [-20, 20]. Set to float("inf") to disable clamping.
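
The effect of the clamp can be illustrated with a minimal, framework-free sketch. The helper `clamp_log_std` below is hypothetical and is not RLlib's implementation; it only assumes the symmetric clamp to [-clip, clip] described above.

```python
import math


def clamp_log_std(log_std: float, clip: float = 20.0) -> float:
    """Hypothetical helper (not part of RLlib): clamp log(stddev) to [-clip, clip]."""
    return max(-clip, min(clip, log_std))


# Raw log(stddev) values as a policy head might emit them, including extremes.
raw_log_stds = [-150.0, -3.0, 0.5, 150.0]

# With the default bound of 20.0, extreme values are pulled back into range.
clipped = [clamp_log_std(v) for v in raw_log_stds]

# After clamping, exp() yields finite, strictly positive standard deviations.
stddevs = [math.exp(v) for v in clipped]

# A bound of float("inf") leaves every value unchanged, i.e. no clamping.
unclipped = [clamp_log_std(v, clip=float("inf")) for v in raw_log_stds]

print(clipped)
print(unclipped)
```

Without the clamp, exponentiating a value such as 150.0 in 32-bit floating point would overflow, and a value such as -150.0 would underflow to a zero stddev, which is the kind of instability this setting is meant to prevent.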