ray.rllib.core.rl_module.default_model_config.DefaultModelConfig.log_std_clip_param
- DefaultModelConfig.log_std_clip_param: float = 20.0
The bound used to clip log(stddev) when using a DiagGaussian action distribution (or any other continuous control distribution). Clipping can stabilize training by preventing very small or very large log(stddev) values, which cause numerical instabilities that turn outputs into nan. With the default of 20.0, log(stddev) is clamped to the range [-20, 20]. Set to float("inf") for no clamping.
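The effect of the clamp can be illustrated with a minimal, framework-free sketch. The helper `clip_log_std` below is hypothetical and is not RLlib's implementation; it only assumes that the parameter acts as a symmetric bound on log(stddev), as described above.

```python
import math


def clip_log_std(log_std: float, clip_param: float = 20.0) -> float:
    """Hypothetical helper: clamp log_std to [-clip_param, clip_param]."""
    return max(-clip_param, min(clip_param, log_std))


# An extreme raw value, e.g. from a diverging network output.
raw_log_std = 1000.0

# With the default bound, the value is clamped to 20.0, so the
# resulting stddev exp(20.0) is still a finite float.
clipped = clip_log_std(raw_log_std)
std = math.exp(clipped)

# With float("inf") as the bound, the value passes through unchanged.
unclipped = clip_log_std(raw_log_std, float("inf"))
```

In RLlib itself the bound would presumably be set through the config field documented here, for example `DefaultModelConfig(log_std_clip_param=5.0)`, rather than by calling a helper like this one.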