

If you encounter errors such as "blas_thread_init: pthread_create: Resource temporarily unavailable" when using many workers, try setting OMP_NUM_THREADS=1. For other resource-limit errors, check the configured system limits with ulimit -a.
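As a minimal sketch, the thread cap can also be set from Python, and the limits reported by ulimit -a can be read with the standard-library resource module. The helper name describe_limits and its dictionary labels are illustrative, not part of Ray or RLlib, and the resource module is only available on Unix-like systems.

```python
import os

# Set this before importing NumPy or other BLAS/OpenMP-backed libraries;
# such libraries typically read the variable when they are first loaded.
os.environ["OMP_NUM_THREADS"] = "1"


def describe_limits():
    """Return (soft, hard) limits for processes and open files.

    Hypothetical helper for illustration; Unix only.
    """
    import resource  # not available on Windows

    names = {
        "max_user_processes": resource.RLIMIT_NPROC,
        "open_files": resource.RLIMIT_NOFILE,
    }
    return {label: resource.getrlimit(rlim) for label, rlim in names.items()}


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

Exporting OMP_NUM_THREADS=1 in the shell before launching the script has the same effect and also covers worker processes that inherit the environment.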

If you encounter out-of-memory errors, consider setting redis_max_memory and object_store_memory in ray.init() to reduce memory usage.
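The following sketch shows one way to pass these two limits, assuming a Ray version whose ray.init() accepts both keyword arguments with values in bytes. Other Ray releases may not accept redis_max_memory under this name. The helper names and the default sizes are hypothetical values chosen for illustration, not recommendations.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def memory_limited_init_kwargs(object_store_gib=2.0, redis_gib=1.0):
    """Build keyword arguments for ray.init() with explicit memory caps.

    Hypothetical helper; the default sizes are illustrative only.
    """
    return {
        "object_store_memory": int(object_store_gib * GIB),
        "redis_max_memory": int(redis_gib * GIB),
    }


def start_ray_with_limits(**sizes):
    """Start Ray with the caps above (requires Ray to be installed)."""
    import ray  # imported lazily so the helper above works without Ray

    return ray.init(**memory_limited_init_kwargs(**sizes))
```

Calling start_ray_with_limits(object_store_gib=4) would then cap the object store at 4 GiB while keeping the default Redis cap.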

For debugging unexpected hangs or performance problems, you can run ray stack to dump the stack traces of all Ray workers on the current node, ray timeline to dump a timeline visualization of tasks to a file, and ray memory to list all object references in the cluster.
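If you want to collect these diagnostics from a script rather than typing them by hand, a small wrapper along these lines may help. It is a sketch that assumes the ray command is on PATH. The names DIAGNOSTIC_COMMANDS and run_diagnostic are hypothetical, and no command-line flags are passed, so each command runs with its defaults.

```python
import subprocess

# The three commands named in the text, without any extra flags.
DIAGNOSTIC_COMMANDS = {
    "stack": ["ray", "stack"],
    "timeline": ["ray", "timeline"],
    "memory": ["ray", "memory"],
}


def run_diagnostic(name, runner=subprocess.run):
    """Run one diagnostic command and return the runner's result.

    `runner` defaults to subprocess.run; it is a parameter so the
    function can be exercised without Ray installed.
    """
    command = DIAGNOSTIC_COMMANDS[name]
    return runner(command, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    result = run_diagnostic("stack")
    print(result.stdout or result.stderr)
```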

TensorFlow 2.0

RLlib currently runs in tf.compat.v1 mode. This means eager execution is disabled by default, and RLlib imports TF with import tensorflow.compat.v1 as tf; tf.disable_v2_behavior(). Eager execution can be enabled manually by calling tf.enable_eager_execution() or by setting "eager": True in the trainer config.
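The sketch below mirrors the import described above and shows the config fragment for eager mode. The function name import_tf_v1 is hypothetical, and the import is wrapped in a function so the snippet loads even where TensorFlow is not installed. If you call tf.enable_eager_execution() yourself, note that TensorFlow expects it at program startup, before any graph operations are created.

```python
def import_tf_v1():
    """Import TensorFlow in v1-compatibility mode, as described above.

    Hypothetical helper; requires TensorFlow to be installed.
    """
    import tensorflow.compat.v1 as tf

    tf.disable_v2_behavior()  # turns off eager execution and other TF2 defaults
    return tf


# Trainer config fragment that requests eager execution instead.
config = {
    "eager": True,
}
```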