Note

Ray 2.10.0 introduces the alpha stage of RLlib’s “new API stack”. The team is currently transitioning algorithms, example scripts, and documentation to the new code base throughout the subsequent minor releases leading up to Ray 3.0.

See here for more details on how to activate and use the new API stack.

External Environments and Applications

In many situations, it doesn’t make sense for RLlib to “step” an RL environment. One example is training a policy inside a complex simulator that runs its own execution loop, such as a game engine or a robotics simulation. A natural and user-friendly approach is to flip this setup around: instead of RLlib “stepping” the env, the agents in the simulation fully control their own stepping. An external RLlib-powered service is then available either for querying individual actions or for accepting batched sample data. The service handles training the policies, but places no restrictions on when or how often per second the simulation steps.
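The flipped control flow can be sketched in plain Python. This is a minimal illustration under stated assumptions, not RLlib code: `HypotheticalTrainingService`, `submit_batch`, and `run_simulation` are made-up names, the policy is a random placeholder, and the "training update" only increments a version counter.

```python
import random
from dataclasses import dataclass, field


@dataclass
class HypotheticalTrainingService:
    """Stand-in for an external training service (NOT an RLlib class)."""

    weights_version: int = 0
    received: list = field(default_factory=list)

    def submit_batch(self, batch):
        # Accept a batch of transitions and pretend to run a training update.
        self.received.append(batch)
        self.weights_version += 1
        return self.weights_version


def run_simulation(service, num_steps=10, batch_size=4, seed=0):
    """The simulation owns the loop: it decides when to step and when to report."""
    rng = random.Random(seed)
    batch = []
    local_weights_version = 0
    obs = 0.0
    for _ in range(num_steps):
        # The action is computed locally on the client (placeholder random policy).
        action = rng.choice([0, 1])
        next_obs = obs + (1.0 if action == 1 else -1.0)
        reward = -abs(next_obs)
        batch.append((obs, action, reward, next_obs))
        obs = next_obs
        # Only a full batch is handed to the service; it never steps the simulation.
        if len(batch) == batch_size:
            local_weights_version = service.submit_batch(batch)
            batch = []
    return local_weights_version
```

The point of the sketch is the direction of the calls: the simulation calls into the service when it has data, and the service never calls back to advance the simulation.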

Figure: External application with client-side inference. An external simulator (for example, a game engine) connects to RLlib, which runs as a server through a TCP-capable, custom EnvRunner. From time to time, the simulator sends batches of data to the server and in turn receives weight updates. For better performance, actions are computed locally on the client side.

For this purpose, RLlib provides an external messaging protocol called RLlink, as well as the option to customize your EnvRunner class to communicate through RLlink with one or more clients. An example TCP-based EnvRunner implementation with RLlink is available here. It also contains a dummy (CartPole) client that you can use for testing and as a template for how your external application or simulator should use the RLlink protocol.
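To give a feel for what message passing over a stream socket involves, here is a minimal sketch of length-prefixed JSON messages. It does not reproduce the RLlink wire format: the 4-byte length header, the JSON body, and the message type names `SAMPLES` and `WEIGHTS` are all assumptions chosen for illustration. The demo uses `socket.socketpair()` in place of a real TCP connection so that it runs in a single process; the framing logic would be the same on a TCP stream socket.

```python
import json
import socket
import struct


def send_message(sock, msg_type, payload):
    """Send one message: 4-byte big-endian body length, then a JSON body."""
    body = json.dumps({"type": msg_type, "payload": payload}).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, num_bytes):
    """Read exactly num_bytes; a stream socket may return data in pieces."""
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one message and return (msg_type, payload)."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    msg = json.loads(_recv_exact(sock, length).decode("utf-8"))
    return msg["type"], msg["payload"]


def demo_round_trip():
    """Client sends a sample batch; server answers with a (fake) weights update."""
    client, server = socket.socketpair()
    try:
        send_message(client, "SAMPLES", {"rewards": [1.0, 0.5]})
        msg_type, payload = recv_message(server)
        reply = {"version": 1, "num_samples": len(payload["rewards"])}
        send_message(server, "WEIGHTS", reply)
        return msg_type, recv_message(client)
    finally:
        client.close()
        server.close()
```

The explicit length prefix is one common way to recover message boundaries on a byte stream, which has no notion of individual messages by itself.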

Note

External application support is still a work in progress on RLlib’s new API stack. The Ray team is working on more examples of custom EnvRunner implementations (besides the already available TCP-based one) as well as various client-side, non-Python RLlib adapters, for example for popular game engines and other simulation software.

Example: External client connecting to TCP-based EnvRunner

An example TCP-based EnvRunner implementation with RLlink is available here. See here for the full end-to-end example.

Feel free to alter the underlying logic of your custom EnvRunner. For example, you could implement a shared-memory-based communication layer instead of the TCP-based one.
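As a rough idea of what a shared-memory transport could build on, the sketch below passes a list of floats through a named shared memory block using only the Python standard library. Everything about it is an assumption for illustration: the buffer layout (a 32-bit count followed by float64 values) and the helper names are hypothetical, both handles live in one process for brevity, and there is no synchronization between writer and reader, which a real communication layer would need.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: one unsigned 32-bit count, then that many float64 values.
HEADER = struct.Struct("<I")


def write_floats(shm, values):
    """Write the count header followed by the values into the shared block."""
    HEADER.pack_into(shm.buf, 0, len(values))
    struct.pack_into(f"<{len(values)}d", shm.buf, HEADER.size, *values)


def read_floats(shm):
    """Read the count header, then that many float64 values."""
    (count,) = HEADER.unpack_from(shm.buf, 0)
    return list(struct.unpack_from(f"<{count}d", shm.buf, HEADER.size))


def demo_shared_memory(values):
    """Create a block, write to it, and read it back through a second handle."""
    writer = shared_memory.SharedMemory(
        create=True, size=HEADER.size + 8 * len(values)
    )
    try:
        write_floats(writer, values)
        # A second process would attach the same way, using the block's name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return read_floats(reader)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # Free the block once no one needs it anymore.
```

In a real setup, the client and the EnvRunner would each hold one handle in separate processes and would need some signaling mechanism to know when new data is ready to be read.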