Evaluation and Environment Rollout

In RLlib, data ingest, whether through environment rollouts or other data-generating methods (e.g. reading from offline files), is handled by a WorkerSet inside the RLlib Trainer (available under the Trainer's self.workers property), which manages one local RolloutWorker plus a set of parallel remote RolloutWorkers:

[Figure: RolloutWorker class overview (../../_images/rollout_worker_class_overview.svg)]

A typical RLlib WorkerSet setup inside an RLlib Trainer: Each WorkerSet contains exactly one local RolloutWorker object and n remote RolloutWorkers (Ray actors). Each worker holds a policy map (with one or more policies) and, in case a simulator (env) is available, a vectorized BaseEnv (containing m sub-environments) and a SamplerInput (either synchronous or asynchronous) that controls the environment data collection loop. In both the online case (an environment is available) and the offline case (no environment), the Trainer uses the workers' sample() method to obtain SampleBatch objects for training.
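The following minimal sketch illustrates this setup, assuming the Trainer-based RLlib API (pre-2.0, where Trainer has not yet been renamed to Algorithm) and the classic CartPole-v0 Gym environment. It builds a Trainer, then collects SampleBatch objects directly from the local and remote RolloutWorkers of its WorkerSet:

```python
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()

# Building the Trainer creates a WorkerSet under `self.workers`:
# one local RolloutWorker plus `num_workers` remote ones (Ray actors).
trainer = PPOTrainer(env="CartPole-v0", config={"num_workers": 2})

# Collect a SampleBatch from the local worker ...
local_batch = trainer.workers.local_worker().sample()
print(local_batch.count)  # number of env steps collected in this batch

# ... or from the remote workers in parallel (each `sample` call
# runs inside its own Ray actor; `ray.get` fetches the results).
remote_batches = ray.get(
    [w.sample.remote() for w in trainer.workers.remote_workers()]
)
```

Note that during normal training (trainer.train()), these sample() calls are issued for you by the Trainer's execution plan; accessing the workers directly as above is mainly useful for debugging or custom training loops.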