Evaluation and Environment Rollout

RLlib ingests data, whether from environment rollouts or from other data-generating methods (e.g. reading from offline files), through RolloutWorkers. These sit, together with other parallel RolloutWorkers, inside a WorkerSet, which the RLlib Algorithm exposes under its self.workers property:

[Figure: rollout_worker_class_overview.svg]

A typical RLlib WorkerSet setup inside an RLlib Algorithm: Each WorkerSet contains exactly one local RolloutWorker object and n remote RolloutWorkers (Ray actors). Each worker holds a policy map (with one or more policies) and, if a simulator (env) is available, a vectorized BaseEnv (containing m sub-environments) plus a SamplerInput (either synchronous or asynchronous) that controls the environment data collection loop. In both the online case (an environment is available) and the offline case (no environment), the Algorithm uses the workers' sample() method to get SampleBatch objects for training.
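The layout described above can be sketched in plain Python to make the containment relationships concrete. This is a toy illustration only and does not use RLlib's API: ToyEnv, ToyRolloutWorker, ToyWorkerSet, the "default_policy" key, the fragment_length parameter, and the dict-of-lists batch are all hypothetical stand-ins. The "remote" workers here are ordinary objects sampled one after another, whereas in RLlib they would be Ray actors running in parallel. Sampling from the remote workers when any exist, and from the local worker otherwise, is an assumption of this sketch.

```python
import random
from typing import Callable, Dict, List, Optional

Batch = Dict[str, list]  # toy stand-in for a SampleBatch


class ToyEnv:
    """Hypothetical sub-environment: random observations, reward equals the action."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)

    def reset(self) -> float:
        return self.rng.random()

    def step(self, action: int):
        # Returns (next_obs, reward, done); episodes end with probability 0.1.
        return self.rng.random(), float(action), self.rng.random() < 0.1


class ToyRolloutWorker:
    """Holds a policy map and, if an env is available, m sub-environments."""

    def __init__(self, policy_map: Dict[str, Callable[[float], int]],
                 num_envs: int = 0, fragment_length: int = 4,
                 offline_batch: Optional[Batch] = None, seed: int = 0):
        self.policy_map = policy_map
        self.envs: List[ToyEnv] = [ToyEnv(seed * 1000 + i) for i in range(num_envs)]
        self.obs = [env.reset() for env in self.envs]
        self.fragment_length = fragment_length
        self.offline_batch = offline_batch

    def sample(self) -> Batch:
        if not self.envs:
            # Offline case: no environment, so return a copy of the stored data.
            if self.offline_batch is None:
                raise ValueError("worker has neither an env nor offline data")
            return {k: list(v) for k, v in self.offline_batch.items()}
        # Online case: step every sub-environment fragment_length times.
        policy = self.policy_map["default_policy"]
        batch: Batch = {"obs": [], "actions": [], "rewards": [], "dones": []}
        for _ in range(self.fragment_length):
            for i, env in enumerate(self.envs):
                action = policy(self.obs[i])
                next_obs, reward, done = env.step(action)
                batch["obs"].append(self.obs[i])
                batch["actions"].append(action)
                batch["rewards"].append(reward)
                batch["dones"].append(done)
                self.obs[i] = env.reset() if done else next_obs
        return batch


class ToyWorkerSet:
    """Exactly one local worker plus n 'remote' workers (plain objects here)."""

    def __init__(self, policy_map, num_remote_workers: int, num_envs_per_worker: int):
        self.local_worker = ToyRolloutWorker(policy_map, num_envs_per_worker, seed=0)
        self.remote_workers = [
            ToyRolloutWorker(policy_map, num_envs_per_worker, seed=i + 1)
            for i in range(num_remote_workers)
        ]

    def sample(self) -> Batch:
        # Assumption of this sketch: prefer remote workers, else the local one.
        workers = self.remote_workers or [self.local_worker]
        merged: Batch = {}
        for worker in workers:
            for key, values in worker.sample().items():
                merged.setdefault(key, []).extend(values)
        return merged


policy_map = {"default_policy": lambda obs: int(obs > 0.5)}
workers = ToyWorkerSet(policy_map, num_remote_workers=2, num_envs_per_worker=3)
batch = workers.sample()
print(len(batch["obs"]))  # 2 workers * 3 sub-envs * 4 steps = 24
```

In this sketch the caller invokes sample() the same way in the online and the offline case. Only the worker's internals differ, which mirrors the uniform sample() interface described above.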