Evaluation and Environment Rollout
Data ingest via either environment rollouts or other data-generating methods
(e.g. reading from offline files) is done in RLlib by RolloutWorkers, which sit
(together with other parallel RolloutWorkers) inside a WorkerSet in the RLlib
Trainer.
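The layout above can be sketched with toy classes. This is a minimal, hypothetical model of the worker hierarchy (not RLlib's actual implementation; class and method names here are illustrative stand-ins for the RLlib objects named in the text):

```python
class RolloutWorker:
    """Collects experience; in RLlib, the remote copies run as Ray actors."""

    def __init__(self, worker_index: int):
        self.worker_index = worker_index  # 0 denotes the local worker


class WorkerSet:
    """Holds exactly one local worker plus n parallel (remote) workers."""

    def __init__(self, num_workers: int):
        self._local_worker = RolloutWorker(worker_index=0)
        self._remote_workers = [
            RolloutWorker(worker_index=i + 1) for i in range(num_workers)
        ]

    def local_worker(self) -> RolloutWorker:
        return self._local_worker

    def remote_workers(self) -> list:
        return self._remote_workers


class Trainer:
    """Owns the WorkerSet used for data collection."""

    def __init__(self, num_workers: int = 2):
        self.workers = WorkerSet(num_workers)


trainer = Trainer(num_workers=2)
n_remote = len(trainer.workers.remote_workers())  # 2 parallel rollout workers
```

In the real library the remote workers are Ray actors, so their methods are invoked asynchronously; the toy version above only mirrors the ownership structure.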
A typical RLlib WorkerSet setup inside an RLlib Trainer: each WorkerSet holds
exactly one local RolloutWorker object and n remote RolloutWorker objects
(Ray actors). The workers contain a policy map (with one or more policies)
and, in case a simulator (env) is available, a vectorized environment
(containing m sub-environments) and a SamplerInput (either synchronous or
asynchronous), which controls the environment data collection loop.
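The vectorized-environment idea can be illustrated with a short sketch: m trivial sub-environments stepped together in one synchronous call. These classes are toy stand-ins assumed for illustration, not RLlib's actual vectorized-env or sampler classes:

```python
class SubEnv:
    """A trivial sub-environment whose state is a running counter."""

    def __init__(self):
        self.state = 0

    def step(self, action: int):
        self.state += action
        reward = float(action)
        return self.state, reward


class VectorEnv:
    """Steps m sub-environments together, as a vectorized env does."""

    def __init__(self, m: int):
        self.sub_envs = [SubEnv() for _ in range(m)]

    def step(self, actions: list):
        # One action per sub-environment; results come back as a list
        # of (observation, reward) pairs, one per sub-environment.
        return [env.step(a) for env, a in zip(self.sub_envs, actions)]


env = VectorEnv(m=4)
results = env.step([1, 0, 1, 1])  # one action per sub-environment
```

A synchronous sampler would call `step()` in a loop like this; an asynchronous one instead polls sub-environments as their results become available.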
In the online case (an environment is available) as well as the offline case
(no environment), the Trainer uses the workers' sample() method to collect
SampleBatch objects for training.
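Conceptually, sample() runs the data collection loop for some number of steps and returns the transitions as a column-oriented batch. The sketch below uses a toy stand-in for RLlib's SampleBatch (the API shown is illustrative, not RLlib's) with a hard-coded placeholder policy:

```python
class SampleBatch(dict):
    """Column-oriented batch: each key maps to an equal-length list."""

    def count(self) -> int:
        return len(self["obs"])


def sample(num_steps: int) -> SampleBatch:
    """Run a toy environment loop and collect num_steps transitions."""
    obs, actions, rewards = [], [], []
    state = 0
    for t in range(num_steps):
        action = t % 2          # placeholder policy: alternate 0 and 1
        state += action         # toy env dynamics: accumulate actions
        obs.append(state)
        actions.append(action)
        rewards.append(float(action))
    return SampleBatch(obs=obs, actions=actions, rewards=rewards)


batch = sample(num_steps=8)
```

In the offline case the loop body would instead read pre-recorded transitions from file, but the caller still receives the same batch shape either way.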