.. include:: /_includes/rllib/we_are_hiring.rst

.. include:: /_includes/rllib/new_api_stack.rst

.. _evaluation-reference-docs:

Sampling the Environment or Offline Data
==========================================

Data ingestion, whether via environment rollouts or other data-generating methods
(e.g. reading from offline files), is done in RLlib by
:py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker` instances,
which live inside a :py:class:`~ray.rllib.evaluation.worker_set.WorkerSet`
(together with other parallel ``RolloutWorkers``) in the RLlib
:py:class:`~ray.rllib.algorithms.algorithm.Algorithm` (under the ``self.workers`` property):

.. https://docs.google.com/drawings/d/1OewMLAu6KZNon7zpDfZnTh9qiT6m-3M9wnkqWkQQMRc/edit

.. figure:: ../images/rollout_worker_class_overview.svg
    :width: 600
    :align: left

    **A typical RLlib WorkerSet setup inside an RLlib Algorithm:**
    Each :py:class:`~ray.rllib.evaluation.worker_set.WorkerSet` contains exactly one local
    :py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker` object and N remote
    :py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker` objects (Ray actors).
    The workers contain a policy map (with one or more policies) and, if a simulator
    (env) is available, a vectorized :py:class:`~ray.rllib.env.base_env.BaseEnv`
    (containing M sub-environments) plus a
    :py:class:`~ray.rllib.evaluation.sampler.SamplerInput` (either synchronous or
    asynchronous), which controls the environment data collection loop.

In both the online case (an environment is available) and the offline case (no
environment), :py:class:`~ray.rllib.algorithms.algorithm.Algorithm` uses the
:py:meth:`~ray.rllib.evaluation.rollout_worker.RolloutWorker.sample` method to obtain
:py:class:`~ray.rllib.policy.sample_batch.SampleBatch` objects for training.
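For orientation before the reference tables below, here is a minimal sketch of this
setup in action: build an :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`, pull a
:py:class:`~ray.rllib.policy.sample_batch.SampleBatch` from its local
:py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker`, sample the remote
workers in parallel, and broadcast weights back out. Exact config method names
(e.g. ``rollouts()``) can differ across RLlib releases.

.. code-block:: python

    # A minimal sketch, not a fixed recipe: exact config methods can vary
    # across RLlib versions.
    from ray.rllib.algorithms.ppo import PPOConfig
    from ray.rllib.policy.sample_batch import concat_samples

    algo = (
        PPOConfig()
        .environment("CartPole-v1")
        .rollouts(num_rollout_workers=2)
        .build()
    )

    # `algo.workers` is the WorkerSet: one local RolloutWorker plus
    # two remote RolloutWorker actors (per the config above).
    workers = algo.workers

    # Collect a SampleBatch from the local worker's env-interaction loop.
    local_batch = workers.local_worker().sample()
    print(local_batch.env_steps())

    # Or sample all remote workers in parallel ...
    remote_batches = workers.foreach_worker(
        lambda w: w.sample(), local_worker=False
    )
    train_batch = concat_samples(remote_batches)

    # ... and, after learning on `train_batch`, broadcast the local
    # worker's updated weights to all remote workers.
    workers.sync_weights()

    algo.stop()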
.. _rolloutworker-reference-docs:

RolloutWorker API
-----------------

.. currentmodule:: ray.rllib.evaluation.rollout_worker

Constructor
~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    RolloutWorker

Multi-agent
~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.add_policy
    ~RolloutWorker.remove_policy
    ~RolloutWorker.get_policy
    ~RolloutWorker.set_is_policy_to_train
    ~RolloutWorker.set_policy_mapping_fn
    ~RolloutWorker.for_policy
    ~RolloutWorker.foreach_policy
    ~RolloutWorker.foreach_policy_to_train

Setter and getter methods
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.get_filters
    ~RolloutWorker.get_global_vars
    ~RolloutWorker.set_global_vars
    ~RolloutWorker.get_host
    ~RolloutWorker.get_metrics
    ~RolloutWorker.get_node_ip
    ~RolloutWorker.get_weights
    ~RolloutWorker.set_weights
    ~RolloutWorker.get_state
    ~RolloutWorker.set_state

Threading
~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.lock
    ~RolloutWorker.unlock

Sampling API
~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.sample
    ~RolloutWorker.sample_with_count
    ~RolloutWorker.sample_and_learn

Training API
~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.learn_on_batch
    ~RolloutWorker.setup_torch_data_parallel
    ~RolloutWorker.compute_gradients
    ~RolloutWorker.apply_gradients

Environment API
~~~~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.foreach_env
    ~RolloutWorker.foreach_env_with_context

Miscellaneous
~~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~RolloutWorker.stop
    ~RolloutWorker.apply
    ~RolloutWorker.sync_filters
    ~RolloutWorker.find_free_port
    ~RolloutWorker.creation_args
    ~RolloutWorker.assert_healthy

.. _workerset-reference-docs:

WorkerSet API
-------------

.. currentmodule:: ray.rllib.evaluation.worker_set

Constructor
~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    WorkerSet
    WorkerSet.stop
    WorkerSet.reset

Worker Orchestration
~~~~~~~~~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~WorkerSet.add_workers
    ~WorkerSet.foreach_worker
    ~WorkerSet.foreach_worker_with_id
    ~WorkerSet.foreach_worker_async
    ~WorkerSet.fetch_ready_async_reqs
    ~WorkerSet.num_in_flight_async_reqs
    ~WorkerSet.local_worker
    ~WorkerSet.remote_workers
    ~WorkerSet.num_healthy_remote_workers
    ~WorkerSet.num_healthy_workers
    ~WorkerSet.num_remote_worker_restarts
    ~WorkerSet.probe_unhealthy_workers

Pass-through methods
~~~~~~~~~~~~~~~~~~~~

.. autosummary::
    :nosignatures:
    :toctree: doc/

    ~WorkerSet.add_policy
    ~WorkerSet.foreach_env
    ~WorkerSet.foreach_env_with_context
    ~WorkerSet.foreach_policy
    ~WorkerSet.foreach_policy_to_train
    ~WorkerSet.sync_weights

Sampler API
-----------

:py:class:`~ray.rllib.offline.input_reader.InputReader` instances are used to collect
and return experiences from the envs. For more details on ``InputReader`` used for
offline RL (e.g. reading files of pre-recorded data), see the
:ref:`offline RL API reference here <offline-reference-docs>`.

Input Reader API
~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.offline.input_reader

.. autosummary::
    :nosignatures:
    :toctree: doc/

    InputReader
    InputReader.next

Input Sampler API
~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.evaluation.sampler

.. autosummary::
    :nosignatures:
    :toctree: doc/

    SamplerInput
    SamplerInput.get_data
    SamplerInput.get_extra_batches
    SamplerInput.get_metrics

Synchronous Sampler API
~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.evaluation.sampler

.. autosummary::
    :nosignatures:
    :toctree: doc/

    SyncSampler

.. _offline-reference-docs:

Offline Sampler API
~~~~~~~~~~~~~~~~~~~

The InputReader API is used by an individual
:py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker` to produce batches of
experiences, either from a simulator or from an offline source (e.g. a file).
Here are some example extensions of the InputReader API:

JSON reader API
+++++++++++++++

.. currentmodule:: ray.rllib.offline.json_reader

.. autosummary::
    :nosignatures:
    :toctree: doc/

    JsonReader
    JsonReader.read_all_files

Mixed input reader
++++++++++++++++++

.. currentmodule:: ray.rllib.offline.mixed_input

.. autosummary::
    :nosignatures:
    :toctree: doc/

    MixedInput

D4RL reader
+++++++++++

.. currentmodule:: ray.rllib.offline.d4rl_reader

.. autosummary::
    :nosignatures:
    :toctree: doc/

    D4RLReader
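As a small usage sketch for the readers above: a
:py:class:`~ray.rllib.offline.json_reader.JsonReader` can also be used standalone to
iterate over pre-recorded experiences. The path below is a placeholder; point it at
data previously written by an Algorithm configured with an ``output`` directory.

.. code-block:: python

    # A minimal sketch of reading pre-recorded experiences with JsonReader.
    # "/tmp/cartpole-out" is a placeholder path for a directory of JSON
    # sample files written earlier by RLlib's output writer.
    from ray.rllib.offline.json_reader import JsonReader

    reader = JsonReader("/tmp/cartpole-out")

    # Each call to `next()` returns one SampleBatch of logged experiences.
    for _ in range(3):
        batch = reader.next()
        print(batch.env_steps())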
IOContext
~~~~~~~~~

.. currentmodule:: ray.rllib.offline.io_context

.. autosummary::
    :nosignatures:
    :toctree: doc/

    IOContext
    IOContext.default_sampler_input

Policy Map API
--------------

.. currentmodule:: ray.rllib.policy.policy_map

.. autosummary::
    :nosignatures:
    :toctree: doc/

    PolicyMap
    PolicyMap.items
    PolicyMap.keys
    PolicyMap.values

Sample batch API
----------------

.. currentmodule:: ray.rllib.policy.sample_batch

.. autosummary::
    :nosignatures:
    :toctree: doc/

    SampleBatch
    SampleBatch.set_get_interceptor
    SampleBatch.is_training
    SampleBatch.set_training
    SampleBatch.as_multi_agent
    SampleBatch.get
    SampleBatch.to_device
    SampleBatch.right_zero_pad
    SampleBatch.slice
    SampleBatch.split_by_episode
    SampleBatch.shuffle
    SampleBatch.columns
    SampleBatch.rows
    SampleBatch.copy
    SampleBatch.is_single_trajectory
    SampleBatch.is_terminated_or_truncated
    SampleBatch.env_steps
    SampleBatch.agent_steps

MultiAgent batch API
--------------------

.. currentmodule:: ray.rllib.policy.sample_batch

.. autosummary::
    :nosignatures:
    :toctree: doc/

    MultiAgentBatch
    MultiAgentBatch.env_steps
    MultiAgentBatch.agent_steps
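Finally, a small illustrative sketch of the batch containers above. The column data is
made up, and :py:meth:`~ray.rllib.policy.sample_batch.SampleBatch.as_multi_agent` wraps
the batch under the default policy ID.

.. code-block:: python

    # A small sketch of the SampleBatch container; the data below is made up.
    import numpy as np
    from ray.rllib.policy.sample_batch import SampleBatch

    batch = SampleBatch({
        SampleBatch.OBS: np.zeros((3, 4), dtype=np.float32),
        SampleBatch.ACTIONS: np.array([0, 1, 0]),
        SampleBatch.REWARDS: np.array([1.0, 0.5, 2.0]),
    })

    print(len(batch))         # 3 timesteps
    print(batch.env_steps())  # 3 (same as len() for single-agent batches)
    print(batch.columns([SampleBatch.REWARDS]))  # [array([1. , 0.5, 2. ])]

    # Wrap into a MultiAgentBatch, keyed by the default policy ID.
    ma_batch = batch.as_multi_agent()
    print(ma_batch.agent_steps())  # 3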