Loggers (tune.logger)

Tune has default loggers for TensorBoard, CSV, and JSON formats. By default, Tune only logs the result dictionaries returned from the training function.

If you need to log something lower level like model weights or gradients, see Trainable Logging.

Note

Tune’s per-trial Logger classes have been deprecated. They can still be used, but we encourage you to use our new interface with the LoggerCallback class instead.

Custom Loggers

You can create a custom logger by subclassing the LoggerCallback interface:

from typing import Dict, List

import json
import os

from ray.tune.logger import LoggerCallback


class CustomLoggerCallback(LoggerCallback):
    """Custom logger interface"""

    def __init__(self, filename: str = "log.txt"):
        self._trial_files = {}
        self._filename = filename

    def log_trial_start(self, trial: "Trial"):
        trial_logfile = os.path.join(trial.logdir, self._filename)
        self._trial_files[trial] = open(trial_logfile, "at")

    def log_trial_result(self, iteration: int, trial: "Trial", result: Dict):
        if trial in self._trial_files:
            # Write one JSON document per line so the file stays parseable.
            self._trial_files[trial].write(json.dumps(result) + "\n")

    def log_trial_end(self, trial: "Trial", failed: bool = False):
        # Close the file on any trial end, whether it succeeded or failed.
        if trial in self._trial_files:
            self._trial_files[trial].close()
            del self._trial_files[trial]

You can then pass in your own logger as follows:

from ray import tune

tune.run(
    MyTrainableClass,
    name="experiment_name",
    callbacks=[CustomLoggerCallback("log_test.txt")]
)

By default, Ray Tune creates JSON, CSV, and TensorBoardX logger callbacks if you don’t pass them yourself. You can disable this behavior by setting the TUNE_DISABLE_AUTO_CALLBACK_LOGGERS environment variable to "1".
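As a minimal sketch, the variable can also be set from Python instead of the shell. This assumes Tune reads the variable in the driver process when the run starts, so the assignment has to happen before tune.run is called:

```python
import os

# Assumption: this runs in the driver process, before tune.run() is called.
# The value is the string "1", not the integer 1.
os.environ["TUNE_DISABLE_AUTO_CALLBACK_LOGGERS"] = "1"
```

With the default loggers disabled, only the callbacks you pass explicitly through callbacks=[...] are used.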

An example of creating a custom logger can be found in logging_example.
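To check the file handling of a logger like the one above without starting a Tune run, you can call its hooks by hand. The sketch below is not Ray code: FakeTrial and FileLogger are hypothetical stand-ins that only mirror the hook names of the LoggerCallback interface, and the logger writes one JSON document per line.

```python
import json
import os
import tempfile


class FakeTrial:
    """Hypothetical stand-in for a Tune Trial; only `logdir` is provided."""

    def __init__(self, logdir: str):
        self.logdir = logdir


class FileLogger:
    """Mirrors the LoggerCallback hook names, without depending on Ray."""

    def __init__(self, filename: str = "log.txt"):
        self._trial_files = {}
        self._filename = filename

    def log_trial_start(self, trial):
        path = os.path.join(trial.logdir, self._filename)
        self._trial_files[trial] = open(path, "at")

    def log_trial_result(self, iteration, trial, result):
        if trial in self._trial_files:
            # One JSON document per line keeps the file easy to parse.
            self._trial_files[trial].write(json.dumps(result) + "\n")

    def log_trial_end(self, trial, failed=False):
        if trial in self._trial_files:
            # Remove the handle from the dict and close it.
            self._trial_files.pop(trial).close()


def run_demo():
    """Drive the hooks in order and read the log file back."""
    with tempfile.TemporaryDirectory() as logdir:
        trial = FakeTrial(logdir)
        logger = FileLogger("log_test.txt")
        logger.log_trial_start(trial)
        for i in range(3):
            result = {"training_iteration": i + 1, "loss": 1.0 / (i + 1)}
            logger.log_trial_result(i, trial, result)
        logger.log_trial_end(trial)
        with open(os.path.join(logdir, "log_test.txt")) as f:
            return [json.loads(line) for line in f]


records = run_demo()
```

Here records holds the three result dictionaries in the order they were logged.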

Trainable Logging

By default, Tune only logs the training result dictionaries from your Trainable. However, you may want to visualize the model weights or the model graph, or use a custom logging library that requires multi-process logging. For example, you may want to do this if you are logging images to TensorBoard.

You can do this in the trainable, as shown below:

Tip

Make sure that any logging calls or objects stay within the scope of the Trainable. Otherwise, you may see pickling or serialization errors, or inconsistent logs.

Function API:

Here, library stands for whatever third-party logging library you are using.

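def trainable(config):
    # Look up the ID of the trial this function is running in.
    trial_id = tune.get_trial_id()
    library.init(
        name=trial_id,
        id=trial_id,
        resume=trial_id,
        reinit=True,
        allow_val_change=True)
    library.set_log_path(tune.get_trial_dir())

    for step in range(100):
        results = ...  # dict of metrics computed for this step
        library.log_model(...)
        library.log(results, step=step)
        # tune.report takes metrics as keyword arguments.
        tune.report(**results)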

Class API:

class CustomLogging(tune.Trainable):
    def setup(self, config):
        trial_id = self.trial_id
        library.init(
            name=trial_id,
            id=trial_id,
            resume=trial_id,
            reinit=True,
            allow_val_change=True)
        library.set_log_path(self.logdir)

    def step(self):
        library.log_model(...)

    def log_result(self, result):
        res_dict = {
            str(k): v
            for k, v in result.items()
            if (v and "config" not in k and not isinstance(v, str))
        }
        step = result["training_iteration"]
        library.log(res_dict, step=step)

Use self.logdir (only for Class API) or tune.get_trial_dir() (only for Function API) for the trial log directory.
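The dictionary comprehension in log_result above drops more than it may appear to. The sketch below applies the same condition to a made-up result dictionary (the keys and values in sample are illustrative, not taken from a real run) to show which entries survive:

```python
from typing import Any, Dict


def filter_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the same filter as log_result in the Class API example."""
    return {
        str(k): v
        for k, v in result.items()
        if (v and "config" not in k and not isinstance(v, str))
    }


# Illustrative values only.
sample = {
    "training_iteration": 3,
    "mean_loss": 0.25,
    "accuracy": 0.0,          # falsy, so it is dropped
    "done": False,            # falsy, so it is dropped
    "trial_id": "abc123",     # string value, so it is dropped
    "config": {"lr": 0.01},   # key contains "config", so it is dropped
}

filtered = filter_result(sample)
```

Because the condition tests the truthiness of v, metrics that are exactly 0 or False are skipped along with None. If zero is a meaningful value for your metrics, test v is not None instead.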

In the distributed case, these logs are synced back to the driver under your logger path. This lets you visualize and analyze the logs of all distributed training workers on a single machine.

Viskit

Tune automatically integrates with Viskit via the CSVLoggerCallback outputs. To use Viskit (you may have to install some dependencies), run:

$ git clone https://github.com/rll/rllab.git
$ python rllab/rllab/viskit/frontend.py ~/ray_results/my_experiment

Irrelevant metrics (like timing stats) can be disabled in the panel on the left so that only the relevant ones (like accuracy and loss) are shown.

[Figure: Tune results displayed in Viskit]

TBXLogger

class ray.tune.logger.TBXLoggerCallback[source]

TensorBoardX Logger.

Note that hparams will be written only after a trial has terminated. This logger automatically flattens nested dicts to show on TensorBoard:

{"a": {"b": 1, "c": 2}} -> {"a/b": 1, "a/c": 2}
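The flattening rule can be sketched in a few lines of plain Python. The flatten function below is a hypothetical helper written for illustration, not Tune's own implementation; it only assumes the "/" delimiter shown above.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], delimiter: str = "/",
            prefix: str = "") -> Dict[str, Any]:
    """Recursively join nested dict keys with `delimiter`."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        path = f"{prefix}{delimiter}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend into the sub-dict, carrying the path built so far.
            flat.update(flatten(value, delimiter, path))
        else:
            flat[path] = value
    return flat


flat_result = flatten({"a": {"b": 1, "c": 2}})
```

For the input above, flat_result is {"a/b": 1, "a/c": 2}; deeper nesting simply produces longer paths such as "x/y/z".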

JsonLogger

class ray.tune.logger.JsonLoggerCallback[source]

Logs trial results in JSON format.

Also writes to a results file and a params.json file when results or configurations are updated. Experiments must be executed with the JsonLoggerCallback to be compatible with the ExperimentAnalysis tool.

CSVLogger

class ray.tune.logger.CSVLoggerCallback[source]

Logs results to progress.csv under the trial directory.

Automatically flattens nested dicts in the result dict before writing to CSV:

{"a": {"b": 1, "c": 2}} -> {"a/b": 1, "a/c": 2}
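Once flattened, each result maps directly onto one CSV row, with the slash-separated keys as column names. The sketch below uses only the standard csv module and made-up rows; it illustrates the resulting layout and is not the CSVLoggerCallback implementation.

```python
import csv
import io
from typing import Any, Dict, List


def to_csv(rows: List[Dict[str, Any]]) -> str:
    """Write already-flattened result dicts as CSV text."""
    buffer = io.StringIO()
    # Column order follows the keys of the first row.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative, already-flattened results.
rows = [
    {"training_iteration": 1, "a/b": 1, "a/c": 2},
    {"training_iteration": 2, "a/b": 3, "a/c": 4},
]
csv_text = to_csv(rows)
```

The header line of csv_text is training_iteration,a/b,a/c, followed by one line per result.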

MLFlowLogger

Tune also provides a default logger for MLflow. You can install MLflow via pip install mlflow. For usage, see the MLflow tutorial.

WandbLogger

Tune also provides a default logger for Weights & Biases. You can install Wandb via pip install wandb. For usage, see the Weights & Biases tutorial.

LoggerCallback

class ray.tune.logger.LoggerCallback[source]

Base class for experiment-level logger callbacks.

This base class defines a general interface for logging events, like trial starts, restores, ends, checkpoint saves, and receiving trial results.

Callbacks implementing this interface should clean up their logging utilities when a trial terminates, that is, when log_trial_end is called. This includes, for example, closing open files.

log_trial_start(trial: Trial)[source]

Handle logging when a trial starts.

Parameters

trial (Trial) – Trial object.

log_trial_restore(trial: Trial)[source]

Handle logging when a trial restores.

Parameters

trial (Trial) – Trial object.

log_trial_save(trial: Trial)[source]

Handle logging when a trial saves a checkpoint.

Parameters

trial (Trial) – Trial object.

log_trial_result(iteration: int, trial: Trial, result: Dict)[source]

Handle logging when a trial reports a result.

Parameters
  • iteration (int) – Number of iterations of the tuning loop.

  • trial (Trial) – Trial object.

  • result (dict) – Result dictionary.

log_trial_end(trial: Trial, failed: bool = False)[source]

Handle logging when a trial ends.

Parameters
  • trial (Trial) – Trial object.

  • failed (bool) – True if the trial failed (e.g. when it raised an exception), False if it finished gracefully.