External library integrations (tune.integration)#
Comet (tune.integration.comet)#
- class ray.air.integrations.comet.CometLoggerCallback(online: bool = True, tags: Optional[List[str]] = None, save_checkpoints: bool = False, **experiment_kwargs)[source]
CometLoggerCallback for logging Tune results to Comet.
Comet (https://comet.ml/site/) is a tool to manage and optimize the entire ML lifecycle, from experiment tracking, model optimization and dataset versioning to model production monitoring.
This Ray Tune LoggerCallback sends metrics and parameters to Comet for tracking.
To use the CometLoggerCallback, you must first install Comet via

pip install comet_ml

Then set the following environment variable:

export COMET_API_KEY=<Your API Key>

Alternatively, you can pass your API key as an argument to the CometLoggerCallback constructor:

CometLoggerCallback(api_key=<Your API Key>)
- Parameters
online – Whether to make use of an Online or Offline Experiment. Defaults to True.
tags – Tags to add to the logged Experiment. Defaults to None.
save_checkpoints – If True, model checkpoints will be saved to Comet ML as artifacts. Defaults to False.
**experiment_kwargs – Other keyword arguments will be passed to the constructor for comet_ml.Experiment (or OfflineExperiment if online=False).
Please consult the Comet ML documentation for more information on the Experiment and OfflineExperiment classes: https://comet.ml/site/
Example:
from ray import tune
from ray.air.integrations.comet import CometLoggerCallback

tune.run(
    train,
    config=config,
    callbacks=[CometLoggerCallback(
        True,
        ['tag1', 'tag2'],
        workspace='my_workspace',
        project_name='my_project_name')])
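Setting online=False logs to a comet_ml.OfflineExperiment instead; any extra keyword arguments are forwarded to its constructor. A minimal sketch, assuming OfflineExperiment accepts the offline_directory keyword (an assumption about the comet_ml API, not part of the callback itself):

from ray import tune
from ray.air.integrations.comet import CometLoggerCallback

# Log to an offline Comet experiment; extra kwargs are forwarded to
# comet_ml.OfflineExperiment (offline_directory is assumed to be supported).
tune.run(
    train,
    config=config,
    callbacks=[CometLoggerCallback(
        online=False,
        tags=['offline_run'],
        offline_directory='/tmp/comet_offline')])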
Keras (tune.integration.keras)#
- class ray.tune.integration.keras.TuneReportCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, on: Union[str, List[str]] = 'epoch_end')[source]#
Keras to Ray Tune reporting callback
Reports metrics to Ray Tune.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to Keras, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to Keras. If this is None, all Keras logs will be reported.
on – When to trigger metric reporting. Must be one of the Keras event hooks (less the on_ prefix), e.g. “train_start” or “predict_end”. Defaults to “epoch_end”.
Example:
from ray.tune.integration.keras import TuneReportCallback

# Report accuracy to Tune after each epoch:
model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=0,
    validation_data=(x_test, y_test),
    callbacks=[TuneReportCallback(
        {"mean_accuracy": "accuracy"}, on="epoch_end")])
- class ray.tune.integration.keras.TuneReportCheckpointCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, filename: str = 'checkpoint', frequency: Union[int, List[int]] = 1, on: Union[str, List[str]] = 'epoch_end')[source]#
Keras report and checkpoint callback
Saves checkpoints after each validation step. Also reports metrics to Tune, which is needed for checkpoint registration.
Use this callback to register saved checkpoints with Ray Tune. This means that checkpoints will be managed by the CheckpointManager and can be used for advanced scheduling and search algorithms, like Population Based Training.
The tf.keras.callbacks.ModelCheckpoint callback also saves checkpoints, but doesn’t register them with Ray Tune.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to Keras, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to Keras. If this is None, all Keras logs will be reported.
filename – Filename of the checkpoint within the checkpoint directory. Defaults to “checkpoint”.
frequency – Checkpoint frequency. If this is an integer n, checkpoints are saved every nth time each hook is called. If this is a list, it specifies the checkpoint frequency for each hook individually (see the sketch after the example below).
on – When to trigger checkpoint creations. Must be one of the Keras event hooks (less the on_ prefix), e.g. “train_start” or “predict_end”. Defaults to “epoch_end”.
Example:
from ray.tune.integration.keras import TuneReportCheckpointCallback

# Save checkpoint and report accuracy to Tune after each epoch:
model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=0,
    validation_data=(x_test, y_test),
    callbacks=[TuneReportCheckpointCallback(
        metrics={"mean_accuracy": "accuracy"},
        filename="model",
        on="epoch_end")])
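Passing lists to on and frequency lets each hook checkpoint at its own rate. A minimal sketch, under the assumption that the two lists are paired positionally; the hook names and frequencies here are illustrative:

from ray.tune.integration.keras import TuneReportCheckpointCallback

# Assumed positional pairing: checkpoint every 500th batch and after every epoch.
checkpoint_callback = TuneReportCheckpointCallback(
    metrics={"mean_accuracy": "accuracy"},
    filename="model",
    on=["batch_end", "epoch_end"],
    frequency=[500, 1])

model.fit(
    x_train,
    y_train,
    validation_data=(x_test, y_test),
    callbacks=[checkpoint_callback])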
MLflow (tune.integration.mlflow)#
- class ray.air.integrations.mlflow.MLflowLoggerCallback(tracking_uri: Optional[str] = None, registry_uri: Optional[str] = None, experiment_name: Optional[str] = None, tags: Optional[Dict] = None, save_artifact: bool = False)[source]
MLflow Logger to automatically log Tune results and config to MLflow.
MLflow (https://mlflow.org) Tracking is an open source library for recording and querying experiments. This Ray Tune LoggerCallback sends information (config parameters, training results & metrics, and artifacts) to MLflow for automatic experiment tracking.
- Parameters
tracking_uri – The tracking URI for where to manage experiments and runs. This can either be a local file path or a remote server. This arg gets passed directly to mlflow initialization. When using Tune in a multi-node setting, make sure to set this to a remote server and not a local file path.
registry_uri – The registry URI that gets passed directly to mlflow initialization.
experiment_name – The experiment name to use for this Tune run. If an experiment with this name already exists in MLflow, it will be reused. If not, a new experiment will be created with that name.
tags – An optional dictionary of string keys and values to set as tags on the run.
save_artifact – If set to True, automatically save the entire contents of the Tune local_dir as an artifact to the corresponding run in MLflow.
Example:
from ray.air.integrations.mlflow import MLflowLoggerCallback

tags = {"user_name": "John", "git_commit_hash": "abc123"}

tune.run(
    train_fn,
    config={
        # define search space here
        "parameter_1": tune.choice([1, 2, 3]),
        "parameter_2": tune.choice([4, 5, 6]),
    },
    callbacks=[MLflowLoggerCallback(
        experiment_name="experiment1",
        tags=tags,
        save_artifact=True)])
- ray.tune.integration.mlflow.mlflow_mixin(func: Callable)[source]#
MLflow (https://mlflow.org) Tracking is an open source library for recording and querying experiments. This Ray Tune Trainable mixin helps initialize the MLflow API for use with the Trainable class or the @mlflow_mixin function API. This mixin automatically configures MLflow and creates a run in the same process as each Tune trial. You can then use the mlflow API inside your training function and it will automatically get reported to the correct run.
For basic usage, just prepend your training function with the @mlflow_mixin decorator:

import mlflow

from ray.tune.integration.mlflow import mlflow_mixin

@mlflow_mixin
def train_fn(config):
    ...
    mlflow.log_metric(...)
You can also use MLflow’s autologging feature if using a training framework like PyTorch Lightning, XGBoost, etc. More information can be found here (https://mlflow.org/docs/latest/tracking.html#automatic-logging).
import mlflow
import xgboost as xgb

from ray.tune.integration.mlflow import mlflow_mixin

@mlflow_mixin
def train_fn(config):
    mlflow.autolog()
    xgboost_results = xgb.train(config, ...)
The MLflow configuration is done by passing an mlflow key to the config parameter of tune.Tuner() (see example below).
The content of the mlflow config entry is used to configure MLflow. Here are the keys you can pass in to this config entry:
- Parameters
tracking_uri – The tracking URI for MLflow tracking. If using Tune in a multi-node setting, make sure to use a remote server for tracking.
experiment_id – The ID of an already created MLflow experiment. All logs from all trials in tune.Tuner() will be reported to this experiment. If this is not provided or the experiment with this ID does not exist, you must provide an experiment_name. This parameter takes precedence over experiment_name.
experiment_name – The name of an already existing MLflow experiment. All logs from all trials in tune.Tuner() will be reported to this experiment. If this is not provided, you must provide a valid experiment_id.
token – A token to use for HTTP authentication when logging to a remote tracking server. This is useful when you want to log to a Databricks server, for example. This value will be used to set the MLFLOW_TRACKING_TOKEN environment variable on all the remote training processes (see the sketch after the example below).
Example:
from ray import tune
from ray.tune.integration.mlflow import mlflow_mixin

import mlflow

# Create the MLflow experiment.
mlflow.create_experiment("my_experiment")

@mlflow_mixin
def train_fn(config):
    for i in range(10):
        loss = config["a"] + config["b"]
        mlflow.log_metric(key="loss", value=loss)
        tune.report(loss=loss, done=True)

tuner = tune.Tuner(
    train_fn,
    param_space={
        # define search space here
        "a": tune.choice([1, 2, 3]),
        "b": tune.choice([4, 5, 6]),
        # mlflow configuration
        "mlflow": {
            "experiment_name": "my_experiment",
            "tracking_uri": mlflow.get_tracking_uri()
        }
    })
tuner.fit()
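For a remote, token-authenticated tracking server, the same mlflow config entry can carry the connection details described above. A minimal sketch; the tracking URI and token are placeholders, and the named experiment is assumed to already exist on that server:

from ray import tune
from ray.tune.integration.mlflow import mlflow_mixin

import mlflow

@mlflow_mixin
def train_fn(config):
    loss = config["a"]
    mlflow.log_metric(key="loss", value=loss)
    tune.report(loss=loss, done=True)

tuner = tune.Tuner(
    train_fn,
    param_space={
        "a": tune.choice([1, 2, 3]),
        # mlflow configuration for a remote server (placeholder values;
        # "my_experiment" is assumed to already exist on the server)
        "mlflow": {
            "experiment_name": "my_experiment",
            "tracking_uri": "https://my-tracking-server:5000",
            "token": "MY_TRACKING_TOKEN",
        },
    })
tuner.fit()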
MXNet (tune.integration.mxnet)#
- class ray.tune.integration.mxnet.TuneReportCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None)[source]#
MXNet to Ray Tune reporting callback
Reports metrics to Ray Tune.
This has to be passed to MXNet as the eval_end_callback.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to MXNet, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to MXNet.
Example:
from ray.tune.integration.mxnet import TuneReportCallback

# mlp_model is an MXNet model
mlp_model.fit(
    train_iter,
    # ...
    eval_metric="acc",
    eval_end_callback=TuneReportCallback({
        "mean_accuracy": "accuracy"
    }))
- class ray.tune.integration.mxnet.TuneCheckpointCallback(filename: str = 'checkpoint', frequency: int = 1)[source]#
MXNet checkpoint callback
Saves checkpoints after each epoch.
This has to be passed to the epoch_end_callback of the MXNet model.
Checkpoints are currently not registered if no tune.report() call is made afterwards. You have to use this callback in conjunction with the TuneReportCallback for it to work.
- Parameters
filename – Filename of the checkpoint within the checkpoint directory. Defaults to “checkpoint”.
frequency – Integer indicating how often checkpoints should be saved.
Example:
from ray.tune.integration.mxnet import TuneReportCallback, TuneCheckpointCallback

# mlp_model is an MXNet model
mlp_model.fit(
    train_iter,
    # ...
    eval_metric="acc",
    eval_end_callback=TuneReportCallback({
        "mean_accuracy": "accuracy"
    }),
    epoch_end_callback=TuneCheckpointCallback(
        filename="mxnet_cp",
        frequency=3))
PyTorch Lightning (tune.integration.pytorch_lightning)#
- class ray.tune.integration.pytorch_lightning.TuneReportCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, on: Union[str, List[str]] = 'validation_end')[source]#
PyTorch Lightning to Ray Tune reporting callback
Reports metrics to Ray Tune.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to PyTorch Lightning, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to PyTorch Lightning.
on – When to trigger metric reporting. Must be one of the PyTorch Lightning event hooks (less the on_ prefix), e.g. “train_batch_start” or “train_end”. Defaults to “validation_end”.
Example:
import pytorch_lightning as pl
from ray.tune.integration.pytorch_lightning import TuneReportCallback

# Report loss and accuracy to Tune after each validation epoch:
trainer = pl.Trainer(callbacks=[TuneReportCallback(
    ["val_loss", "val_acc"], on="validation_end")])

# Same as above, but report as `loss` and `mean_accuracy`:
trainer = pl.Trainer(callbacks=[TuneReportCallback(
    {"loss": "val_loss", "mean_accuracy": "val_acc"},
    on="validation_end")])
PublicAPI: This API is stable across Ray releases.
- class ray.tune.integration.pytorch_lightning.TuneReportCheckpointCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, filename: str = 'checkpoint', on: Union[str, List[str]] = 'validation_end')[source]#
PyTorch Lightning report and checkpoint callback
Saves checkpoints after each validation step. Also reports metrics to Tune, which is needed for checkpoint registration.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to PyTorch Lightning, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to PyTorch Lightning.
filename – Filename of the checkpoint within the checkpoint directory. Defaults to “checkpoint”.
on – When to trigger checkpoint creations. Must be one of the PyTorch Lightning event hooks (less the on_ prefix), e.g. “train_batch_start” or “train_end”. Defaults to “validation_end”.
Example:
import pytorch_lightning as pl
from ray.tune.integration.pytorch_lightning import (
    TuneReportCheckpointCallback)

# Save a checkpoint and report metrics to Tune after each
# validation epoch.
trainer = pl.Trainer(callbacks=[TuneReportCheckpointCallback(
    metrics={"loss": "val_loss", "mean_accuracy": "val_acc"},
    filename="trainer.ckpt",
    on="validation_end")])
PublicAPI: This API is stable across Ray releases.
Weights and Biases (tune.integration.wandb)#
- class ray.air.integrations.wandb.WandbLoggerCallback(project: Optional[str] = None, group: Optional[str] = None, api_key_file: Optional[str] = None, api_key: Optional[str] = None, excludes: Optional[List[str]] = None, log_config: bool = False, save_checkpoints: bool = False, **kwargs)[source]
Weights and Biases (https://www.wandb.ai/) is a tool for experiment tracking, model optimization, and dataset versioning. This Ray Tune LoggerCallback sends metrics to Wandb for automatic tracking and visualization.
- Parameters
project – Name of the Wandb project. Mandatory.
group – Name of the Wandb group. Defaults to the trainable name.
api_key_file – Path to file containing the Wandb API KEY. This file only needs to be present on the node running the Tune script if using the WandbLogger.
api_key – Wandb API Key. Alternative to setting api_key_file.
excludes – List of metrics that should be excluded from the log.
log_config – Boolean indicating if the config parameter of the results dict should be logged. This makes sense if parameters will change during training, e.g. with PopulationBasedTraining. Defaults to False.
save_checkpoints – If True, model checkpoints will be saved to Wandb as artifacts. Defaults to False.
**kwargs – The keyword arguments will be passed to wandb.init().
Wandb’s group, run_id and run_name are automatically selected by Tune, but can be overwritten by filling out the respective configuration values.
Please see here for all other valid configuration settings: https://docs.wandb.ai/library/init
Example:
from ray.tune.logger import DEFAULT_LOGGERS
from ray.air.integrations.wandb import WandbLoggerCallback

tune.run(
    train_fn,
    config={
        # define search space here
        "parameter_1": tune.choice([1, 2, 3]),
        "parameter_2": tune.choice([4, 5, 6]),
    },
    callbacks=[WandbLoggerCallback(
        project="Optimization_Project",
        api_key_file="/path/to/file",
        log_config=True)])
- ray.air.integrations.wandb.setup_wandb(config: Optional[Dict] = None, rank_zero_only: bool = True, **kwargs) -> None[source]#
Set up a Weights & Biases session.
This function can be used to initialize a Weights & Biases session in a (distributed) training or tuning run.
By default, the run ID is the trial ID, the run name is the trial name, and the run group is the experiment name. These settings can be overwritten by passing the respective arguments as kwargs, which will be passed to wandb.init().
In distributed training with Ray Train, only the zero-rank worker will initialize wandb. All other workers will return a disabled run object, so that logging is not duplicated in a distributed run. This can be disabled by passing rank_zero_only=False, which will then initialize wandb in every training worker.
The config argument will be passed to Weights and Biases and will be logged as the run configuration. If wandb-specific settings are found, they will be used to initialize the session. These settings can be:
api_key_file: Path to a locally available file containing a W&B API key
api_key: API key to authenticate with W&B
If no API information is found in the config, wandb will try to authenticate using locally stored credentials, created for instance by running wandb login.
All other keys found in the wandb config parameter will be passed to wandb.init(). If the same keys are present in multiple locations, the kwargs passed to setup_wandb() will take precedence over those passed as config keys.
- Parameters
config – Configuration dict to be logged to Weights and Biases. Can contain arguments for wandb.init() as well as authentication information.
rank_zero_only – If True, will return an initialized session only for the rank 0 worker in distributed training. If False, will initialize a session for all workers.
kwargs – Passed to wandb.init().
Example:
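A minimal sketch of how a session might be set up inside a Tune trainable; the config keys and metric values below are illustrative, and authentication keys such as api_key_file can be carried in the config as described above:

from ray import tune
from ray.air.integrations.wandb import setup_wandb

def train_fn(config):
    # Initialize a W&B run for this trial (only rank 0 in distributed training).
    wandb = setup_wandb(config)
    for i in range(10):
        loss = config["a"] + config["b"]  # illustrative metric
        wandb.log({"loss": loss})
        tune.report(loss=loss)

tuner = tune.Tuner(
    train_fn,
    param_space={
        "a": tune.choice([1, 2, 3]),
        "b": tune.choice([4, 5, 6]),
    })
tuner.fit()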
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
XGBoost (tune.integration.xgboost)#
- class ray.tune.integration.xgboost.TuneReportCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, results_postprocessing_fn: Optional[Callable[[Dict[str, Union[float, List[float]]]], Dict[str, float]]] = None)[source]#
XGBoost to Ray Tune reporting callback
Reports metrics to Ray Tune.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to XGBoost, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to XGBoost. If this is None, all metrics will be reported to Tune under their default names as obtained from XGBoost.
results_postprocessing_fn – An optional Callable that takes in the dict that will be reported to Tune (after it has been flattened) and returns a modified dict that will be reported instead. Can be used, e.g., to average results across CV folds when using xgboost.cv (see the sketch after the example below).
Example:
import xgboost as xgb

from ray.tune.integration.xgboost import TuneReportCallback

config = {
    # ...
    "eval_metric": ["auc", "logloss"]
}

# Report only log loss to Tune after each validation epoch:
bst = xgb.train(
    config,
    train_set,
    evals=[(test_set, "eval")],
    verbose_eval=False,
    callbacks=[TuneReportCallback({"loss": "eval-logloss"})])
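results_postprocessing_fn can reshape the flattened results dict before it reaches Tune. A minimal sketch, assuming list-valued entries hold per-fold results from xgboost.cv; the helper name average_cv_folds is illustrative:

from ray.tune.integration.xgboost import TuneReportCallback

# Hypothetical helper: average any list-valued metrics (e.g. per-CV-fold
# results) into a single float per key before reporting to Tune.
def average_cv_folds(results):
    return {
        key: sum(value) / len(value) if isinstance(value, list) else value
        for key, value in results.items()
    }

callback = TuneReportCallback(results_postprocessing_fn=average_cv_folds)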
- class ray.tune.integration.xgboost.TuneReportCheckpointCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, filename: str = 'checkpoint', frequency: int = 5, results_postprocessing_fn: Optional[Callable[[Dict[str, Union[float, List[float]]]], float]] = None)[source]#
XGBoost report and checkpoint callback
Saves checkpoints after each validation step. Also reports metrics to Tune, which is needed for checkpoint registration.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to XGBoost, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to XGBoost. If this is None, all metrics will be reported to Tune under their default names as obtained from XGBoost.
filename – Filename of the checkpoint within the checkpoint directory. Defaults to “checkpoint”.
frequency – How often to save checkpoints. By default, a checkpoint is saved every five iterations.
results_postprocessing_fn – An optional Callable that takes in the dict that will be reported to Tune (after it has been flattened) and returns a modified dict that will be reported instead. Can be used, e.g., to average results across CV folds when using xgboost.cv.
Example:
import xgboost as xgb

from ray.tune.integration.xgboost import TuneReportCheckpointCallback

config = {
    # ...
    "eval_metric": ["auc", "logloss"]
}

# Report only log loss to Tune after each validation epoch.
# Save model as `xgboost.mdl`.
bst = xgb.train(
    config,
    train_set,
    evals=[(test_set, "eval")],
    verbose_eval=False,
    callbacks=[TuneReportCheckpointCallback(
        {"loss": "eval-logloss"}, "xgboost.mdl")])
LightGBM (tune.integration.lightgbm)#
- class ray.tune.integration.lightgbm.TuneReportCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, results_postprocessing_fn: Optional[Callable[[Dict[str, Union[float, List[float]]]], Dict[str, float]]] = None)[source]#
Create a callback that reports metrics to Ray Tune.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to LightGBM, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to LightGBM. If this is None, all metrics will be reported to Tune under their default names as obtained from LightGBM.
results_postprocessing_fn – An optional Callable that takes in the dict that will be reported to Tune (after it has been flattened) and returns a modified dict that will be reported instead.
Example:
import lightgbm
from ray.tune.integration.lightgbm import TuneReportCallback

config = {
    # ...
    "metric": ["binary_logloss", "binary_error"],
}

# Report only log loss to Tune after each validation epoch:
bst = lightgbm.train(
    config,
    train_set,
    valid_sets=[test_set],
    valid_names=["eval"],
    verbose_eval=False,
    callbacks=[TuneReportCallback({"loss": "eval-binary_logloss"})])
- class ray.tune.integration.lightgbm.TuneReportCheckpointCallback(metrics: Optional[Union[str, List[str], Dict[str, str]]] = None, filename: str = 'checkpoint', frequency: int = 5, results_postprocessing_fn: Optional[Callable[[Dict[str, Union[float, List[float]]]], Dict[str, float]]] = None)[source]#
Creates a callback that reports metrics and checkpoints model.
Saves checkpoints after each validation step. Also reports metrics to Tune, which is needed for checkpoint registration.
- Parameters
metrics – Metrics to report to Tune. If this is a list, each item describes the metric key reported to LightGBM, and it will be reported under the same name to Tune. If this is a dict, each key will be the name reported to Tune and the respective value will be the metric key reported to LightGBM. If this is None, all metrics will be reported to Tune under their default names as obtained from LightGBM.
filename – Filename of the checkpoint within the checkpoint directory. Defaults to “checkpoint”.
frequency – How often to save checkpoints. By default, a checkpoint is saved every five iterations.
results_postprocessing_fn – An optional Callable that takes in the dict that will be reported to Tune (after it has been flattened) and returns a modified dict that will be reported instead.
Example:
import lightgbm
from ray.tune.integration.lightgbm import (
    TuneReportCheckpointCallback
)

config = {
    # ...
    "metric": ["binary_logloss", "binary_error"],
}

# Report only log loss to Tune after each validation epoch.
# Save model as `lightgbm.mdl`.
bst = lightgbm.train(
    config,
    train_set,
    valid_sets=[test_set],
    valid_names=["eval"],
    verbose_eval=False,
    callbacks=[TuneReportCheckpointCallback(
        {"loss": "eval-binary_logloss"}, "lightgbm.mdl")])