{ "cells": [ { "cell_type": "markdown", "id": "f37e8a9f", "metadata": {}, "source": [ "# Logging results and uploading models to Weights & Biases\n", "In this example, we train a simple XGBoost model and log the training\n", "results to Weights & Biases. We also save the resulting model checkpoints\n", "as artifacts.\n", "\n", "There are two ways to achieve this:\n", "\n", "1. Automatically using the `ray.air.integrations.wandb.WandbLoggerCallback`\n", "2. Manually using the `wandb` API\n", "\n", "This tutorial will walk you through both options." ] }, { "cell_type": "markdown", "id": "27d04c97", "metadata": {}, "source": [ "Let's start with installing our dependencies:" ] }, { "cell_type": "code", "execution_count": 1, "id": "4e697e5d", "metadata": {}, "outputs": [], "source": [ "!pip install -qU \"ray[tune]\" sklearn xgboost_ray wandb" ] }, { "cell_type": "markdown", "id": "3096e7c9", "metadata": {}, "source": [ "Then we need some imports:" ] }, { "cell_type": "code", "execution_count": 2, "id": "9c286701", "metadata": {}, "outputs": [], "source": [ "import ray\n", "\n", "from ray.air.config import RunConfig, ScalingConfig\n", "from ray.air.result import Result\n", "from ray.air.integrations.wandb import WandbLoggerCallback" ] }, { "cell_type": "markdown", "id": "2efa1564", "metadata": {}, "source": [ "We define a simple function that returns our training dataset as a Ray Dataset:\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "a63ebd10", "metadata": {}, "outputs": [], "source": [ "def get_train_dataset() -> ray.data.Dataset:\n", " dataset = ray.data.read_csv(\"s3://anonymous@air-example-data/breast_cancer.csv\")\n", " return dataset" ] }, { "cell_type": "markdown", "id": "5fc1ca73", "metadata": {}, "source": [ "And that's the common parts. We now dive into the two options to interact with Weights and Biases." ] }, { "cell_type": "markdown", "id": "d07cf41f", "metadata": {}, "source": [ "## Using the WandbLoggerCallback\n", "\n", "The WandbLoggerCallback does all the logging and reporting for you. It is especially useful when you use an out-of-the-box trainer like `XGBoostTrainer`. In these trainers, you don't define your own training loop, so using the AIR W&B callback is the best way to log your results to Weights and Biases.\n", "\n", "First we define a simple training function.\n", "\n", "All the magic happens within the `WandbLoggerCallback`:\n", "\n", "```python\n", "WandbLoggerCallback(\n", " project=wandb_project,\n", " save_checkpoints=True,\n", ")\n", "```\n", "\n", "It will automatically log all results to Weights & Biases and upload the checkpoints as artifacts. It assumes you're logged in into Wandb via an API key or `wandb login`." ] }, { "cell_type": "code", "execution_count": 4, "id": "52edfde0", "metadata": {}, "outputs": [], "source": [ "from ray.train.xgboost import XGBoostTrainer\n", "\n", "\n", "def train_model_xgboost(train_dataset: ray.data.Dataset, wandb_project: str) -> Result:\n", " \"\"\"Train a simple XGBoost model and return the result.\"\"\"\n", " trainer = XGBoostTrainer(\n", " scaling_config=ScalingConfig(num_workers=2),\n", " params={\"tree_method\": \"auto\"},\n", " label_column=\"target\",\n", " datasets={\"train\": train_dataset},\n", " num_boost_round=10,\n", " run_config=RunConfig(\n", " callbacks=[\n", " # This is the part needed to enable logging to Weights & Biases.\n", " # It assumes you've logged in before, e.g. 
"                WandbLoggerCallback(\n", "                    project=wandb_project,\n", "                    save_checkpoints=True,\n", "                )\n", "            ]\n", "        ),\n", "    )\n", "    result = trainer.fit()\n", "    return result" ] }, { "cell_type": "markdown", "id": "1959ce19", "metadata": {}, "source": [ "Let's kick off a run:" ] }, { "cell_type": "code", "execution_count": null, "id": "64f80d6c", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-10-28 16:28:19,325\tINFO worker.py:1524 -- Started a local Ray instance. View the dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8265 \u001b[39m\u001b[22m\n", "2022-10-28 16:28:22,993\tWARNING read_api.py:297 -- ⚠️ The number of blocks in this dataset (1) limits its parallelism to 1 concurrent tasks. This is much less than the number of available CPU slots in the cluster. Use `.repartition(n)` to increase the number of dataset blocks.\n", "2022-10-28 16:28:26,033\tINFO wandb.py:267 -- Already logged into W&B.\n" ] }, { "data": { "text/html": [], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "wandb_project = \"ray_air_example_xgboost\"\n", "\n", "train_dataset = get_train_dataset()\n", "result = train_model_xgboost(train_dataset=train_dataset, wandb_project=wandb_project)" ] }, { "cell_type": "markdown", "id": "78701c42", "metadata": {}, "source": [ "Check out your [WandB](https://wandb.ai/) project to see the results!" ] }, { "cell_type": "markdown", "id": "a215b6d4", "metadata": {}, "source": [ "## Using the `wandb` API\n", "\n", "When you define your own training loop, you sometimes want to interact with the Weights & Biases API manually. Ray AIR provides a `setup_wandb()` function that takes care of the initialization.\n", "\n", "The main benefit here is that authentication to Weights & Biases is automatically set up for you, and sensible default names are set for your runs. Of course, you can override these.\n", "\n", "Additionally, in distributed training you often want to report only the results of the rank 0 worker. This can also be done automatically using our setup.\n", "\n", "Let's define a distributed training loop. The important parts here are:\n", "\n", "    wandb = setup_wandb(config)\n", "\n", "and later\n", "\n", "    wandb.log({\"loss\": loss.item()})\n", "\n", "The call to `setup_wandb()` will set up your session, for instance calling `wandb.init()` with sensible defaults. Because we are in a distributed training setting, this will only happen for the rank 0 worker - all other workers get a mock object back, and any subsequent calls to `wandb.XXX` will be no-ops for them.\n",
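"\n", "For instance, here is a minimal sketch of that behavior (illustrative only - the `0.1` loss is a stand-in value):\n", "\n", "    wandb = setup_wandb(config)  # real W&B session on rank 0, mock object on all other ranks\n", "    wandb.log({\"loss\": 0.1})    # logged to W&B on rank 0, a no-op everywhere else\n",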
"\n", "You can then use the `wandb` API as usual:" ] }, { "cell_type": "code", "execution_count": null, "id": "154e233d", "metadata": {}, "outputs": [], "source": [ "from ray.air import session\n", "from ray.air.integrations.wandb import setup_wandb\n", "from ray.data.preprocessors import Concatenator\n", "\n", "import numpy as np\n", "\n", "\n", "import torch.optim as optim\n", "import torch.nn as nn\n", "\n", "\n", "def train_loop(config):\n", "    wandb = setup_wandb(config)\n", "\n", "    dataset = session.get_dataset_shard(\"train\")\n", "\n", "    model = nn.Linear(30, 2)\n", "\n", "    optimizer = optim.SGD(\n", "        model.parameters(),\n", "        lr=config.get(\"lr\", 0.01),\n", "    )\n", "    loss_fn = nn.CrossEntropyLoss()\n", "\n", "    for batch in dataset.iter_torch_batches(batch_size=32):\n", "        X = batch[\"data\"]\n", "        y = batch[\"target\"]\n", "\n", "        # Compute prediction error\n", "        pred = model(X)\n", "        loss = loss_fn(pred, y)\n", "\n", "        # Backpropagation\n", "        optimizer.zero_grad()\n", "        loss.backward()\n", "        optimizer.step()\n", "\n", "        session.report({\"loss\": loss.item()})\n", "        wandb.log({\"loss\": loss.item()})" ] }, { "cell_type": "markdown", "id": "9aa12feb", "metadata": {}, "source": [ "Let's define a function to kick off the training - again, we can configure the Weights & Biases settings in the config. But you could also just pass them to the setup function, like this:\n", "\n", "    setup_wandb(config, project=\"my_project\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b5ae7c8c", "metadata": {}, "outputs": [], "source": [ "from ray.train.torch import TorchTrainer\n", "\n", "\n", "def train_model_torch(train_dataset: ray.data.Dataset, wandb_project: str) -> Result:\n", "    \"\"\"Train a simple Torch model and return the result.\"\"\"\n", "    trainer = TorchTrainer(\n", "        train_loop_per_worker=train_loop,\n", "        scaling_config=ScalingConfig(num_workers=2),\n", "        train_loop_config={\"lr\": 0.01, \"wandb\": {\"project\": wandb_project}},\n", "        datasets={\"train\": train_dataset},\n", "        preprocessor=Concatenator(\"data\", dtype=np.float32, exclude=[\"target\"]),\n", "    )\n", "    result = trainer.fit()\n", "    return result" ] }, { "cell_type": "markdown", "id": "12049bcf", "metadata": {}, "source": [ "Let's kick off this run:" ] }, { "cell_type": "code", "execution_count": null, "id": "3825b35b", "metadata": {}, "outputs": [], "source": [ "wandb_project = \"ray_air_example_torch\"\n", "\n", "train_dataset = get_train_dataset()\n", "result = train_model_torch(train_dataset=train_dataset, wandb_project=wandb_project)" ] }, { "cell_type": "markdown", "id": "75fddee7", "metadata": {}, "source": [ "Check out your [WandB](https://wandb.ai/) project to see the results!" ] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all", "main_language": "python", "notebook_metadata_filter": "-all" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 5 }