{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "3b05af3b", "metadata": {}, "source": [ "(tune-comet-ref)=\n", "\n", "# Using Comet with Tune\n", "\n", "[Comet](https://www.comet.ml/site/) is a tool to manage and optimize the\n", "entire ML lifecycle, from experiment tracking, model optimization and dataset\n", "versioning to model production monitoring.\n", "\n", "```{image} /images/comet_logo_full.png\n", ":align: center\n", ":alt: Comet\n", ":height: 120px\n", ":target: https://www.comet.ml/site/\n", "```\n", "\n", "```{contents}\n", ":backlinks: none\n", ":local: true\n", "```\n", "\n", "## Example\n", "\n", "To illustrate logging your trial results to Comet, we'll define a simple training function\n", "that simulates a `loss` metric:" ] }, { "cell_type": "code", "execution_count": 1, "id": "19e3c389", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from ray import train, tune\n", "\n", "\n", "def train_function(config):\n", " for i in range(30):\n", " loss = config[\"mean\"] + config[\"sd\"] * np.random.randn()\n", " train.report({\"loss\": loss})" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6fb69a24", "metadata": {}, "source": [ "Now, given that you provide your Comet API key and your project name like so:" ] }, { "cell_type": "code", "execution_count": 2, "id": "993d5be6", "metadata": {}, "outputs": [], "source": [ "api_key = \"YOUR_COMET_API_KEY\"\n", "project_name = \"YOUR_COMET_PROJECT_NAME\"" ] }, { "cell_type": "code", "execution_count": 3, "id": "e9ce0d76", "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "# This cell is hidden from the rendered notebook. It makes the \n", "from unittest.mock import MagicMock\n", "from ray.air.integrations.comet import CometLoggerCallback\n", "\n", "CometLoggerCallback._logger_process_cls = MagicMock\n", "api_key = \"abc\"\n", "project_name = \"test\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d792a1b0", "metadata": {}, "source": [ "You can add a Comet logger by specifying the `callbacks` argument in your `RunConfig()` accordingly:" ] }, { "cell_type": "code", "execution_count": 4, "id": "dbb761e7", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-07-22 15:41:21,477\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8267\u001b[39m\u001b[22m\n", "/Users/kai/coding/ray/python/ray/tune/trainable/function_trainable.py:643: DeprecationWarning: `checkpoint_dir` in `func(config, checkpoint_dir)` is being deprecated. To save and load checkpoint in trainable functions, please use the `ray.air.session` API:\n", "\n", "from ray.air import session\n", "\n", "def train(config):\n", " # ...\n", " session.report({\"metric\": metric}, checkpoint=checkpoint)\n", "\n", "For more information please see https://docs.ray.io/en/master/ray-air/key-concepts.html#session\n", "\n", " DeprecationWarning,\n" ] }, { "data": { "text/html": [ "== Status ==
Current time: 2022-07-22 15:41:31 (running for 00:00:06.73)
Memory usage on this node: 9.9/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/4.5 GiB heap, 0.0/2.0 GiB objects
Current best trial: 5bf98_00000 with loss=1.0234101880766688 and parameters={'mean': 1, 'sd': 0.40575843135279466}
Result logdir: /Users/kai/ray_results/train_function_2022-07-22_15-41-18
Number of trials: 3/3 (3 TERMINATED)
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Trial name status loc mean sd iter total time (s) loss
train_function_5bf98_00000TERMINATED127.0.0.1:48140 10.405758 30 2.11758 1.02341
train_function_5bf98_00001TERMINATED127.0.0.1:48147 20.647335 30 0.07707311.53993
train_function_5bf98_00002TERMINATED127.0.0.1:48151 30.256568 30 0.07284313.0393


" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "2022-07-22 15:41:24,693\tINFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].\n", "COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.\n", "COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged \n", "For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/\n", "COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.\n", "COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged \n", "For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/\n", "COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.\n", "COMET ERROR: The given API key abc is invalid, please check it against the dashboard. Your experiment would not be logged \n", "For more details, please refer to: https://www.comet.ml/docs/python-sdk/warnings-errors/\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Result for train_function_5bf98_00000:\n", " date: 2022-07-22_15-41-27\n", " done: false\n", " experiment_id: c94e6cdedd4540e4b40e4a34fbbeb850\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 1\n", " loss: 1.1009860426725162\n", " node_ip: 127.0.0.1\n", " pid: 48140\n", " time_since_restore: 0.000125885009765625\n", " time_this_iter_s: 0.000125885009765625\n", " time_total_s: 0.000125885009765625\n", " timestamp: 1658500887\n", " timesteps_since_restore: 0\n", " training_iteration: 1\n", " trial_id: 5bf98_00000\n", " warmup_time: 0.0029532909393310547\n", " \n", "Result for train_function_5bf98_00000:\n", " date: 2022-07-22_15-41-29\n", " done: true\n", " experiment_id: c94e6cdedd4540e4b40e4a34fbbeb850\n", " experiment_tag: 0_mean=1,sd=0.4058\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 30\n", " loss: 1.0234101880766688\n", " node_ip: 127.0.0.1\n", " pid: 48140\n", " time_since_restore: 2.1175789833068848\n", " time_this_iter_s: 0.0022211074829101562\n", " time_total_s: 2.1175789833068848\n", " timestamp: 1658500889\n", " timesteps_since_restore: 0\n", " training_iteration: 30\n", " trial_id: 5bf98_00000\n", " warmup_time: 0.0029532909393310547\n", " \n", "Result for train_function_5bf98_00001:\n", " date: 2022-07-22_15-41-30\n", " done: false\n", " experiment_id: ba865bc613d94413a37fe027123ba031\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 1\n", " loss: 2.3754716847171182\n", " node_ip: 127.0.0.1\n", " pid: 48147\n", " time_since_restore: 0.0001590251922607422\n", " time_this_iter_s: 0.0001590251922607422\n", " time_total_s: 0.0001590251922607422\n", " timestamp: 1658500890\n", " timesteps_since_restore: 0\n", " training_iteration: 1\n", " trial_id: 5bf98_00001\n", " warmup_time: 0.0036537647247314453\n", " \n", "Result for train_function_5bf98_00001:\n", " date: 2022-07-22_15-41-30\n", " done: true\n", " experiment_id: ba865bc613d94413a37fe027123ba031\n", " experiment_tag: 1_mean=2,sd=0.6473\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 30\n", " loss: 1.5399275480220707\n", " node_ip: 127.0.0.1\n", " pid: 48147\n", " time_since_restore: 0.0770730972290039\n", " time_this_iter_s: 0.002664804458618164\n", " time_total_s: 0.0770730972290039\n", " timestamp: 1658500890\n", " timesteps_since_restore: 0\n", " training_iteration: 30\n", " trial_id: 5bf98_00001\n", " warmup_time: 0.0036537647247314453\n", " \n", "Result for train_function_5bf98_00002:\n", " date: 2022-07-22_15-41-31\n", " done: false\n", " experiment_id: 2efb6f3c4d954bcab1ea4083f138008e\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 1\n", " loss: 3.204653294422825\n", " node_ip: 127.0.0.1\n", " pid: 48151\n", " time_since_restore: 0.00014400482177734375\n", " time_this_iter_s: 0.00014400482177734375\n", " time_total_s: 0.00014400482177734375\n", " timestamp: 1658500891\n", " timesteps_since_restore: 0\n", " training_iteration: 1\n", " trial_id: 5bf98_00002\n", " warmup_time: 0.0030150413513183594\n", " \n", "Result for train_function_5bf98_00002:\n", " date: 2022-07-22_15-41-31\n", " done: true\n", " experiment_id: 2efb6f3c4d954bcab1ea4083f138008e\n", " experiment_tag: 2_mean=3,sd=0.2566\n", " hostname: Kais-MacBook-Pro.local\n", " iterations_since_restore: 30\n", " loss: 3.0393011150182865\n", " node_ip: 127.0.0.1\n", " pid: 48151\n", " time_since_restore: 0.07284307479858398\n", " time_this_iter_s: 0.0020139217376708984\n", " time_total_s: 0.07284307479858398\n", " timestamp: 1658500891\n", " timesteps_since_restore: 0\n", " training_iteration: 30\n", " trial_id: 5bf98_00002\n", " warmup_time: 0.0030150413513183594\n", " \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-07-22 15:41:31,290\tINFO tune.py:738 -- Total run time: 7.36 seconds (6.72 seconds for the tuning loop).\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "{'mean': 1, 'sd': 0.40575843135279466}\n" ] } ], "source": [ "from ray.air.integrations.comet import CometLoggerCallback\n", "\n", "tuner = tune.Tuner(\n", " train_function,\n", " tune_config=tune.TuneConfig(\n", " metric=\"loss\",\n", " mode=\"min\",\n", " ),\n", " run_config=train.RunConfig(\n", " callbacks=[\n", " CometLoggerCallback(\n", " api_key=api_key, project_name=project_name, tags=[\"comet_example\"]\n", " )\n", " ],\n", " ),\n", " param_space={\"mean\": tune.grid_search([1, 2, 3]), \"sd\": tune.uniform(0.2, 0.8)},\n", ")\n", "results = tuner.fit()\n", "\n", "print(results.get_best_result().config)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d7e46189", "metadata": {}, "source": [ "## Tune Comet Logger\n", "\n", "Ray Tune offers an integration with Comet through the `CometLoggerCallback`,\n", "which automatically logs metrics and parameters reported to Tune to the Comet UI.\n", "\n", "Click on the following dropdown to see this callback API in detail:\n", "\n", "```{eval-rst}\n", ".. autoclass:: ray.air.integrations.comet.CometLoggerCallback\n", " :noindex:\n", "```" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" }, "orphan": true }, "nbformat": 4, "nbformat_minor": 5 }