Execution (Tuner, tune.Experiment)
Tuner
- ray.tune.Tuner(trainable: Optional[Union[str, Callable, Type[ray.tune.trainable.trainable.Trainable], BaseTrainer]] = None, *, param_space: Optional[Dict[str, Any]] = None, tune_config: Optional[ray.tune.tune_config.TuneConfig] = None, run_config: Optional[ray.air.config.RunConfig] = None, _tuner_kwargs: Optional[Dict] = None, _tuner_internal: Optional[ray.tune.impl.tuner_internal.TunerInternal] = None)
Tuner is the recommended way of launching hyperparameter tuning jobs with Ray Tune.
- Parameters
trainable – The trainable to be tuned.
param_space – Search space of the tuning job. Note that both the preprocessor and the dataset can be tuned here.
tune_config – Tuning-algorithm-specific configuration. Refer to ray.tune.tune_config.TuneConfig for more info.
run_config – Runtime configuration that is specific to individual trials. If passed, this will overwrite the run config passed to the Trainer, if applicable. Refer to ray.air.config.RunConfig for more info.
Usage pattern:
from sklearn.datasets import load_breast_cancer

from ray import tune
from ray.data import from_pandas
from ray.air.config import RunConfig, ScalingConfig
from ray.train.xgboost import XGBoostTrainer
from ray.tune.tuner import Tuner


def get_dataset():
    data_raw = load_breast_cancer(as_frame=True)
    dataset_df = data_raw["data"]
    dataset_df["target"] = data_raw["target"]
    dataset = from_pandas(dataset_df)
    return dataset


trainer = XGBoostTrainer(
    label_column="target",
    params={},
    datasets={"train": get_dataset()},
)

param_space = {
    "scaling_config": ScalingConfig(
        num_workers=tune.grid_search([2, 4]),
        resources_per_worker={
            "CPU": tune.grid_search([1, 2]),
        },
    ),
    # You can even grid search various datasets in Tune.
    # "datasets": {
    #     "train": tune.grid_search(
    #         [ds1, ds2]
    #     ),
    # },
    "params": {
        "objective": "binary:logistic",
        "tree_method": "approx",
        "eval_metric": ["logloss", "error"],
        "eta": tune.loguniform(1e-4, 1e-1),
        "subsample": tune.uniform(0.5, 1.0),
        "max_depth": tune.randint(1, 9),
    },
}

tuner = Tuner(
    trainable=trainer,
    param_space=param_space,
    run_config=RunConfig(name="my_tune_run"),
)
analysis = tuner.fit()
To retry a failed Tune run, you can then do:
tuner = Tuner.restore(experiment_checkpoint_dir)
tuner.fit()
The experiment_checkpoint_dir can be easily located near the end of the console output of your first failed run.

PublicAPI (beta): This API is in beta and may change before becoming stable.
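Tuner also accepts a plain function trainable, with search behavior controlled through tune_config. A minimal sketch, assuming a toy objective function (the function name and the "score" metric are illustrative, not part of the API):

from ray import tune
from ray.tune import TuneConfig
from ray.tune.tuner import Tuner

# Toy trainable: any function that returns or reports a metric dict works.
def objective(config):
    return {"score": config["x"] ** 2}

tuner = Tuner(
    trainable=objective,
    param_space={"x": tune.uniform(-1.0, 1.0)},
    tune_config=TuneConfig(num_samples=10, metric="score", mode="min"),
)
results = tuner.fit()
print(results.get_best_result().config)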
tune.run_experiments
- ray.tune.run_experiments(experiments: Union[ray.tune.experiment.experiment.Experiment, Mapping, Sequence[Union[ray.tune.experiment.experiment.Experiment, Mapping]]], scheduler: Optional[ray.tune.schedulers.trial_scheduler.TrialScheduler] = None, server_port: Optional[int] = None, verbose: Union[int, ray.tune.utils.log.Verbosity] = Verbosity.V3_TRIAL_DETAILS, progress_reporter: Optional[ray.tune.progress_reporter.ProgressReporter] = None, resume: Union[bool, str] = False, reuse_actors: Optional[bool] = None, trial_executor: Optional[ray.tune.execution.ray_trial_executor.RayTrialExecutor] = None, raise_on_failed_trial: bool = True, concurrent: bool = True, callbacks: Optional[Sequence[ray.tune.callback.Callback]] = None, _remote: Optional[bool] = None)
Runs and blocks until all trials finish.
Example
>>> from ray.tune.experiment import Experiment
>>> from ray.tune.tune import run_experiments
>>> def my_func(config):
...     return {"score": 0}
>>> experiment_spec = Experiment("experiment", my_func)
>>> run_experiments(experiments=experiment_spec)
>>> experiment_spec = {"experiment": {"run": my_func}}
>>> run_experiments(experiments=experiment_spec)
- Returns
List of Trial objects, holding data for each executed trial.
PublicAPI: This API is stable across Ray releases.
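Experiment specs given as dicts mirror the arguments of tune.Experiment, and several named experiments can be submitted in a single call. A minimal sketch, assuming a toy train_fn (the function name, experiment names, and metric are illustrative):

from ray import tune

# Toy trainable shared by both experiments below.
def train_fn(config):
    return {"mean_accuracy": config["lr"]}

experiments = {
    "exp_small_lr": {
        "run": train_fn,
        "config": {"lr": tune.grid_search([0.001, 0.01])},
        "stop": {"training_iteration": 1},
    },
    "exp_large_lr": {
        "run": train_fn,
        "config": {"lr": tune.grid_search([0.1, 1.0])},
        "stop": {"training_iteration": 1},
    },
}
trials = tune.run_experiments(experiments=experiments)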
tune.Experiment
- ray.tune.Experiment(name: str, run: Union[str, Callable, Type], *, stop: Optional[Union[Mapping, ray.tune.stopper.stopper.Stopper, Callable[[str, Mapping], bool]]] = None, time_budget_s: Optional[Union[int, float, datetime.timedelta]] = None, config: Optional[Dict[str, Any]] = None, resources_per_trial: Union[None, Mapping[str, Union[float, int, Mapping]], PlacementGroupFactory] = None, num_samples: int = 1, local_dir: Optional[str] = None, _experiment_checkpoint_dir: Optional[str] = None, sync_config: Optional[Union[ray.tune.syncer.SyncConfig, dict]] = None, checkpoint_config: Optional[Union[ray.air.config.CheckpointConfig, dict]] = None, trial_name_creator: Optional[Callable[[Trial], str]] = None, trial_dirname_creator: Optional[Callable[[Trial], str]] = None, log_to_file: bool = False, export_formats: Optional[Sequence] = None, max_failures: int = 0, restore: Optional[str] = None)
Tracks experiment specifications.
Implicitly registers the Trainable if needed. The arguments here take the same meaning as the arguments defined in tune.py:run.

experiment_spec = Experiment(
    "my_experiment_name",
    my_func,
    stop={"mean_accuracy": 100},
    config={
        "alpha": tune.grid_search([0.2, 0.4, 0.6]),
        "beta": tune.grid_search([1, 2]),
    },
    resources_per_trial={
        "cpu": 1,
        "gpu": 0,
    },
    num_samples=10,
    local_dir="~/ray_results",
    checkpoint_freq=10,
    max_failures=2,
)
- Parameters
TODO (xwjiang) – Add the whole list.
_experiment_checkpoint_dir – Internal use only. If present, use this as the root directory for the experiment checkpoint. If not present, the directory path will be deduced from the trainable name instead.
DeveloperAPI: This API may change across minor Ray releases.
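As the signature shows, stop also accepts a Stopper instance or a callable taking (trial_id, result) and returning a bool. A minimal sketch of the callable form, assuming a toy my_func and threshold (illustrative only):

from ray import tune
from ray.tune.experiment import Experiment

# Toy trainable that reports a single metric.
def my_func(config):
    return {"mean_accuracy": config["alpha"]}

# Stop a trial as soon as its reported accuracy crosses a threshold.
def stop_fn(trial_id: str, result: dict) -> bool:
    return result["mean_accuracy"] >= 0.5

experiment_spec = Experiment(
    "stop_callable_example",
    my_func,
    stop=stop_fn,
    config={"alpha": tune.uniform(0.0, 1.0)},
    num_samples=4,
)
trials = tune.run_experiments(experiments=experiment_spec)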
tune.SyncConfig
- ray.tune.SyncConfig(upload_dir: Optional[str] = None, syncer: Optional[Union[str, ray.tune.syncer.Syncer]] = 'auto', sync_on_checkpoint: bool = True, sync_period: int = 300, sync_timeout: int = 1800) -> None
Configuration object for syncing.
If an upload_dir is specified, both experiment and trial checkpoints will be stored on remote (cloud) storage. Synchronization then only happens via this remote storage.
- Parameters
upload_dir – Optional URI to sync training results and checkpoints to (e.g. s3://bucket, gs://bucket, or hdfs://path). Specifying this will enable cloud-based checkpointing.
syncer – Syncer class to use for synchronizing checkpoints to/from cloud storage. If set to None, no syncing will take place. Defaults to "auto" (auto-detect).
sync_on_checkpoint – Force sync-down of trial checkpoints to the driver (non-cloud-storage only). If set to False, checkpoint syncing from worker to driver is asynchronous and best-effort. This does not affect persistent storage syncing. Defaults to True.
sync_period – Interval, in seconds, at which syncing between nodes occurs. Defaults to 300.
sync_timeout – Timeout after which running sync processes are aborted. Currently only affects trial-to-cloud syncing.
PublicAPI: This API is stable across Ray releases.
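A SyncConfig is typically wired into a run through RunConfig's sync_config field (this pairing is an assumption about the surrounding AIR setup; the bucket URI below is a placeholder). A minimal sketch:

from ray import tune
from ray.air.config import RunConfig
from ray.tune import SyncConfig

# Toy trainable; any function reporting a metric works here.
def train_fn(config):
    return {"score": config["x"]}

tuner = tune.Tuner(
    train_fn,
    param_space={"x": tune.uniform(0.0, 1.0)},
    run_config=RunConfig(
        name="cloud_synced_run",
        # Placeholder bucket URI; setting upload_dir enables cloud-based checkpointing.
        sync_config=SyncConfig(
            upload_dir="s3://my-bucket/tune-results",
            sync_period=300,
        ),
    ),
)
results = tuner.fit()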