- class ray.tune.schedulers.ResourceChangingScheduler(base_scheduler: Optional[ray.tune.schedulers.trial_scheduler.TrialScheduler] = None, resources_allocation_function: Optional[Callable[[ray.tune.execution.trial_runner.TrialRunner, ray.tune.experiment.trial.Trial, Dict[str, Any], ray.tune.schedulers.resource_changing_scheduler.ResourceChangingScheduler], Optional[Union[ray.tune.resources.Resources, ray.tune.execution.placement_groups.PlacementGroupFactory]]]] = <ray.tune.schedulers.resource_changing_scheduler.DistributeResources object>)[source]#
A utility scheduler to dynamically change resources of live trials.
New in version 1.5.0.
Experimental. API may change in future releases.
The ResourceChangingScheduler works by wrapping around any other scheduler and adjusting the resource requirements of live trials in response to the decisions of the wrapped scheduler through a user-specified resources_allocation_function.
An example of such a function can be found in XGBoost Dynamic Resources Example.
If the functional API is used, the current trial resources can be obtained by calling tune.get_trial_resources() inside the training function. The function should be able to load and save checkpoints (the latter preferably every iteration).
If the Trainable (class) API is used, you can obtain the current trial resources through the Trainable.trial_resources property.
Cannot be used if reuse_actors is True in tune.TuneConfig(). A ValueError will be raised in that case.
base_scheduler – The scheduler to provide decisions about trials. If None, a default FIFOScheduler will be used.
resources_allocation_function – The callable used to change live trial resource requirements during tuning. This callable will be called on each trial as it finishes one step of training. The callable must take four arguments: the TrialRunner, the current Trial, the current result dict, and the ResourceChangingScheduler calling it. The callable must return a PlacementGroupFactory, Resources, dict, or None (signifying no need for an update). If resources_allocation_function is None, no resource requirements will be changed at any time. By default, DistributeResources will be used, distributing available CPUs and GPUs over all running trials in a robust way, without any prioritization.
Warning: if the resources_allocation_function sets trial resource requirements to values bigger than possible, the trial will not run. Ensure that your callable accounts for that possibility by setting upper limits. Consult DistributeResources to see how that may be done.
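The capping idea behind that warning can be sketched in plain Python. This is a toy illustration only: it uses bare integers instead of Ray's Resources or PlacementGroupFactory objects, and the function and parameter names (capped_cpu_allocation, total_available_cpus, and so on) are invented for this example, not part of the Ray Tune API.

```python
# Toy sketch of the "set upper limits" advice from the warning above:
# divide a fixed CPU budget evenly across running trials and cap each
# trial's request at its fair share. All names here are illustrative,
# not part of Ray Tune.
def capped_cpu_allocation(total_available_cpus: int,
                          num_running_trials: int,
                          requested_cpus: int) -> int:
    """Return a per-trial CPU count that respects the cluster-wide budget."""
    if num_running_trials == 0:
        return 0
    fair_share = total_available_cpus // num_running_trials
    # Upper limit: grant at most the fair share of the budget,
    # but keep at least 1 CPU so the trial can still run.
    return max(1, min(requested_cpus, fair_share))
```

A real resources_allocation_function would compute numbers like this and wrap them in a PlacementGroupFactory before returning.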
```python
base_scheduler = ASHAScheduler(max_t=16)

def my_resources_allocation_function(
    trial_runner: "trial_runner.TrialRunner",
    trial: Trial,
    result: Dict[str, Any],
    scheduler: "ResourceChangingScheduler",
) -> Optional[Union[PlacementGroupFactory, Resource]]:
    # logic here
    # usage of PlacementGroupFactory is strongly preferred
    return PlacementGroupFactory(...)

scheduler = ResourceChangingScheduler(
    base_scheduler,
    my_resources_allocation_function,
)
```
See XGBoost Dynamic Resources Example for a more detailed example.
PublicAPI (beta): This API is in beta and may change before becoming stable.
- set_search_properties(metric: Optional[str], mode: Optional[str], **spec) bool [source]#
Pass search properties to scheduler.
This method acts as an alternative to instantiating schedulers that react to metrics with their own metric and mode parameters.
metric – Metric to optimize
mode – One of [“min”, “max”]. Direction to optimize.
**spec – Any kwargs for forward compatibility. Info like Experiment.PUBLIC_KEYS is provided through here.
- on_trial_add(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, **kwargs)[source]#
Called when a new trial is added to the trial runner.
- on_trial_error(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, **kwargs)[source]#
Notification for the error of trial.
This will only be called when the trial is in the RUNNING state.
- on_trial_result(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, result: Dict) str [source]#
Called on each intermediate result returned by a trial.
At this point, the trial scheduler can make a decision by returning one of CONTINUE, PAUSE, and STOP. This will only be called when the trial is in the RUNNING state.
- on_trial_complete(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, result: Dict, **kwargs)[source]#
Notification for the completion of trial.
This will only be called when the trial is in the RUNNING state and either completes naturally or by manual termination.
- on_trial_remove(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, **kwargs)[source]#
Called to remove trial.
This is called when the trial is in the PAUSED or PENDING state. Otherwise, call on_trial_complete.
- choose_trial_to_run(trial_runner: ray.tune.execution.trial_runner.TrialRunner, **kwargs) Optional[ray.tune.experiment.trial.Trial] [source]#
Called to choose a new trial to run.
This should return one of the trials in trial_runner that is in the PENDING or PAUSED state. This function must be idempotent.
If no trial is ready, return None.
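The contract above (return a PENDING or PAUSED trial, be idempotent, return None if nothing is ready) can be modeled with a pure function over plain data. This is an illustrative sketch only: strings stand in for Ray Tune's Trial objects and status constants, and the function shown is not the real implementation.

```python
# Toy model of choose_trial_to_run's contract: scan the trials and return
# the first one in PENDING or PAUSED state, or None if none is ready.
# Because it is a pure function of its input, calling it repeatedly with
# the same state yields the same choice -- the idempotence requirement.
from typing import List, Optional

def choose_first_runnable(trial_statuses: List[str]) -> Optional[int]:
    """Return the index of the first runnable trial, or None."""
    for i, status in enumerate(trial_statuses):
        if status in ("PENDING", "PAUSED"):
            return i
    return None
```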
- set_trial_resources(trial: ray.tune.experiment.trial.Trial, new_resources: Union[Dict, ray.tune.execution.placement_groups.PlacementGroupFactory]) bool [source]#
Returns True if new_resources were set.
- reallocate_trial_resources_if_needed(trial_runner: ray.tune.execution.trial_runner.TrialRunner, trial: ray.tune.experiment.trial.Trial, result: Dict) Optional[Union[dict, ray.tune.execution.placement_groups.PlacementGroupFactory]] [source]#
Calls the user-defined resources_allocation_function. If the returned resources are not None and differ from the trial's current resources, returns them. Otherwise, returns None.
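The "only surface a real change" behavior described above can be sketched in isolation. This is a toy model, not the Ray implementation: plain dicts stand in for Resources/PlacementGroupFactory objects, and the helper name is invented for illustration.

```python
# Toy model of reallocate_trial_resources_if_needed: call the user's
# allocation function, and return its result only when it is both
# non-None and different from the trial's current resources.
from typing import Callable, Optional

def reallocate_if_needed(
    current: dict,
    allocation_fn: Callable[[dict], Optional[dict]],
) -> Optional[dict]:
    new = allocation_fn(current)
    if new is not None and new != current:
        return new  # a genuine update: the caller should apply it
    return None  # no change requested, or same as what is already set
```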