ray.tune.schedulers.ResourceChangingScheduler
class ray.tune.schedulers.ResourceChangingScheduler(base_scheduler: ray.tune.schedulers.trial_scheduler.TrialScheduler | None = None, resources_allocation_function: typing.Callable[[TuneController, ray.tune.experiment.trial.Trial, typing.Dict[str, typing.Any], ResourceChangingScheduler], ray.tune.execution.placement_groups.PlacementGroupFactory | None] | None = <ray.tune.schedulers.resource_changing_scheduler.DistributeResources object>)
Bases: TrialScheduler
A utility scheduler to dynamically change resources of live trials.
Added in version 1.5.0.
Note
Experimental. API may change in future releases.
The ResourceChangingScheduler works by wrapping around any other scheduler and adjusting the resource requirements of live trials in response to the decisions of the wrapped scheduler through a user-specified resources_allocation_function. An example of such a function can be found in the XGBoost Dynamic Resources Example.
If the functional API is used, the current trial resources can be obtained by calling tune.get_trial_resources() inside the training function. The function should be able to load and save checkpoints (the latter preferably every iteration).
If the Trainable (class) API is used, you can obtain the current trial resources through the Trainable.trial_resources property.
This scheduler cannot be used if reuse_actors is True in tune.TuneConfig(). A ValueError will be raised in that case.
Parameters:
base_scheduler – The scheduler to provide decisions about trials. If None, a default FIFOScheduler will be used.
resources_allocation_function – The callable used to change live trial resource requirements during tuning. This callable will be called on each trial as it finishes one step of training. The callable must take four arguments: the TuneController, the current Trial, the current result dict, and the ResourceChangingScheduler calling it. The callable must return a PlacementGroupFactory or None (signifying no need for an update). If resources_allocation_function is None, no resource requirements will be changed at any time. By default, DistributeResources will be used, distributing available CPUs and GPUs over all running trials in a robust way, without any prioritization.
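The two parameters and the reuse_actors restriction can be tied together in a short sketch. This is only an illustration under stated assumptions: the metric name "loss", the sample count, and the helper names build_scheduler and build_tune_config are hypothetical and not part of Ray; the Ray classes used are the ones named on this page.

```python
# Hypothetical experiment settings; "loss" is a placeholder metric name.
TUNE_CONFIG_KWARGS = {
    "metric": "loss",
    "mode": "min",
    "num_samples": 4,
    # Must stay False: ResourceChangingScheduler raises ValueError otherwise.
    "reuse_actors": False,
}


def build_scheduler(max_t: int = 16):
    """Wrap ASHA so that its decisions drive resource reallocation."""
    # Ray is imported inside the function so that the settings above can be
    # inspected on a machine where Ray is not installed.
    from ray.tune.schedulers import ASHAScheduler, ResourceChangingScheduler
    from ray.tune.schedulers.resource_changing_scheduler import DistributeResources

    return ResourceChangingScheduler(
        base_scheduler=ASHAScheduler(max_t=max_t),
        # Passing DistributeResources explicitly mirrors the documented default.
        resources_allocation_function=DistributeResources(),
    )


def build_tune_config():
    """Combine the scheduler with the hypothetical settings above."""
    from ray import tune

    return tune.TuneConfig(scheduler=build_scheduler(), **TUNE_CONFIG_KWARGS)
```

The object returned by build_tune_config() would then be passed as tune_config when constructing a tune.Tuner for a trainable that can save and restore checkpoints.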
Warning
If the resources_allocation_function sets trial resource requirements to values larger than what is available, the trial will not run. Ensure that your callable accounts for that possibility by setting upper limits. Consult DistributeResources to see how that may be done.
Example
from typing import Any, Dict, Optional

from ray.tune.execution.placement_groups import PlacementGroupFactory
from ray.tune.experiment.trial import Trial
from ray.tune.schedulers import ASHAScheduler, ResourceChangingScheduler

base_scheduler = ASHAScheduler(max_t=16)

def my_resources_allocation_function(
    tune_controller: "TuneController",
    trial: Trial,
    result: Dict[str, Any],
    scheduler: "ResourceChangingScheduler",
) -> Optional[PlacementGroupFactory]:
    # logic here; return None to leave the trial's resources unchanged
    return PlacementGroupFactory(...)

scheduler = ResourceChangingScheduler(
    base_scheduler, my_resources_allocation_function
)
See XGBoost Dynamic Resources Example for a more detailed example.
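As a rough illustration of the upper-limit advice in the warning, the sketch below splits CPUs evenly across live trials and clamps the result. It is not the logic of DistributeResources. The helper capped_cpus_per_trial, the limits of 1 and 8 CPUs, and the use of ray.cluster_resources() to find the total CPU count are assumptions made for this example; it also assumes that TuneController exposes get_live_trials().

```python
from typing import Any, Dict


def capped_cpus_per_trial(
    total_cpus: float,
    num_live_trials: int,
    base_cpus: float = 1.0,
    max_cpus: float = 8.0,
) -> int:
    """Even CPU share per live trial, clamped to [base_cpus, max_cpus]."""
    if num_live_trials <= 0:
        return int(base_cpus)
    share = total_cpus // num_live_trials
    return int(max(base_cpus, min(share, max_cpus)))


def capped_allocation_function(
    tune_controller, trial, result: Dict[str, Any], scheduler
):
    """Hypothetical resources_allocation_function with an upper limit."""
    # Imported here so the helper above stays usable without Ray installed.
    import ray
    from ray.tune.execution.placement_groups import PlacementGroupFactory

    # Leave resources alone until the trial has reported at least once.
    if result.get("training_iteration", 0) < 1:
        return None

    # Assumption: total cluster CPUs are a reasonable budget to divide.
    total_cpus = ray.cluster_resources().get("CPU", 0)
    # Assumption: the controller exposes its live trials this way.
    num_live = len(tune_controller.get_live_trials())

    cpus = capped_cpus_per_trial(total_cpus, num_live)
    return PlacementGroupFactory([{"CPU": cpus, "GPU": 0}])
```

The clamp guarantees that a request never exceeds max_cpus, so a single remaining trial on a large cluster still asks for an amount that a node can satisfy, provided max_cpus is chosen with the node size in mind.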
PublicAPI (beta): This API is in beta and may change before becoming stable.
Methods
reallocate_trial_resources_if_needed – Calls user-defined resources_allocation_function.
set_trial_resources – Returns True if new_resources were set.
Attributes
CONTINUE – Status for continuing trial execution
PAUSE – Status for pausing trial execution
STOP – Status for stopping trial execution