ray.train.ScalingConfig

class ray.train.ScalingConfig(trainer_resources: Dict | Domain | Dict[str, List] | None = None, num_workers: int | Domain | Dict[str, List] | None = None, use_gpu: bool | Domain | Dict[str, List] = False, resources_per_worker: Dict | Domain | Dict[str, List] | None = None, placement_strategy: str | Domain | Dict[str, List] = 'PACK')

Configuration for scaling training.

Parameters:
  • trainer_resources – Resources to allocate for the trainer. If None, defaults to 1 CPU for most trainers.

  • num_workers – The number of workers (Ray actors) to launch. Each worker will reserve 1 CPU by default. The number of CPUs reserved by each worker can be overridden with the resources_per_worker argument.

  • use_gpu – If True, training will be done on GPUs (1 per worker). Defaults to False. The number of GPUs reserved by each worker can be overridden with the resources_per_worker argument.

  • resources_per_worker – If specified, the resources defined in this Dict are reserved for each worker. Define the "CPU" and "GPU" keys (case-sensitive) to override the number of CPUs or GPUs used by each worker.

  • placement_strategy – The placement strategy to use for the placement group of the Ray actors. See Placement Group Strategies for the possible options.

Example

from ray.train import ScalingConfig
scaling_config = ScalingConfig(
    # Number of distributed workers.
    num_workers=2,
    # Turn on/off GPU.
    use_gpu=True,
    # Specify resources used for trainer.
    trainer_resources={"CPU": 1},
    # Try to schedule workers on different nodes.
    placement_strategy="SPREAD",
)
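
The per-worker defaults of 1 CPU (and 1 GPU when use_gpu=True) can be raised with resources_per_worker. A minimal sketch; the amounts below are illustrative, not recommendations:

from ray.train import ScalingConfig

scaling_config = ScalingConfig(
    num_workers=4,
    use_gpu=True,
    # Reserve 8 CPUs and 1 GPU for each of the 4 workers, overriding
    # the per-worker defaults described above.
    resources_per_worker={"CPU": 8, "GPU": 1},
)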

Methods

as_placement_group_factory

Returns a PlacementGroupFactory to specify resources for Tune.

from_placement_group_factory

Create a ScalingConfig from Tune's PlacementGroupFactory.
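
As a minimal sketch of how these two methods round-trip a configuration through Tune's PlacementGroupFactory:

from ray.train import ScalingConfig

scaling_config = ScalingConfig(num_workers=2, use_gpu=True)
# Convert to a Tune PlacementGroupFactory describing the same resources.
pgf = scaling_config.as_placement_group_factory()
# Reconstruct an equivalent ScalingConfig from that factory.
restored = ScalingConfig.from_placement_group_factory(pgf)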

Attributes

additional_resources_per_worker

Resources per worker, not including CPU or GPU resources.

num_cpus_per_worker

The number of CPUs to set per worker.

num_gpus_per_worker

The number of GPUs to set per worker.

num_workers

placement_strategy

resources_per_worker

total_resources

Map of total resources required for the trainer.

trainer_resources

use_gpu
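
To illustrate how the derived attributes relate to the constructor arguments, a small sketch ("custom_resource" is a hypothetical custom resource name, and the commented values assume the default 1-CPU trainer):

from ray.train import ScalingConfig

scaling_config = ScalingConfig(
    num_workers=2,
    use_gpu=True,
    resources_per_worker={"CPU": 4, "GPU": 1, "custom_resource": 1},
)
print(scaling_config.num_cpus_per_worker)              # 4
print(scaling_config.num_gpus_per_worker)              # 1
print(scaling_config.additional_resources_per_worker)  # {'custom_resource': 1}
# Aggregate over the trainer plus all workers, e.g.
# {'CPU': 9, 'GPU': 2, 'custom_resource': 2}.
print(scaling_config.total_resources)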