ray.data.ActorPoolStrategy

class ray.data.ActorPoolStrategy(*, size: int | None = None, min_size: int | None = None, max_size: int | None = None, initial_size: int | None = None, max_tasks_in_flight_per_actor: int | None = None)

Bases: ComputeStrategy

Specify the actor-based compute strategy for a Dataset transform.

ActorPoolStrategy specifies that an autoscaling pool of actors should be used for a given Dataset transform. This is useful when a callable class needs stateful setup: each actor constructs the class once and reuses that instance for the tasks it runs.

For a fixed-sized pool of size n, use ActorPoolStrategy(size=n).

To autoscale from m to n actors, use ActorPoolStrategy(min_size=m, max_size=n).

To autoscale from m to n actors, with an initial size of initial, use ActorPoolStrategy(min_size=m, max_size=n, initial_size=initial).
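The sizing options above can be sketched as follows. This is a minimal illustration, not a canonical recipe: the AddOffset class, the run_with_pool helper, and the dataset of 1000 integers are hypothetical, and the sketch assumes a Ray version in which map_batches accepts a compute argument and ray.data.range produces an "id" column.

```python
class AddOffset:
    """Hypothetical callable class: __init__ runs once per actor, __call__ once per batch."""

    def __init__(self, offset: int):
        # Stand-in for expensive stateful setup, such as loading a model.
        self.offset = offset

    def __call__(self, batch: dict) -> dict:
        # Assumes batch["id"] supports "+", as a NumPy array column does.
        batch["id"] = batch["id"] + self.offset
        return batch


def run_with_pool(autoscale: bool = False, n: int = 4):
    """Hypothetical helper: apply AddOffset with a fixed or autoscaling actor pool."""
    # Imported lazily so AddOffset can be exercised without Ray installed.
    import ray.data

    if autoscale:
        # Autoscale between 1 and n actors. Passing initial_size as well
        # would set the starting pool size.
        strategy = ray.data.ActorPoolStrategy(min_size=1, max_size=n)
    else:
        # Fixed-size pool of n actors.
        strategy = ray.data.ActorPoolStrategy(size=n)

    ds = ray.data.range(1000)
    ds = ds.map_batches(
        AddOffset,
        fn_constructor_kwargs={"offset": 100},
        compute=strategy,
    )
    return ds.take(3)
```

Calling run_with_pool() starts a local Ray runtime if none is running. The callable class is passed uninstantiated so that each actor in the pool can construct its own copy.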

To give the pool more opportunities to pipeline task dependency prefetching with computation, and to avoid actor startup delays, set max_tasks_in_flight_per_actor to 2 or greater. To try to reduce the delay caused by tasks queueing on the worker actors, set max_tasks_in_flight_per_actor to 1.
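The trade-off described above can be sketched as a small helper that picks the setting. The names pool_kwargs and make_strategy are hypothetical, as is the choice of a fixed pool of 4 actors; only the ActorPoolStrategy parameters themselves come from the signature documented here.

```python
def pool_kwargs(overlap_fetch_with_compute: bool, n: int = 4) -> dict:
    """Hypothetical helper: choose ActorPoolStrategy keyword arguments for n actors."""
    # 2 allows a second task to be queued on an actor so its inputs can be
    # prefetched while the current task computes; 1 keeps at most one task
    # per actor, so no task waits in an actor's queue.
    in_flight = 2 if overlap_fetch_with_compute else 1
    return {"size": n, "max_tasks_in_flight_per_actor": in_flight}


def make_strategy(overlap_fetch_with_compute: bool, n: int = 4):
    """Hypothetical helper: build the strategy from the chosen arguments."""
    # Imported lazily so pool_kwargs can be inspected without Ray installed.
    import ray.data

    return ray.data.ActorPoolStrategy(**pool_kwargs(overlap_fetch_with_compute, n))
```

Which value performs better depends on the workload, so it is worth measuring both settings rather than assuming one.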

Methods

__init__

Construct ActorPoolStrategy for a Dataset transform.