Anti-pattern: Calling ray.get on task arguments harms performance#
TLDR: If possible, pass ObjectRefs
as direct task arguments, instead of passing a list as the task argument and then calling ray.get()
inside the task.
When a task calls ray.get()
, it must block until the value of the ObjectRef
is ready.
If all cores are already occupied, this situation can lead to a deadlock, as the task that produces the ObjectRef
’s value may need the caller task’s resources in order to run.
To handle this issue, if the caller task would block in ray.get()
, Ray temporarily releases the caller’s CPU resources to allow the pending task to run.
This behavior can harm performance and stability because the caller continues to use a process and memory to hold its stack while other tasks run.
Therefore, it is always better to pass ObjectRefs
as direct arguments to a task and avoid calling ray.get
inside of the task, if possible.
For example, in the following code, prefer the latter method of invoking the dependent task.
import ray
import time
@ray.remote
def f():
return 1
@ray.remote
def pass_via_nested_ref(refs):
print(sum(ray.get(refs)))
@ray.remote
def pass_via_direct_arg(*args):
print(sum(args))
# Anti-pattern: Passing nested refs requires `ray.get` in a nested task.
ray.get(pass_via_nested_ref.remote([f.remote() for _ in range(3)]))
# Better approach: Pass refs as direct arguments. Use *args syntax to unpack
# multiple arguments.
ray.get(pass_via_direct_arg.remote(*[f.remote() for _ in range(3)]))
Avoiding ray.get
in nested tasks may not always be possible. Some valid reasons to call ray.get
include:
If the nested task has multiple
ObjectRefs
toray.get
, and it wants to choose the order and number to get.