Anti-pattern: Redefining the same remote function or class harms performance#
TLDR: Avoid redefining the same remote function or class.
Decorating the same function or class multiple times using the ray.remote
decorator leads to slow performance in Ray.
For each Ray remote function or class, Ray will pickle it and upload to GCS.
Later on, the worker that runs the task or actor will download and unpickle it.
Each decoration of the same function or class generates a new remote function or class from Ray’s perspective.
As a result, the pickle, upload, download and unpickle work will happen every time we redefine and run the remote function or class.
Code example#
Anti-pattern:
import ray
ray.init()
outputs = []
for i in range(10):
@ray.remote
def double(i):
return i * 2
outputs.append(double.remote(i))
outputs = ray.get(outputs)
# The double remote function is pickled and uploaded 10 times.
Better approach:
@ray.remote
def double(i):
return i * 2
outputs = []
for i in range(10):
outputs.append(double.remote(i))
outputs = ray.get(outputs)
# The double remote function is pickled and uploaded 1 time.
We should define the same remote function or class outside of the loop instead of multiple times inside a loop so that it’s pickled and uploaded only once.