ray.experimental.register_nixl_memory_pool
- ray.experimental.register_nixl_memory_pool(size: int, device: torch.device) -> None
Pre-allocates a memory pool and registers it with NIXL.
This enables pool-based memory management for NIXL transfers, which can improve performance by avoiding repeated memory registration/deregistration. The pool is registered once with NIXL, and individual tensors are copied into it on `ray.put`.

Within a single `ray.put` call, tensors sharing the same underlying storage (including views) are automatically deduplicated: only one copy of each unique storage is allocated. Across multiple `ray.put` calls, if the same storage appears again, the existing pool slot is reused without re-copying the data. As a result, data may become stale once you `ray.put` the storage tensor: subsequent mutations to that storage may not be reflected in outstanding refs. Clone the tensor before `ray.put` if snapshot semantics are required, as in the sketch below.
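For example, a minimal sketch of the clone-before-put pattern (the `Trainer` actor and its `weight` attribute are illustrative, not part of this API):

```python
import torch

import ray
from ray.experimental import register_nixl_memory_pool


@ray.remote(num_gpus=1, enable_tensor_transport=True)
class Trainer:
    def __init__(self):
        register_nixl_memory_pool(1024 * 1024 * 1024, torch.device("cuda"))
        self.weight = torch.randn(1000, 1000, device="cuda")

    def snapshot_ref(self):
        # clone() gives the ref snapshot semantics: the copy occupies its
        # own pool slot, so later in-place updates to self.weight are not
        # visible through the returned ref.
        ref = ray.put(self.weight.clone(), _tensor_transport="nixl")
        self.weight.add_(1.0)  # does not affect the snapshot above
        return ref
```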
If the pool has insufficient space for an allocation, `NixlOutOfMemoryError` is raised.
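A hedged sketch of reacting to that error; the import path for `NixlOutOfMemoryError` is an assumption here and may differ in your Ray version:

```python
import torch

import ray
from ray.experimental import NixlOutOfMemoryError  # assumed import path

# Inside an actor that has already registered a NIXL memory pool:
weight = torch.randn(1000, 1000, device="cuda")
try:
    ref = ray.put(weight, _tensor_transport="nixl")
except NixlOutOfMemoryError:
    # Pool exhausted: fall back to the default object-store transport
    # (or free outstanding refs / register a larger pool instead).
    ref = ray.put(weight)
```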
  - size – Size of the memory pool in bytes.
  - device – Device to allocate the pool on (e.g., `torch.device("cpu")` or `torch.device("cuda")`).
Example
```python
import torch

import ray
from ray.experimental import register_nixl_memory_pool


@ray.remote(num_gpus=1, enable_tensor_transport=True)
class Trainer:
    def __init__(self):
        # Pre-allocate a 1GB GPU memory pool for NIXL transfers
        register_nixl_memory_pool(1024 * 1024 * 1024, torch.device("cuda"))

    def get_weight_ref(self):
        weight = torch.randn(1000, 1000, device="cuda")
        return ray.put(weight, _tensor_transport="nixl")
```
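Continuing the example above, driver-side usage might look like the following. The method returns an `ObjectRef` created by `ray.put` inside the actor, so two `ray.get` calls are needed; note that with the NIXL transport the inner ref is typically consumed by another GPU actor rather than fetched to the driver:

```python
trainer = Trainer.remote()
# First get resolves the task result (the inner ObjectRef); the second
# get fetches the tensor itself via the registered transport.
weight_ref = ray.get(trainer.get_weight_ref.remote())
weight = ray.get(weight_ref)
```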
PublicAPI (alpha): This API is in alpha and may change before becoming stable.