reserve_slot

async RunningReplica.reserve_slot(request_metadata: RequestMetadata) → Tuple[str, ReplicaQueueLengthInfo]

Reserve a slot on this replica for an upcoming request.

Returns a tuple of a unique token and a ReplicaQueueLengthInfo. The token can be used to release the slot later. This is used in the choose_replica/dispatch pattern to track reservations that haven’t been dispatched yet.
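The reservation-token idea can be illustrated with a small, self-contained sketch. This is not Ray Serve's implementation: `ToyReplica`, `release_slot`, `dispatch`, `max_ongoing_requests`, and the plain `int` load value (standing in for `ReplicaQueueLengthInfo`) are all hypothetical names chosen for illustration, and raising when the replica is full is an assumption rather than documented behavior.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyReplica:
    """Hypothetical stand-in for a replica; NOT Ray Serve's RunningReplica."""

    max_ongoing_requests: int
    num_ongoing_requests: int = 0
    # token -> request id, for slots that are reserved but not yet dispatched
    reserved: Dict[str, str] = field(default_factory=dict)

    def _load(self) -> int:
        # Count both running requests and outstanding reservations.
        return self.num_ongoing_requests + len(self.reserved)

    async def reserve_slot(self, request_id: str) -> Tuple[str, int]:
        # Assumption: a full replica rejects the reservation.
        if self._load() >= self.max_ongoing_requests:
            raise RuntimeError("no free slot on this replica")
        token = uuid.uuid4().hex  # unique handle for this reservation
        self.reserved[token] = request_id
        return token, self._load()

    def release_slot(self, token: str) -> None:
        # Give the slot back without dispatching; unknown tokens are ignored.
        self.reserved.pop(token, None)

    def dispatch(self, token: str) -> str:
        # Convert a reservation into a running request.
        request_id = self.reserved.pop(token)
        self.num_ongoing_requests += 1
        return request_id


async def demo() -> Tuple[int, int, int]:
    replica = ToyReplica(max_ongoing_requests=2)
    token_a, load_after_a = await replica.reserve_slot("req-a")
    token_b, load_after_b = await replica.reserve_slot("req-b")
    replica.dispatch(token_a)      # req-a is sent to the replica
    replica.release_slot(token_b)  # req-b is abandoned before dispatch
    return load_after_a, load_after_b, replica._load()
```

Running `asyncio.run(demo())` in this sketch reserves two slots, dispatches one, and releases the other, so the abandoned reservation no longer counts against the replica's capacity.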