ray.serve.request_router.RunningReplica.release_slot

async RunningReplica.release_slot(slot_token: str) → int

Release a previously reserved slot.

This should be called if a request is not dispatched after reserving a slot (e.g., due to an error or cancellation).

Returns the replica’s reported num_ongoing_requests after the release.
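
The reserve-then-release pattern described above can be sketched as follows. This is a minimal illustration, not Ray Serve code: `FakeReplica` is a hypothetical stand-in for `RunningReplica`, its `reserve_slot` method is an assumed counterpart that returns an opaque token, and `dispatch_or_release` and `failing_dispatch` are helper names invented for the example. Only the `release_slot(slot_token: str) -> int` shape is taken from the signature documented here.

```python
import asyncio
import uuid


class FakeReplica:
    """Hypothetical stand-in for RunningReplica; not the Ray Serve class."""

    def __init__(self):
        self._slots = set()

    async def reserve_slot(self) -> str:
        # Assumed counterpart to release_slot: hands back an opaque token.
        token = uuid.uuid4().hex
        self._slots.add(token)
        return token

    async def release_slot(self, slot_token: str) -> int:
        # Mirrors the documented signature. Here the "ongoing requests"
        # count is simply the number of slots still held.
        self._slots.discard(slot_token)
        return len(self._slots)


async def dispatch_or_release(replica, dispatch):
    """Reserve a slot, try to dispatch, and release the slot on failure."""
    token = await replica.reserve_slot()
    try:
        return await dispatch(token)
    except BaseException:
        # Covers errors and asyncio.CancelledError: the request was never
        # dispatched, so give the slot back before re-raising.
        await replica.release_slot(token)
        raise


async def failing_dispatch(token):
    # Simulates a dispatch that fails after the slot was reserved.
    raise RuntimeError("dispatch failed")


async def demo():
    replica = FakeReplica()
    try:
        await dispatch_or_release(replica, failing_dispatch)
    except RuntimeError:
        pass
    # Number of slots still held after the failed dispatch.
    return len(replica._slots)
```

Running `asyncio.run(demo())` exercises the failure path. Catching `BaseException` rather than `Exception` is deliberate in this sketch, since `asyncio.CancelledError` does not derive from `Exception` in current Python versions and cancellation is one of the cases the description calls out.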