ray.serve.request_router.RunningReplica

class ray.serve.request_router.RunningReplica(replica_info: RunningReplicaInfo)

Contains info on a running replica. Also defines the interface for a request router to talk to a replica.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

Methods

get_queue_len

Returns the current queue length for the replica.

push_proxy_handle

When running on a proxy, pushes the proxy's own handle to the replica.

send_request

Sends a request to this replica.
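
As a rough illustration of how a request router might use the queue length reported by each replica, the sketch below probes a list of candidates and picks the least loaded one. It does not import Ray: `StubReplica` and `pick_least_loaded` are hypothetical names, and the stub's `get_queue_len` takes no arguments, which is an assumption because this page does not show the real method's signature.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubReplica:
    """Hypothetical stand-in for RunningReplica; not part of Ray."""

    replica_id: str
    queue_len: int

    async def get_queue_len(self) -> int:
        # Assumption: the real method may accept extra arguments
        # that are not listed on this page.
        return self.queue_len


async def pick_least_loaded(replicas):
    """Probe all candidates concurrently and return the shortest queue."""
    queue_lens = await asyncio.gather(*(r.get_queue_len() for r in replicas))
    # min() returns the first index on ties, so earlier candidates win.
    best_index = min(range(len(replicas)), key=lambda i: queue_lens[i])
    return replicas[best_index]


replicas = [StubReplica("a", 3), StubReplica("b", 1), StubReplica("c", 2)]
chosen = asyncio.run(pick_least_loaded(replicas))
print(chosen.replica_id)
```

Probing every replica is the simplest policy to show; a router that must scale to many replicas would more likely probe only a small random sample.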

Attributes

actor_id

Actor ID of this replica.

availability_zone

Availability zone of the node this replica is running on.

is_cross_language

Whether this replica is cross-language (Java).

max_ongoing_requests

Maximum number of concurrent requests that can be sent to this replica.

multiplexed_model_ids

Set of multiplexed model IDs on this replica.

node_id

Node ID of the node this replica is running on.

replica_id

ID of this replica.

routing_stats

Dictionary of routing stats.
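
The attributes above are the kind of information a router could use to rank candidate replicas before sending a request. The following is a minimal sketch under stated assumptions: `ReplicaView` and `rank_candidates` are hypothetical names, the stub mirrors only three of the listed attribute names, and the attribute types (an optional string and a set of strings) are guesses rather than something this page specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ReplicaView:
    """Hypothetical stand-in exposing a few RunningReplica attribute names."""

    replica_id: str
    availability_zone: Optional[str] = None  # assumed type
    multiplexed_model_ids: Set[str] = field(default_factory=set)  # assumed type


def rank_candidates(
    replicas: List[ReplicaView], model_id: str, local_az: Optional[str]
) -> List[ReplicaView]:
    """Prefer replicas that hold the model, then replicas in the local zone."""

    def score(replica: ReplicaView):
        has_model = model_id in replica.multiplexed_model_ids
        same_az = local_az is not None and replica.availability_zone == local_az
        # False sorts before True, so negate to put preferred replicas first.
        return (not has_model, not same_az)

    # sorted() is stable, so equally ranked replicas keep their input order.
    return sorted(replicas, key=score)


candidates = [
    ReplicaView("r1", "us-east-1a"),
    ReplicaView("r2", "us-east-1b", {"m1"}),
    ReplicaView("r3", "us-east-1a", {"m1"}),
]
ranked = rank_candidates(candidates, model_id="m1", local_az="us-east-1a")
print([r.replica_id for r in ranked])
```

Ranking model locality above zone locality is a design choice for this sketch only; a different ordering may suit workloads where cross-zone traffic is the larger cost.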