ray.serve.request_router.MultiplexMixin

class ray.serve.request_router.MultiplexMixin(*args, **kwargs)

Mixin for multiplex routing.

This mixin routes requests to multiplexed replicas. It adds the attributes and methods needed to track multiplexed model IDs, and it provides helpers to apply multiplex routing and to rank replicas by multiplexed model ID.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

Methods

apply_multiplex_routing

Apply multiplex routing to the pending request.

rank_replicas_via_multiplex

Rank the replicas based on the multiplexed model ID.
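
The ranking idea can be illustrated with a minimal, standalone sketch. It does not import Ray or call the mixin. `FakeReplica` and `rank_by_model_id` are hypothetical names introduced only for illustration, and the two-tier ordering (replicas that already hold the requested model first, all others second) is an assumption about the general approach, not a description of the ranking Ray Serve actually applies.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FakeReplica:
    """Hypothetical stand-in for a replica; not a Ray Serve class."""

    replica_id: str
    multiplexed_model_ids: Set[str] = field(default_factory=set)


def rank_by_model_id(
    replicas: List[FakeReplica], model_id: str
) -> List[List[FakeReplica]]:
    """Group replicas into ranks, best candidates first.

    Assumed ordering for this sketch: replicas that already have
    `model_id` loaded, followed by every other replica.
    """
    loaded = [r for r in replicas if model_id in r.multiplexed_model_ids]
    others = [r for r in replicas if model_id not in r.multiplexed_model_ids]
    # Drop empty ranks so the first rank always contains candidates.
    return [rank for rank in (loaded, others) if rank]


replicas = [
    FakeReplica("r1", {"model_a"}),
    FakeReplica("r2", {"model_b"}),
    FakeReplica("r3", {"model_a", "model_b"}),
]

ranks = rank_by_model_id(replicas, "model_b")
ranked_ids = [[r.replica_id for r in rank] for rank in ranks]
```

Returning a list of ranks rather than a flat list lets a caller try every replica in the preferred rank before falling back to the next one, which avoids loading the same model on additional replicas when one that already holds it is available.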