ray.serve.llm.LLMRouter#

class ray.serve.llm.LLMRouter(**kwargs)[source]#

Bases: OpenAiIngress

Methods

`chat`	Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position.
`completions`	Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position.
`detokenize`	Convert token IDs back to text.
`embeddings`	Create embeddings for the provided input.
`get_deployment_options`	Get the deployment options for the ingress deployment.
`model_data`	OpenAI API-compliant endpoint to get one rayllm model.
`models`	OpenAI API-compliant endpoint to get all rayllm models.
`score`	Create scores for the provided text pairs.
`tokenize`	Tokenize text into token IDs.
`transcriptions`	Create transcription for the provided audio input.