ray.serve.llm.LLMRouter#

class ray.serve.llm.LLMRouter(**kwargs)[source]#

Bases: OpenAiIngress

Methods

chat

Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position.

completions

Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position.

detokenize

Convert token IDs back to text.

embeddings

Create embeddings for the provided input.

get_deployment_options

Get the deployment options for the ingress deployment.

model_data

OpenAI API-compliant endpoint to get one rayllm model.

models

OpenAI API-compliant endpoint to get all rayllm models.

score

Create scores for the provided text pairs.

tokenize

Tokenize text into token IDs.

transcriptions

Create transcription for the provided audio input.