ray.serve.llm.LLMRouter.completions
- async LLMRouter.completions(body: CompletionRequest) → starlette.responses.Response
Given a prompt, the model returns one or more predicted completions. It can also return the probabilities of alternative tokens at each position.
- Parameters:
body – The CompletionRequest object.
- Returns:
A response object with completions.
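As a rough illustration of how a client might reach this handler over HTTP, the sketch below builds an OpenAI-style completions request body and posts it with the standard library. Several things here are assumptions rather than facts from this page: that the router is served at an OpenAI-compatible `/v1/completions` route, that `CompletionRequest` accepts the `model`, `prompt`, `max_tokens`, `n`, and `logprobs` fields, and the placeholder model ID and base URL. The helper names `build_completion_payload` and `post_completion` are hypothetical.

```python
import json
import urllib.request


def build_completion_payload(model, prompt, max_tokens=32, n=1, logprobs=None):
    """Build an OpenAI-style completions request body as a plain dict.

    Field names are assumed to match what CompletionRequest accepts.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "n": n,  # number of completions to request
    }
    if logprobs is not None:
        # Ask for probabilities of alternative tokens at each position.
        payload["logprobs"] = logprobs
    return payload


def post_completion(base_url, payload, timeout=30.0):
    """POST the payload to an assumed /v1/completions route and parse the JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# "my-model" is a placeholder model ID, not one defined by this page.
payload = build_completion_payload("my-model", "Once upon a time", n=2, logprobs=3)
```

With a deployment running locally, a call such as `post_completion("http://localhost:8000", payload)` would send the request; the address is only an example and depends on how the Serve application is deployed.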