ray.serve.llm.LLMServer.score
- async LLMServer.score(request: ScoreRequest) → AsyncGenerator[ScoreResponse | ErrorResponse, None]
Runs a score request against the engine and returns the response.
Returns an AsyncGenerator over the ScoreResponse object so that callers get a consistent interface across the chat, completions, embeddings, and score methods. Per the signature, the generator may yield an ErrorResponse instead of a ScoreResponse.
- Parameters:
request – A ScoreRequest object.
- Returns:
An AsyncGenerator over the ScoreResponse object.
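
The generator-based return type described above can be consumed with `async for`. The following is a minimal, self-contained sketch of that calling pattern; it does not import Ray. `FakeScoreResponse`, `FakeErrorResponse`, `fake_score`, and `collect` are hypothetical stand-ins invented for illustration, and the toy word-overlap score is not what a real engine computes.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List, Union


# Hypothetical stand-ins for ScoreResponse and ErrorResponse (assumed shapes).
@dataclass
class FakeScoreResponse:
    score: float


@dataclass
class FakeErrorResponse:
    message: str


async def fake_score(
    text_1: str, text_2: str
) -> AsyncGenerator[Union[FakeScoreResponse, FakeErrorResponse], None]:
    """Hypothetical stand-in for LLMServer.score: yields one item, then ends."""
    if not text_1 or not text_2:
        yield FakeErrorResponse(message="both texts are required")
        return
    # Toy score: Jaccard overlap of lowercase words (illustration only).
    a, b = set(text_1.lower().split()), set(text_2.lower().split())
    yield FakeScoreResponse(score=len(a & b) / len(a | b))


async def collect(gen) -> List[object]:
    """Drain an async generator into a list."""
    return [item async for item in gen]


async def main():
    ok = await collect(fake_score("ray serve", "ray serve llm"))
    err = await collect(fake_score("", "anything"))
    return ok, err


ok, err = asyncio.run(main())

for item in ok + err:
    # Callers branch on the yielded type: a response or an error.
    if isinstance(item, FakeErrorResponse):
        print("error:", item.message)
    else:
        print("score:", round(item.score, 3))
```

Because the result arrives through an async generator even when there is only one item, the same `async for` loop can be reused for the chat, completions, embeddings, and score methods.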