ray.serve.llm.LLMServer.detokenize

async LLMServer.detokenize(request: DetokenizeRequest, raw_request_info: RawRequestInfo | None = None) → AsyncGenerator[DetokenizeResponse | ErrorResponse, None]

Detokenize the input token IDs, converting them back into text.

Parameters:
  • request – A DetokenizeRequest object.

  • raw_request_info – Optional RawRequestInfo containing data from the original HTTP request.

Returns:

An AsyncGenerator that yields a DetokenizeResponse on success, or an ErrorResponse if the request fails.
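
Because the method is an async generator whose items may be either a response or an error, callers typically iterate with `async for` and branch on the item type. The following is a minimal sketch of that consumption pattern only. It does not call Ray: `FakeDetokenizeResponse`, `FakeErrorResponse`, `fake_detokenize`, the `text` and `message` fields, and the toy vocabulary are all hypothetical stand-ins introduced for illustration, and the real Ray Serve LLM types may expose different fields.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List, Sequence, Union


# Hypothetical stand-ins (assumptions, not the Ray Serve LLM classes).
@dataclass
class FakeDetokenizeResponse:
    text: str  # assumed field name, for illustration only


@dataclass
class FakeErrorResponse:
    message: str  # assumed field name, for illustration only


FakeResult = Union[FakeDetokenizeResponse, FakeErrorResponse]


async def fake_detokenize(tokens: Sequence[int]) -> AsyncGenerator[FakeResult, None]:
    """Toy async generator with the same shape as the documented return type."""
    vocab = {1: "hello", 2: " world"}  # toy vocabulary, not a real tokenizer
    if any(t not in vocab for t in tokens):
        yield FakeErrorResponse(message="unknown token id")
        return
    yield FakeDetokenizeResponse(text="".join(vocab[t] for t in tokens))


async def collect(gen: AsyncGenerator[FakeResult, None]) -> List[FakeResult]:
    """Drain the generator so that it is fully consumed and closed."""
    return [item async for item in gen]


def detokenize_text(tokens: Sequence[int]) -> str:
    """Run the generator, raise on an error item, otherwise return the text."""
    for item in asyncio.run(collect(fake_detokenize(tokens))):
        if isinstance(item, FakeErrorResponse):
            raise RuntimeError(item.message)
        return item.text
    raise RuntimeError("generator yielded no items")
```

With the real deployment, the same `async for` loop and `isinstance` check would presumably apply to the items yielded by `LLMServer.detokenize`, with the stand-in types replaced by `DetokenizeResponse` and `ErrorResponse`.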