ray.serve.llm.LLMServer.tokenize

async LLMServer.tokenize(request: TokenizeRequest, raw_request_info: RawRequestInfo | None = None) → AsyncGenerator[TokenizeResponse | ErrorResponse, None]

Tokenize the input text.

Parameters:
  • request – A TokenizeRequest object (TokenizeCompletionRequest or TokenizeChatRequest).

  • raw_request_info – Optional RawRequestInfo containing data from the original HTTP request.

Returns:

An AsyncGenerator that yields a TokenizeResponse on success, or an ErrorResponse if the request fails.
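
Example (a minimal consumption sketch, not part of the reference above): the snippet assumes `server` is an already-constructed LLMServer and `request` is a valid TokenizeRequest; the error check matches on the class name rather than importing ErrorResponse, since its import path may vary across Ray versions.

```python
async def get_token_info(server, request):
    """Return the single TokenizeResponse yielded by server.tokenize()."""
    async for response in server.tokenize(request):
        # The generator yields a TokenizeResponse on success, or an
        # ErrorResponse describing the failure. Checked by class name
        # here to avoid pinning a version-specific import path.
        if type(response).__name__ == "ErrorResponse":
            raise RuntimeError(f"tokenization failed: {response}")
        return response
```

Although tokenization produces a single response, exposing it through an async generator keeps the call shape consistent with the server's streaming endpoints, so callers can handle all responses uniformly.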