ray.serve.llm.LLMServer.tokenize
- async LLMServer.tokenize(request: TokenizeRequest, raw_request_info: RawRequestInfo | None = None) → AsyncGenerator[TokenizeResponse | ErrorResponse, None]
Tokenize the input text.
- Parameters:
request – The TokenizeRequest to process, either a TokenizeCompletionRequest or a TokenizeChatRequest.
raw_request_info – Optional RawRequestInfo containing data from the original HTTP request.
- Returns:
An AsyncGenerator that yields a TokenizeResponse, or an ErrorResponse if the request fails.
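
For illustration, here is a minimal sketch of calling tokenize on a server instance. The import path for TokenizeCompletionRequest (vLLM's OpenAI protocol models) and the "my-model" id are assumptions that may vary by Ray/vLLM version; treat this as a sketch rather than the canonical usage.

```python
# Minimal sketch: tokenizing a prompt via LLMServer.tokenize. The import path
# below and the "my-model" id are assumptions and may differ by version.
import inspect

from vllm.entrypoints.openai.protocol import TokenizeCompletionRequest  # assumed path


async def count_tokens(server, text: str) -> None:
    # Completion-style tokenize request; the model id must match the served model.
    request = TokenizeCompletionRequest(model="my-model", prompt=text)

    gen = server.tokenize(request)
    # Depending on the version, tokenize may be an async generator or a
    # coroutine that returns one; awaiting covers the coroutine case.
    if inspect.iscoroutine(gen):
        gen = await gen

    async for response in gen:
        # A successful call yields a TokenizeResponse; failures yield an
        # ErrorResponse instead.
        if hasattr(response, "tokens"):
            print(f"{response.count} tokens: {response.tokens}")
        else:
            print(f"tokenization failed: {response}")
```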