ray.serve.llm.LLMServer.__init__#

async LLMServer.__init__(llm_config: LLMConfig, *, engine_cls: Type[LLMEngine] | None = None, model_downloader: LoraModelLoader | None = None)[source]#

Constructor of LLMServer.

Only llm_config is part of the public API; the other arguments are private and intended for testing.

Parameters:
  • llm_config – LLMConfig for the model.

  • engine_cls – Dependency injection for the vLLM engine class. Defaults to VLLMEngine.

  • model_downloader – Dependency injection for the model downloader object. Defaults to an instance of LoraModelLoader.
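
A minimal sketch of how the constructor is typically exercised: in practice the async __init__ is awaited by Ray Serve when a replica starts rather than called directly, and the server is bound as a deployment from an LLMConfig. The model ID, model source, and engine_kwargs below are placeholders, and the as_deployment / get_serve_options pattern is the usual Serve-side entry point rather than part of this constructor's signature.

```python
from ray import serve
from ray.serve.llm import LLMConfig, LLMServer

# Placeholder model configuration; llm_config is the only public argument
# to LLMServer.__init__.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",
        model_source="Qwen/Qwen2.5-0.5B-Instruct",
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=1),
    ),
)

# Ray Serve awaits the async constructor when the replica starts; the
# private engine_cls and model_downloader arguments keep their defaults
# (VLLMEngine and LoraModelLoader, respectively).
deployment = LLMServer.as_deployment(
    llm_config.get_serve_options(name_prefix="vLLM:")
).bind(llm_config)

serve.run(deployment)
```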