ray.serve.llm.configs.LoraConfig
- pydantic model ray.serve.llm.configs.LoraConfig
The configuration for loading an LLM model with LoRA.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- field download_timeout_s: float | None = 30.0
Maximum time, in seconds, that the download subprocess has to download a single LoRA adapter before timing out. None means no timeout.
- field dynamic_lora_loading_path: str | None = None
Cloud storage path where LoRA adapter weights are stored.
- field max_num_adapters_per_replica: PositiveInt = 16
The maximum number of adapters loaded on each replica.
- Constraints:
gt = 0
- classmethod parse_yaml(file, **kwargs) → ModelT
- validator validate_dynamic_lora_loading_path » dynamic_lora_loading_path
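The fields and defaults listed above can be illustrated with a small stand-in. This is a minimal sketch using only the standard library: `LoraConfigSketch` is a hypothetical dataclass that mirrors the documented field names, defaults, and the `gt = 0` constraint. It is not the real pydantic model, it does not reproduce `validate_dynamic_lora_loading_path` (whose behavior is not described here), and the bucket path is a made-up example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoraConfigSketch:
    """Hypothetical stand-in mirroring the documented fields; not the real class."""

    # Seconds allowed for downloading a single LoRA adapter; None means no timeout.
    download_timeout_s: Optional[float] = 30.0
    # Cloud storage path where LoRA adapter weights are stored.
    dynamic_lora_loading_path: Optional[str] = None
    # Maximum number of adapters loaded on each replica.
    max_num_adapters_per_replica: int = 16

    def __post_init__(self) -> None:
        # Mirrors the documented constraint gt = 0 (PositiveInt).
        if self.max_num_adapters_per_replica <= 0:
            raise ValueError("max_num_adapters_per_replica must be greater than 0")


# Defaults as documented: 30.0 second timeout, no path, 16 adapters.
defaults = LoraConfigSketch()

# Overriding every field; the path below is an assumed example value.
custom = LoraConfigSketch(
    dynamic_lora_loading_path="s3://example-bucket/lora-adapters",
    download_timeout_s=None,  # disable the download timeout
    max_num_adapters_per_replica=4,
)
```

With the real model, the same keyword arguments would presumably be passed to `LoraConfig(...)`, and a non-positive `max_num_adapters_per_replica` would be rejected by pydantic validation rather than by a hand-written check.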