ray.serve.llm.configs.ModelLoadingConfig
- pydantic model ray.serve.llm.configs.ModelLoadingConfig
The configuration for loading an LLM model.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- field model_source: str | S3MirrorConfig | GCSMirrorConfig | None = None
Where to obtain the model weights from. Should be a HuggingFace model ID, an S3 mirror config, or a GCS mirror config. When omitted, the model_id is used as the HuggingFace model ID.
- field tokenizer_source: str | None = None
Where to obtain the tokenizer from. If None, the tokenizer is obtained from the model source. Only HuggingFace IDs are currently supported.
- classmethod parse_yaml(file, **kwargs) → ModelT
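The fallback rules described for model_source and tokenizer_source can be sketched in plain Python. This is an illustration of the documented defaults only, not Ray's implementation: `MirrorConfigSketch`, its `bucket_uri` field, and `resolve_sources` are hypothetical names introduced here, and the sketch assumes the model_id is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class MirrorConfigSketch:
    """Hypothetical stand-in for an S3 or GCS mirror config (not the Ray class)."""

    bucket_uri: str


# Hypothetical alias mirroring the documented type of model_source.
ModelSource = Union[str, MirrorConfigSketch]


def resolve_sources(
    model_id: str,
    model_source: Optional[ModelSource] = None,
    tokenizer_source: Optional[str] = None,
) -> Tuple[ModelSource, ModelSource]:
    """Apply the documented defaults and return (model source, tokenizer source)."""
    # When model_source is omitted, the model_id is treated as a HuggingFace model ID.
    resolved_model = model_source if model_source is not None else model_id
    # When tokenizer_source is None, the tokenizer comes from the model source.
    resolved_tokenizer = (
        tokenizer_source if tokenizer_source is not None else resolved_model
    )
    return resolved_model, resolved_tokenizer
```

Under these assumptions, calling `resolve_sources("org/model")` resolves both the weights and the tokenizer to the same HuggingFace ID, while passing a mirror config together with an explicit tokenizer_source keeps the two sources separate.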