ray.serve.llm.configs.ModelLoadingConfig

pydantic model ray.serve.llm.configs.ModelLoadingConfig

The configuration for loading an LLM model.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

field model_id: str [Required]

The ID that should be used by end users to access this model.

field model_source: str | S3MirrorConfig | GCSMirrorConfig | None = None

Where to obtain the model weights from. Should be a HuggingFace model ID, S3 mirror config, or GCS mirror config. When omitted, defaults to the model_id as a HuggingFace model ID.

field tokenizer_source: str | None = None

Where to obtain the tokenizer from. If None, the tokenizer is obtained from the model source. Currently, only HuggingFace IDs are supported.
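The fallback rules described for model_source and tokenizer_source can be illustrated with a small standalone sketch. It does not import Ray and is not how ModelLoadingConfig is implemented: the class LoadingSketch and its two resolved_* methods are hypothetical names introduced here, and the S3/GCS mirror options are left out so that only the string case is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadingSketch:
    """Hypothetical stand-in for ModelLoadingConfig (string sources only)."""

    model_id: str
    model_source: Optional[str] = None
    tokenizer_source: Optional[str] = None

    def resolved_model_source(self) -> str:
        # Documented rule: an omitted model_source defaults to model_id,
        # treated as a HuggingFace model ID.
        if self.model_source is not None:
            return self.model_source
        return self.model_id

    def resolved_tokenizer_source(self) -> str:
        # Documented rule: a None tokenizer_source means the tokenizer
        # is obtained from the model source.
        if self.tokenizer_source is not None:
            return self.tokenizer_source
        return self.resolved_model_source()


# Only model_id given: both sources fall back to it.
only_id = LoadingSketch(model_id="my-org/my-model")
print(only_id.resolved_model_source())
print(only_id.resolved_tokenizer_source())

# A user-facing alias served from a different (placeholder) HuggingFace ID.
aliased = LoadingSketch(model_id="chat-model", model_source="my-org/my-model")
print(aliased.resolved_tokenizer_source())
```

In this sketch, model_id doubles as the weights location only when model_source is omitted, which is why a short alias can be exposed to end users while the weights come from elsewhere.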

classmethod parse_yaml(file, **kwargs) → ModelT