ray.data.llm.ProcessorConfig
- class ray.data.llm.ProcessorConfig(*, batch_size: int = 32, resources_per_bundle: Dict[str, float] | None = None, accelerator_type: str | None = None, concurrency: int | Tuple[int, int] = 1, experimental: Dict[str, Any] = <factory>)
The processor configuration.
- Parameters:
batch_size – The batch size for the processor. Larger batch sizes are more likely to saturate compute resources and can achieve higher throughput; smaller batch sizes are more fault-tolerant and can reduce bubbles in the data pipeline. Tune the batch size to balance throughput and fault tolerance for your use case.
resources_per_bundle – The resource bundles for placement groups. You can specify a custom device label, e.g., {'NPU': 1}. The default resource bundle for the LLM stage is always one GPU, i.e., {'GPU': 1}.
accelerator_type – The accelerator type used by the LLM stage in a processor. Defaults to None, meaning that only the CPU is used.
concurrency – The number of workers for data parallelism. Defaults to 1. If concurrency is a tuple (m, n), Ray creates an autoscaling actor pool that scales between m and n workers (1 <= m <= n). If concurrency is an int n, Ray uses either a fixed pool of n workers or an autoscaling pool from 1 to n workers, depending on the processor and stage. See the example below.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
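The following is a minimal sketch of constructing a ProcessorConfig with a fixed worker pool and with an autoscaling pool. The specific field values, including the "L4" accelerator label, are illustrative assumptions rather than recommendations.

```python
from ray.data.llm import ProcessorConfig

# Fixed pool of 4 workers, 64 rows per batch.
# The accelerator label "L4" is an assumption; use a type available
# in your cluster, or leave accelerator_type as None to run on CPU.
fixed = ProcessorConfig(
    batch_size=64,
    accelerator_type="L4",
    concurrency=4,
)

# Autoscaling pool: Ray scales the actor pool between 2 and 8 workers.
autoscaling = ProcessorConfig(
    batch_size=32,
    concurrency=(2, 8),
)
```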
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
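As a minimal sketch of what these settings imply, assuming standard pydantic v2 semantics: extra='forbid' rejects unknown field names, and validate_assignment=True re-validates attributes when they are assigned.

```python
from pydantic import ValidationError
from ray.data.llm import ProcessorConfig

# extra='forbid': a misspelled field raises instead of being silently ignored.
try:
    ProcessorConfig(batch_sz=64)  # typo for batch_size
except ValidationError as err:
    print(err)

# validate_assignment=True: assignments are type-checked as well.
config = ProcessorConfig()
try:
    config.batch_size = "many"  # not coercible to int
except ValidationError as err:
    print(err)
```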