ray.data.llm.ProcessorConfig

pydantic model ray.data.llm.ProcessorConfig

The processor configuration.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

JSON schema:
{
   "title": "ProcessorConfig",
   "description": "The processor configuration.\n\n**PublicAPI (alpha):** This API is in alpha and may change before becoming stable.",
   "type": "object",
   "properties": {
      "batch_size": {
         "description": "Large batch sizes are likely to saturate the compute resources and could achieve higher throughput. On the other hand, small batch sizes are more fault-tolerant and could reduce bubbles in the data pipeline. You can tune the batch size to balance the throughput and fault-tolerance based on your use case.",
         "title": "Batch Size",
         "type": "integer"
      },
      "accelerator_type": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "The accelerator type used by the LLM stage in a processor. Default to None, meaning that only the CPU will be used.",
         "title": "Accelerator Type"
      },
      "concurrency": {
         "default": 1,
         "description": "The number of workers for data parallelism. Default to 1.",
         "title": "Concurrency",
         "type": "integer"
      }
   },
   "required": [
      "batch_size"
   ]
}
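To illustrate the field contract above (a required `batch_size`, an optional `accelerator_type` defaulting to `None`, and `concurrency` defaulting to 1), here is a minimal stdlib-only sketch. This is not the actual pydantic model from `ray.data.llm` — just an illustrative stand-in mirroring the schema's fields and defaults:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessorConfigSketch:
    """Illustrative stand-in mirroring the ProcessorConfig JSON schema."""

    # Required. Larger values trade fault tolerance for throughput.
    batch_size: int
    # Optional. None means the LLM stage runs on CPU only.
    accelerator_type: Optional[str] = None
    # Number of data-parallel workers.
    concurrency: int = 1


# Only batch_size must be supplied; the other fields fall back to defaults.
cfg = ProcessorConfigSketch(batch_size=64, concurrency=2)
```

With the real class, construction looks the same (`ProcessorConfig(batch_size=64, ...)`); since `validate_assignment` is enabled on the pydantic model, later assignments to its fields are also validated.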

Config:
  • arbitrary_types_allowed: bool = True

  • validate_assignment: bool = True

Fields: