ray.data.llm.HttpRequestProcessorConfig
- pydantic model ray.data.llm.HttpRequestProcessorConfig
The configuration for the HTTP request processor.
Examples
    import ray
    from ray.data.llm import HttpRequestProcessorConfig, build_llm_processor

    config = HttpRequestProcessorConfig(
        url="https://api.openai.com/v1/chat/completions",
        headers={"Authorization": "Bearer sk-..."},
        concurrency=1,
    )
    processor = build_llm_processor(
        config,
        preprocess=lambda row: dict(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a calculator"},
                {"role": "user", "content": f"{row['id']} ** 3 = ?"},
            ],
            temperature=0.3,
            max_tokens=20,
        ),
        postprocess=lambda row: dict(
            resp=row["choices"][0]["message"]["content"],
        ),
    )

    ds = ray.data.range(10)
    ds = processor(ds)
    for row in ds.take_all():
        print(row)
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
JSON schema:
{ "title": "HttpRequestProcessorConfig", "description": "The configuration for the HTTP request processor.\n\nExamples:\n .. testcode::\n :skipif: True\n\n import ray\n from ray.data.llm import HttpRequestProcessorConfig, build_llm_processor\n\n config = HttpRequestProcessorConfig(\n url=\"https://api.openai.com/v1/chat/completions\",\n headers={\"Authorization\": \"Bearer sk-...\"},\n concurrency=1,\n )\n processor = build_llm_processor(\n config,\n preprocess=lambda row: dict(\n model=\"gpt-4o-mini\",\n messages=[\n {\"role\": \"system\", \"content\": \"You are a calculator\"},\n {\"role\": \"user\", \"content\": f\"{row['id']} ** 3 = ?\"},\n ],\n temperature=0.3,\n max_tokens=20,\n ),\n postprocess=lambda row: dict(\n resp=row[\"choices\"][0][\"message\"][\"content\"],\n ),\n )\n\n ds = ray.data.range(10)\n ds = processor(ds)\n for row in ds.take_all():\n print(row)\n\n**PublicAPI (alpha):** This API is in alpha and may change before becoming stable.", "type": "object", "properties": { "batch_size": { "default": 64, "description": "The batch size.", "title": "Batch Size", "type": "integer" }, "accelerator_type": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The accelerator type used by the LLM stage in a processor. Default to None, meaning that only the CPU will be used.", "title": "Accelerator Type" }, "concurrency": { "default": 1, "description": "The number of workers for data parallelism. Default to 1.", "title": "Concurrency", "type": "integer" }, "url": { "description": "The URL to query.", "title": "Url", "type": "string" }, "headers": { "anyOf": [ { "type": "object" }, { "type": "null" } ], "default": null, "description": "The query header. Note that we will add 'Content-Type: application/json' to be the header for sure because we only deal with requests body in JSON.", "title": "Headers" }, "qps": { "anyOf": [ { "type": "integer" }, { "type": "null" } ], "default": null, "description": "The maximum number of requests per second to avoid rate limit. If None, the request will be sent sequentially.", "title": "Qps" } }, "required": [ "url" ] }
- Config:
arbitrary_types_allowed: bool = True
validate_assignment: bool = True
- Fields:
  - accelerator_type (str | None)
  - batch_size (int)
  - concurrency (int)
  - headers (dict | None)
  - qps (int | None)
  - url (str)
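Because the Config block sets validate_assignment: True, pydantic re-validates fields on assignment after construction. Continuing the sketch above (illustrative values):

    config.concurrency = 4   # OK: re-validated as an int
    config.qps = "fast"      # raises pydantic.ValidationError: not an int or None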