ray.data.DatabricksUnityCatalog

class ray.data.DatabricksUnityCatalog(*, url: str | None = None, token: str | None = None, credential_provider: DatabricksCredentialProvider | None = None, region: str | None = None)

Bases: Catalog

Databricks Unity Catalog connector.

For Delta and Parquet tables, the connector uses Unity Catalog credential vending to obtain temporary, least-privilege cloud credentials. For Iceberg tables, it returns configuration that points PyIceberg at Unity Catalog’s Iceberg REST catalog endpoint.
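To make the Iceberg path more concrete, the following is a minimal sketch of the kind of property dictionary a PyIceberg REST catalog expects. It is not Ray's implementation. The helper name iceberg_rest_properties is hypothetical, and the REST path constant is an assumption that should be checked against the Databricks documentation for your workspace.

```python
from typing import Dict

# Assumption: path of Unity Catalog's Iceberg REST endpoint relative to the
# workspace URL. Verify this against the Databricks documentation.
ASSUMED_ICEBERG_REST_PATH = "/api/2.1/unity-catalog/iceberg-rest"


def iceberg_rest_properties(workspace_url: str, token: str, uc_catalog: str) -> Dict[str, str]:
    """Hypothetical helper: build PyIceberg REST catalog properties.

    The keys "type", "uri", "token", and "warehouse" are standard PyIceberg
    REST catalog properties; the values here are illustrative.
    """
    return {
        "type": "rest",
        # Strip a trailing slash so the joined URI has exactly one separator.
        "uri": workspace_url.rstrip("/") + ASSUMED_ICEBERG_REST_PATH,
        "token": token,
        # Assumption: the Unity Catalog catalog name is used as the warehouse.
        "warehouse": uc_catalog,
    }
```

A dictionary like this could be passed as keyword properties to pyiceberg.catalog.load_catalog, for example load_catalog("unity", **props).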

Parameters:
  • url – Databricks workspace URL (e.g. "https://dbc-XXXX.cloud.databricks.com"). Required unless credential_provider is given.

  • token – Databricks Personal Access Token with EXTERNAL USE SCHEMA permission. Required unless credential_provider is given.

  • credential_provider – A custom DatabricksCredentialProvider. If provided, url and token are ignored.

  • region – AWS region for S3 access (e.g. "us-west-2"). Required for AWS-backed tables; not needed for Azure/GCP.
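The interaction between url, token, and credential_provider described above can be sketched as a small validation routine. This is an illustration of the documented rules only, not Ray's code; the function name resolve_auth_mode and its return shape are hypothetical.

```python
from typing import Optional, Tuple


def resolve_auth_mode(
    url: Optional[str] = None,
    token: Optional[str] = None,
    credential_provider: Optional[object] = None,
) -> Tuple[str, Optional[str], Optional[str]]:
    """Hypothetical helper mirroring the documented parameter rules."""
    if credential_provider is not None:
        # Documented rule: url and token are ignored when a provider is given.
        return ("provider", None, None)
    if not url or not token:
        # Documented rule: both are required when no provider is given.
        raise ValueError(
            "url and token are required unless credential_provider is given"
        )
    return ("token", url, token)
```

Under these assumptions, passing only url (or only token) without a provider raises ValueError, while passing a provider succeeds regardless of the other two arguments.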

Example

>>> import ray
>>> catalog = ray.data.DatabricksUnityCatalog(
...     url="https://dbc-XXXX.cloud.databricks.com",
...     token="dapi...",
...     region="us-west-2",
... )
>>> ds = ray.data.read_delta(
...     "main.sales.transactions", catalog=catalog
... )

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

Methods

infer_format

Best-effort format hint from table metadata or file extension.
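As a rough illustration of what a best-effort format hint might look like, the sketch below prefers a format declared in table metadata and falls back to the file extension. It does not reproduce the actual infer_format signature or logic, which this page does not show. The name guess_format, the extension mapping, and the metadata-first ordering are all assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumption: an illustrative mapping, not the set of formats Ray recognizes.
_EXTENSION_TO_FORMAT = {".parquet": "parquet", ".csv": "csv", ".json": "json"}


def guess_format(metadata_format: Optional[str], path: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return a lowercase format hint, or None if unknown."""
    if metadata_format:
        # Prefer the format declared in table metadata when it is available.
        return metadata_format.lower()
    if path:
        # Fall back to the extension of the last path component.
        suffix = PurePosixPath(path).suffix.lower()
        return _EXTENSION_TO_FORMAT.get(suffix)
    return None
```

Returning None rather than raising keeps the hint best-effort: a caller can still fall back to an explicitly specified format.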

Attributes