ray.data.Dataset.schema

Dataset.schema(fetch_if_missing: bool = True) → Union[type, pyarrow.lib.Schema]

Return the schema of the dataset.

For datasets of Arrow records, this returns the Arrow schema. For datasets of Python objects, it returns their Python type.

Note

If the schema is not already known (for example, because this dataset consists of more than a read, or because the schema can’t be determined from the metadata provided by the datasource) and fetch_if_missing=True (the default), then this operation triggers execution of the lazy transformations performed on this dataset and blocks until execution completes.

Time complexity: O(1)

Parameters

fetch_if_missing – If True, synchronously fetch the schema if it’s not known. If False, None is returned if the schema is not known. Default is True.

Returns

The Python type or Arrow schema of the records, or None if the schema is not known and fetch_if_missing is False.
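
The three possible return values (None, a Python type, or an Arrow schema) can be handled as in the minimal sketch below. The helper `describe_schema` and the function `demo` are hypothetical names introduced here for illustration and are not part of Ray. The sketch assumes only that the dataset object exposes `schema(fetch_if_missing=...)` as documented above and that `ray.data.from_items` is available; the exact schema object printed may differ between Ray versions.

```python
from typing import Any


def describe_schema(ds: Any, fetch_if_missing: bool = True) -> str:
    """Return a short, human-readable description of a dataset's schema.

    `ds` is assumed to expose `schema(fetch_if_missing=...)` as documented
    above. This helper is illustrative and not part of Ray.
    """
    schema = ds.schema(fetch_if_missing=fetch_if_missing)
    if schema is None:
        # Only possible when fetch_if_missing=False and the schema is unknown.
        return "schema unknown (not fetched)"
    if isinstance(schema, type):
        # Datasets of plain Python objects report their Python type.
        return f"Python objects of type {schema.__name__}"
    # Otherwise treat the result as a tabular (Arrow) schema object.
    return f"tabular schema: {schema}"


def demo() -> None:
    """Run the helper against a small Ray dataset (requires Ray installed)."""
    import ray  # imported lazily so the helper above has no hard dependency

    ds = ray.data.from_items([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
    # Does not force execution; may report an unknown schema.
    print(describe_schema(ds, fetch_if_missing=False))
    # May trigger execution and block until the schema is available.
    print(describe_schema(ds))
```

Calling `demo()` in an environment with Ray installed prints one description per call. Passing `fetch_if_missing=False` is useful when inspecting a dataset should not start execution, at the cost of having to handle a None result.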