ray.data.DatasetPipeline.schema

DatasetPipeline.schema(fetch_if_missing: bool = False) → Union[type, pyarrow.lib.Schema]

Return the schema of the dataset pipeline.

For datasets of Arrow records, this returns the Arrow schema. For datasets of Python objects, this returns their Python type.

Note: This method is intended for peeking at the schema before the DatasetPipeline executes. If execution has already started, it simply returns the schema cached by the previous call.

Time complexity: O(1)

Parameters

fetch_if_missing – If True, synchronously fetch the schema if it’s not known. Default is False, where None is returned if the schema is not known.

Returns

The Python type or Arrow schema of the records, or None if the schema is not known.
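
Because the result can be a Python type, an Arrow schema, or None, callers generally need to branch on it. The following is a minimal sketch of one way to do that. `describe_schema` is a hypothetical helper written for illustration and is not part of Ray; it assumes only that an Arrow schema exposes its column names through a `names` attribute, as `pyarrow.Schema` does.

```python
from typing import Any, List, Optional


def describe_schema(schema: Any) -> str:
    """Summarize a value returned by DatasetPipeline.schema() (hypothetical helper)."""
    if schema is None:
        # The schema is not known, e.g. fetch_if_missing=False and nothing is cached.
        return "unknown"
    if isinstance(schema, type):
        # Pipeline of plain Python objects: the schema is their Python type.
        return f"python:{schema.__name__}"
    # Otherwise assume an Arrow schema, which lists its column names in `.names`.
    names: Optional[List[str]] = getattr(schema, "names", None)
    if names is None:
        # Unexpected value; report its type rather than guessing.
        return f"other:{type(schema).__name__}"
    return "arrow:" + ",".join(names)
```

Given an existing pipeline `pipe` (assumed here, not constructed), the helper could be called as `describe_schema(pipe.schema(fetch_if_missing=True))`. Passing `fetch_if_missing=True` asks for a synchronous fetch when the schema is not yet known. The None branch is still worth keeping for calls that use the default.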