ray.data.DatasetPipeline.schema
- DatasetPipeline.schema(fetch_if_missing: bool = False) -> Union[type, pyarrow.lib.Schema]
Return the schema of the dataset pipeline.
For datasets of Arrow records, this returns the Arrow schema. For datasets of Python objects, this returns their Python type.
Note: This method is intended for peeking at the schema before the DatasetPipeline starts executing. If execution has already started, it simply returns the schema cached by the previous call.
Time complexity: O(1)
- Parameters
fetch_if_missing – If True, synchronously fetch the schema if it’s not known. Default is False, where None is returned if the schema is not known.
- Returns
The Python type or Arrow schema of the records, or None if the schema is not known.
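The three possible return values described above (None, a Python type, or an Arrow schema) can be handled as in the following minimal sketch. The helper name `describe_schema` is hypothetical and not part of Ray. It assumes only that the object passed in exposes `schema(fetch_if_missing=...)` as documented here, and that an Arrow schema exposes its field names through a `names` attribute, as `pyarrow.Schema` does.

```python
from typing import Any


def describe_schema(pipe: Any) -> str:
    """Describe a pipeline's schema in one line (hypothetical helper, not a Ray API)."""
    # With the default fetch_if_missing=False, the call returns None
    # when the schema is not yet known.
    schema = pipe.schema()
    if schema is None:
        # Ask for a synchronous fetch only when the cheap call found nothing.
        schema = pipe.schema(fetch_if_missing=True)
    if schema is None:
        return "unknown"
    if isinstance(schema, type):
        # Datasets of Python objects report their Python type.
        return f"Python objects of type {schema.__name__}"
    # Otherwise assume an Arrow schema, which lists its field names in `names`.
    return f"Arrow schema with fields {list(schema.names)}"
```

Assuming a Ray version that still ships DatasetPipeline, a pipeline created with something like `ray.data.range(8).window(blocks_per_window=2)` could be passed to this helper before iterating over it. Calling `schema()` first without fetching keeps the common case cheap and falls back to the synchronous fetch only when needed.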