ray.data.Dataset.count

Dataset.count() -> int

Count the number of records in the dataset.

Note

If this dataset consists of more than a read operation, or if the row count can’t be determined from the metadata provided by the datasource, this operation triggers execution of the lazy transformations on this dataset and blocks until execution completes.

Time complexity: O(dataset size / parallelism) in general, O(1) for Parquet
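The two code paths described in the note can be pictured with a small toy model. This is only a sketch of the idea and not Ray's implementation: `ToyLazyDataset`, `metadata_rows`, and the `executed` flag are hypothetical names invented for illustration.

```python
class ToyLazyDataset:
    """Hypothetical stand-in for a lazy dataset; not part of Ray."""

    def __init__(self, read_fn, metadata_rows=None, transforms=None):
        self._read_fn = read_fn                # produces the rows when called
        self._metadata_rows = metadata_rows    # row count reported by the source, if any
        self._transforms = list(transforms or [])
        self.executed = False                  # set once the plan has actually run

    def filter(self, predicate):
        # Record the transformation lazily; nothing runs yet.
        return ToyLazyDataset(
            self._read_fn, self._metadata_rows, self._transforms + [predicate]
        )

    def count(self):
        # Fast path: the plan is only a read and the source reported a row count.
        if not self._transforms and self._metadata_rows is not None:
            return self._metadata_rows
        # Slow path: run the read and every recorded transformation.
        self.executed = True
        rows = self._read_fn()
        for predicate in self._transforms:
            rows = [r for r in rows if predicate(r)]
        return len(rows)


source = ToyLazyDataset(lambda: list(range(10)), metadata_rows=10)
fast = source.count()                          # answered from metadata

filtered = source.filter(lambda r: r % 2 == 0)
slow = filtered.count()                        # has to execute the plan
```

In this toy, the first call returns without running the read, while the second has to run the read and the filter before it can report a count.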

Returns

The number of records in the dataset.