ray.data.DatasetPipeline.count

DatasetPipeline.count() → int

Count the number of records in the dataset pipeline.

This call blocks until the entire pipeline has finished executing.

Time complexity: O(dataset size / parallelism)

Returns

The number of records in the dataset pipeline.
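
The behavior described above can be illustrated with a minimal pure-Python sketch. It does not use Ray and is not Ray's implementation: `make_windows` and `count_records` are hypothetical helpers standing in for a pipeline and its `count()` method. The sketch assumes a pipeline can be modeled as a lazy stream of windows, and it shows why the count is only available after every window has been processed, with the work spread across `parallelism` workers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List


def make_windows(num_records: int, window_size: int) -> Iterator[List[int]]:
    """Hypothetical stand-in for a pipeline: lazily yields windows of records."""
    for start in range(0, num_records, window_size):
        yield list(range(start, min(start + window_size, num_records)))


def count_records(windows: Iterable[List[int]], parallelism: int = 4) -> int:
    """Count records by visiting every window; returns only when all are done."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # Workers count windows concurrently; the per-window counts are summed.
        return sum(pool.map(len, windows))


windows = make_windows(num_records=1000, window_size=128)
total = count_records(windows)
print(total)  # 1000

# In this sketch the windows were consumed to produce the count,
# so nothing is left to iterate afterwards.
remaining = list(windows)
print(remaining)  # []
```

In this model, counting touches every record once, and the windows are divided among the workers, which is the intuition behind the O(dataset size / parallelism) cost stated above.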