ray.data.Dataset.count

Dataset.count() -> int

Count the number of records in the dataset.

Note

If this dataset consists of more than a read operation (that is, transformations have been applied to it), or if the row count can’t be determined from the metadata provided by the datasource, then this operation triggers execution of the lazy transformations applied to this dataset.

Time complexity: O(dataset size / parallelism) in general; O(1) for Parquet, where the row count is available from file metadata.

Examples

>>> import ray
>>> ds = ray.data.range(10)
>>> ds.count()
10
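
The note above distinguishes two paths: a count answered from datasource metadata, and a count that requires executing the lazy transformations. The following is a minimal pure-Python sketch of that decision, not Ray's implementation. The class `ToyDataset` and its fields `block_row_counts`, `transforms`, and `executed` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a lazy dataset; NOT Ray's implementation."""

    blocks: List[List[int]]
    # Per-block row counts as a datasource might report them; None if unknown.
    block_row_counts: Optional[List[int]] = None
    # Pending lazy transformations, each mapping an iterable of rows to another.
    transforms: List[Callable[[Iterable[int]], Iterable[int]]] = field(
        default_factory=list
    )
    # Records whether count() had to run the pending transformations.
    executed: bool = False

    def filter(self, pred: Callable[[int], bool]) -> "ToyDataset":
        # A filter changes the row count, so the metadata no longer applies.
        lazy_filter = lambda rows: (r for r in rows if pred(r))
        return ToyDataset(self.blocks, None, self.transforms + [lazy_filter])

    def count(self) -> int:
        # Fast path: only a read, and the datasource reported row counts.
        if not self.transforms and self.block_row_counts is not None:
            return sum(self.block_row_counts)
        # Slow path: execute the lazy transformations and count the rows.
        self.executed = True
        total = 0
        for block in self.blocks:
            rows: Iterable[int] = iter(block)
            for transform in self.transforms:
                rows = transform(rows)
            total += sum(1 for _ in rows)
        return total


ds = ToyDataset(blocks=[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]], block_row_counts=[5, 5])
total_rows = ds.count()  # answered from metadata, nothing is executed

evens = ds.filter(lambda r: r % 2 == 0)
even_rows = evens.count()  # must run the filter to learn the count
```

In this sketch the first call returns without touching the rows, while the second has to iterate over every block, which mirrors the O(1) versus O(dataset size / parallelism) distinction described above.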
Returns:

The number of records in the dataset.