ray.data.Dataset.stats

Dataset.stats() -> str

Returns a string containing execution timing information.

This method does not trigger execution. If the dataset has not been executed yet, it returns an empty string.

Examples:

import ray

ds = ray.data.range(10)
assert ds.stats() == ""

ds = ds.materialize()
print(ds.stats())
Operator 0 Read: 1 tasks executed, 5 blocks produced in 0s
* Remote wall time: 16.29us min, 7.29ms max, 1.21ms mean, 24.17ms total
* Remote cpu time: 16.0us min, 2.54ms max, 810.45us mean, 16.21ms total
* Peak heap memory usage (MiB): 137968.75 min, 142734.38 max, 139846 mean
* Output num rows: 0 min, 1 max, 0 mean, 10 total
* Output size bytes: 0 min, 8 max, 4 mean, 80 total
* Tasks per node: 20 min, 20 max, 20 mean; 1 nodes used