ray.data.Dataset.random_sample
- Dataset.random_sample(fraction: float, *, seed: int | None = None) -> Dataset
Returns a new Dataset containing a random fraction of the rows.
Note
This method returns roughly fraction * total_rows rows. An exact number of rows isn’t guaranteed.
Examples
>>> import ray
>>> ds1 = ray.data.range(100)
>>> ds1.random_sample(0.1).count()
10
>>> ds2 = ray.data.range(1000)
>>> ds2.random_sample(0.123, seed=42).take(2)
[{'id': 2}, {'id': 9}]
>>> ds2.random_sample(0.123, seed=42).take(2)
[{'id': 2}, {'id': 9}]
- Parameters:
fraction – The fraction of rows to sample, between 0 and 1.
seed – Seeds the Python pseudorandom number generator (PRNG) so that repeated calls with the same seed sample the same rows.
- Returns:
A Dataset containing the sampled rows.
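As a rough illustration of the note and the seed parameter above, the sketch below uses plain Python rather than Ray. It assumes each row is kept independently with probability fraction, which is one common way to get a sample of "roughly" fraction * total_rows rows; this is an assumption for illustration, not a statement of how Ray implements the method. The helper name bernoulli_sample is hypothetical.

```python
import random


def bernoulli_sample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`.

    Hypothetical helper for illustration only; not Ray's implementation.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # seed=None falls back to OS-provided entropy
    return [row for row in rows if rng.random() < fraction]


rows = list(range(1000))

# The sample size varies around fraction * len(rows) from seed to seed,
# so an exact row count is not guaranteed.
counts = [len(bernoulli_sample(rows, 0.1, seed=s)) for s in range(20)]
print(min(counts), max(counts))

# Reusing a seed reproduces exactly the same rows.
first = bernoulli_sample(rows, 0.123, seed=42)
second = bernoulli_sample(rows, 0.123, seed=42)
print(first == second)  # True
```

If an exact number of rows is needed, a separate step after sampling (for example, limiting the result to a fixed count) would be required; the sketch above only shows why the count fluctuates.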