ray.data.Dataset.to_spark
- Dataset.to_spark(spark: pyspark.sql.SparkSession) -> pyspark.sql.DataFrame
Convert this Dataset into a Spark DataFrame.
Note
This operation will trigger execution of the lazy transformations performed on this dataset.
Time complexity: O(dataset size / parallelism)
- Parameters:
spark – A SparkSession, which must be created by RayDP (Spark-on-Ray).
- Returns:
A Spark DataFrame created from this dataset.
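Example
A minimal sketch of the conversion, assuming Ray Data, RayDP, PySpark, and a Java runtime are installed. The helper names (`make_records`, `dataset_to_spark_row_count`), the sample rows, and the executor settings passed to `raydp.init_spark` are illustrative assumptions, not part of this API.

```python
def make_records(n):
    """Build n small rows to load into a Ray Dataset (illustrative data only)."""
    return [{"id": i, "value": i * 2} for i in range(n)]


def dataset_to_spark_row_count(n=100):
    """Convert a Ray Dataset to a Spark DataFrame and return its row count."""
    # Imported lazily so this module can be loaded without Ray or RayDP installed.
    import ray
    import raydp

    ray.init(ignore_reinit_error=True)
    # The session must come from RayDP; these resource settings are placeholders.
    spark = raydp.init_spark(
        app_name="to_spark_example",
        num_executors=1,
        executor_cores=1,
        executor_memory="1GB",
    )
    try:
        ds = ray.data.from_items(make_records(n))
        # Calling to_spark executes any pending lazy transformations on ds.
        df = ds.to_spark(spark)
        return df.count()
    finally:
        # Release the Spark executors and the local Ray runtime.
        raydp.stop_spark()
        ray.shutdown()
```

Calling `dataset_to_spark_row_count()` starts a local Ray runtime and a RayDP-backed Spark session, so it may take some time on first use. A SparkSession built directly with `SparkSession.builder` would not satisfy the requirement stated under Parameters.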