ray.data.Dataset.with_column

Dataset.with_column(column_name: str, expr: Expr, **ray_remote_args) → Dataset

Add a new column to the dataset via an expression.

Examples

>>> import ray
>>> from ray.data.expressions import col
>>> ds = ray.data.range(100)
>>> ds.with_column("id_2", (col("id") * 2)).schema()
Column  Type
------  ----
id      int64
id_2    int64

Parameters:
  • column_name – The name of the new column.

  • expr – An expression that defines the new column values.

  • **ray_remote_args – Additional resource requirements to request from Ray (e.g., num_gpus=1 to request GPUs for the map tasks). See ray.remote() for details.

Returns:

A new dataset that contains the added column, with its values computed by evaluating the expression.
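As a rough mental model of what "evaluated via the expression" means, the following pure-Python sketch builds a tiny expression tree and evaluates it against one in-memory batch of columns. Everything defined here (`Expr`, `Col`, `Lit`, `BinaryExpr`, `_wrap`, and the free function `with_column`) is a hypothetical stand-in written for illustration. It is not Ray's implementation and does not use Ray's `Expr` class; it only mirrors the documented behavior that the input is left unchanged and a new result carrying the extra column is returned.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A batch is modeled as a mapping from column name to a list of values.
Batch = Dict[str, List[Any]]


class Expr:
    """Hypothetical expression base class (illustration only, not Ray's Expr)."""

    def eval(self, batch: Batch) -> List[Any]:
        raise NotImplementedError

    # Operators build a larger expression tree instead of computing a value.
    def __mul__(self, other: Any) -> "Expr":
        return BinaryExpr(operator.mul, self, _wrap(other))

    def __add__(self, other: Any) -> "Expr":
        return BinaryExpr(operator.add, self, _wrap(other))


@dataclass
class Col(Expr):
    """Reference to an existing column by name."""
    name: str

    def eval(self, batch: Batch) -> List[Any]:
        return list(batch[self.name])


@dataclass
class Lit(Expr):
    """Constant value, repeated once per row of the batch."""
    value: Any

    def eval(self, batch: Batch) -> List[Any]:
        num_rows = len(next(iter(batch.values()))) if batch else 0
        return [self.value] * num_rows


@dataclass
class BinaryExpr(Expr):
    """Element-wise binary operation over two sub-expressions."""
    op: Callable[[Any, Any], Any]
    left: Expr
    right: Expr

    def eval(self, batch: Batch) -> List[Any]:
        lhs = self.left.eval(batch)
        rhs = self.right.eval(batch)
        return [self.op(a, b) for a, b in zip(lhs, rhs)]


def _wrap(value: Any) -> Expr:
    """Turn plain Python values into literal expressions."""
    return value if isinstance(value, Expr) else Lit(value)


def with_column(batch: Batch, column_name: str, expr: Expr) -> Batch:
    """Return a new batch with `column_name` set to the evaluated expression."""
    new_batch = dict(batch)  # shallow copy, so the input batch keeps its columns
    new_batch[column_name] = expr.eval(batch)
    return new_batch


batch = {"id": [0, 1, 2, 3]}
result = with_column(batch, "id_2", Col("id") * 2)
print(result)  # {'id': [0, 1, 2, 3], 'id_2': [0, 2, 4, 6]}
print(batch)   # {'id': [0, 1, 2, 3]}
```

In this toy model, `Col("id") * 2` does no arithmetic when it is written; it only records the operation, and the values are produced when `eval` runs against a concrete batch. That deferred style is the reason an expression object, rather than a Python callable, is passed as `expr`.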

PublicAPI (alpha): This API is in alpha and may change before becoming stable.