ray.data.grouped_data.GroupedData.with_column
- GroupedData.with_column(column_name: str, expr: Expr, **ray_remote_args) → Dataset
Add a new column to each group using an expression.
The supplied expression is evaluated against every row in each group, and the resulting column is appended to the group’s records. The output dataset preserves the original rows and columns.
Examples
>>> import ray
>>> from ray.data.expressions import col
>>> ds = (
...     ray.data.from_items([{"group": 1, "value": 1}, {"group": 1, "value": 2}])
...     .groupby("group")
...     .with_column("value_twice", col("value") * 2)
...     .sort(["group", "value"])
... )
>>> ds.take_all()
[{'group': 1, 'value': 1, 'value_twice': 2}, {'group': 1, 'value': 2, 'value_twice': 4}]
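For intuition only, the row-wise semantics described above can be modeled in plain Python without Ray. This is a sketch of the behavior, not how Ray implements the method: the helper `grouped_with_column` and its `fn` callback are hypothetical stand-ins for `GroupedData.with_column` and `expr`.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def grouped_with_column(
    rows: List[Row],
    key: str,
    column_name: str,
    fn: Callable[[Row], Any],
) -> List[Row]:
    """Hypothetical plain-Python model of a grouped, row-wise column addition."""
    # Bucket the rows by the grouping key.
    groups: Dict[Any, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    # Evaluate `fn` on every row of every group and append the result as a
    # new column. Existing rows and columns are kept; inputs are not mutated.
    out: List[Row] = []
    for group_rows in groups.values():
        for row in group_rows:
            out.append({**row, column_name: fn(row)})
    return out


rows = [{"group": 1, "value": 1}, {"group": 1, "value": 2}]
result = grouped_with_column(rows, "group", "value_twice", lambda r: r["value"] * 2)
result.sort(key=lambda r: (r["group"], r["value"]))
print(result)
```

The model returns one output row per input row, each carrying the original columns plus the new one, which mirrors the result of the Ray example above.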
- Parameters:
column_name – Name of the column to add.
expr – Expression that yields the values for the new column.
**ray_remote_args – Additional resource requirements to request from Ray for the underlying map tasks (for example, num_gpus=1).
- Returns:
A new Dataset containing all existing columns plus the newly computed column.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.