ray.data.Dataset.rename_columns
- Dataset.rename_columns(names: List[str] | Dict[str, str], *, concurrency: int | Tuple[int, int] | None = None, **ray_remote_args)
Rename columns in the dataset.
Examples
>>> import ray
>>> ds = ray.data.read_parquet("s3://anonymous@ray-example-data/iris.parquet")
>>> ds.schema()
Column        Type
------        ----
sepal.length  double
sepal.width   double
petal.length  double
petal.width   double
variety       string
You can pass a dictionary mapping old column names to new column names.
>>> ds.rename_columns({"variety": "category"}).schema()
Column        Type
------        ----
sepal.length  double
sepal.width   double
petal.length  double
petal.width   double
category      string
Or you can pass a list of new column names, one for each existing column in schema order.
>>> ds.rename_columns(
...     ["sepal_length", "sepal_width", "petal_length", "petal_width", "variety"]
... ).schema()
Column        Type
------        ----
sepal_length  double
sepal_width   double
petal_length  double
petal_width   double
variety       string
- Parameters:
names – A dictionary that maps old column names to new column names, or a list of new column names.
concurrency – The maximum number of Ray workers to use concurrently.
ray_remote_args – Additional resource requirements to request from Ray (e.g., num_gpus=1 to request GPUs for the map tasks). See ray.remote() for details.
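The two forms accepted by names can be illustrated without a Ray cluster. The following is a minimal pure-Python sketch of the renaming semantics, not Ray's implementation: resolve_renames is a hypothetical helper introduced only for this illustration, and the requirement that a list supply exactly one name per column is an assumption based on the list example above.

```python
from typing import Dict, List, Union


def resolve_renames(
    columns: List[str], names: Union[List[str], Dict[str, str]]
) -> List[str]:
    """Return the column names produced by applying `names` to `columns`.

    Hypothetical helper for illustration only; it is not part of Ray.
    """
    if isinstance(names, dict):
        # Dict form: rename only the listed columns, keep the rest unchanged.
        return [names.get(col, col) for col in columns]
    # List form: new names are matched to existing columns by position.
    # Assumption: the list must provide exactly one name per column.
    if len(names) != len(columns):
        raise ValueError(f"expected {len(columns)} names, got {len(names)}")
    return list(names)


iris_columns = ["sepal.length", "sepal.width", "petal.length", "petal.width", "variety"]

# Dict form: only "variety" changes.
renamed_by_dict = resolve_renames(iris_columns, {"variety": "category"})

# List form: every column gets a new name, in order.
renamed_by_list = resolve_renames(
    iris_columns,
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "variety"],
)
```

The dict form is usually the safer choice when only a few columns change, because it does not depend on column order.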