ray.data.expressions.UDFExpr

class ray.data.expressions.UDFExpr(data_type: DataType, fn: Callable[..., BatchColumn], args: List[Expr], kwargs: Dict[str, Expr])

Bases: Expr

Expression that represents a user-defined function call.

This expression type wraps a UDF with schema inference capabilities, allowing UDFs to be used seamlessly within the expression system.

UDFs operate on batches of data, where each column argument is passed as a PyArrow Array containing multiple values from that column across the batch.

Parameters:
  • data_type – The data type of the values returned by the UDF

  • fn – The user-defined function to call

  • args – List of argument expressions (positional arguments)

  • kwargs – Dictionary of keyword argument expressions

  • function_name – Optional name for the function (for debugging)

Example

>>> from ray.data.datatype import DataType
>>> from ray.data.expressions import col, udf
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>>
>>> @udf(return_dtype=DataType.int32())
... def add_one(x: pa.Array) -> pa.Array:
...     return pc.add(x, 1)
>>>
>>> # Use in expressions
>>> expr = add_one(col("value"))

DeveloperAPI: This API may change across minor Ray releases.

Methods

alias

Rename the expression.

is_in

Check if the expression value is in a list of values.

is_not_null

Check if the expression value is not null.

is_null

Check if the expression value is null.

not_in

Check if the expression value is not in a list of values.

to_pyarrow

Convert this Ray Data expression to a PyArrow compute expression.

Attributes

list

Access list operations for this expression.

name

Get the name associated with this expression.

str

Access string operations for this expression.

struct

Access struct operations for this expression.

fn

args

kwargs

data_type