ray.data.expressions.UDFExpr

class ray.data.expressions.UDFExpr(data_type: DataType, fn: Callable[..., BatchColumn], args: List[Expr], kwargs: Dict[str, Expr])

Bases: Expr

Expression that represents a user-defined function call.

This expression type pairs a UDF with its declared return type (data_type), so the expression system can infer the output schema and the UDF can be combined with other expressions.

UDFs operate on batches of data, where each column argument is passed as a PyArrow Array containing multiple values from that column across the batch.

Parameters:
  • data_type – Data type of the values the UDF returns

  • fn – The user-defined function to call

  • args – List of argument expressions (positional arguments)

  • kwargs – Dictionary of keyword argument expressions

  • function_name – Optional name for the function (for debugging)
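To make the roles of fn, args, and kwargs concrete, here is a toy model written with the standard library only. It is not Ray's implementation: ToyCol, ToyUDFExpr, scale, and the evaluate method are hypothetical names invented for this sketch, and plain Python lists stand in for PyArrow arrays. The assumption it illustrates is that each argument expression is first resolved to a whole column, and the function is then called once per batch with those columns.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToyCol:
    """Hypothetical stand-in for a column reference expression."""

    name: str

    def evaluate(self, batch: Dict[str, List[Any]]) -> List[Any]:
        return batch[self.name]


@dataclass
class ToyUDFExpr:
    """Hypothetical stand-in that mirrors the fn/args/kwargs layout."""

    fn: Callable[..., List[Any]]
    args: List[ToyCol]
    kwargs: Dict[str, ToyCol] = field(default_factory=dict)

    def evaluate(self, batch: Dict[str, List[Any]]) -> List[Any]:
        # Resolve each argument expression to a whole column first,
        # then call the function once with those columns.
        positional = [arg.evaluate(batch) for arg in self.args]
        keyword = {key: arg.evaluate(batch) for key, arg in self.kwargs.items()}
        return self.fn(*positional, **keyword)


def scale(values: List[int], factor: List[int]) -> List[int]:
    # Element-wise product of two columns of equal length.
    return [v * f for v, f in zip(values, factor)]


batch = {"value": [1, 2, 3], "weight": [10, 10, 10]}
expr = ToyUDFExpr(fn=scale, args=[ToyCol("value")], kwargs={"factor": ToyCol("weight")})
scaled = expr.evaluate(batch)
print(scaled)
```

Keeping the arguments as expressions rather than as values is what allows a UDF call to be nested inside a larger expression tree and evaluated later, batch by batch.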

Example

>>> from ray.data.datatype import DataType
>>> from ray.data.expressions import col, udf
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>>
>>> @udf(return_dtype=DataType.int32())
... def add_one(x: pa.Array) -> pa.Array:
...     return pc.add(x, 1)
>>>
>>> # Use in expressions
>>> expr = add_one(col("value"))

DeveloperAPI: This API may change across minor Ray releases.

Methods

  • is_in – Check if the expression value is in a list of values.

  • is_not_null – Check if the expression value is not null.

  • is_null – Check if the expression value is null.

  • not_in – Check if the expression value is not in a list of values.

Attributes

  • fn

  • args

  • kwargs

  • data_type