ray.data.expressions.UDFExpr

class ray.data.expressions.UDFExpr(data_type: DataType, fn: Callable[..., BatchColumn], args: List[Expr], kwargs: Dict[str, Expr])

Bases: Expr

Expression that represents a user-defined function call.

This expression type wraps a UDF with schema inference capabilities, allowing UDFs to be used seamlessly within the expression system.

UDFs operate on batches of data, where each column argument is passed as a PyArrow Array containing multiple values from that column across the batch.

Parameters:
  • fn – The user-defined function to call. For callable classes, this is a _CallableClassUDF instance that handles lazy instantiation internally.

  • args – List of argument expressions (positional arguments)

  • kwargs – Dictionary of keyword argument expressions
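
The relationship between fn, args, and kwargs can be pictured with a small standalone sketch. It is not Ray's implementation: `ColumnRef`, `UDFCall`, `Batch`, and `weighted` are hypothetical names, and plain Python lists stand in for PyArrow arrays. The idea is that each argument expression is first resolved to a column of the batch, and only then is the function called once with those columns.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical batch representation: column name -> list of values.
Batch = Dict[str, List[Any]]


@dataclass
class ColumnRef:
    """Hypothetical leaf expression that looks up one column in the batch."""

    name: str

    def evaluate(self, batch: Batch) -> List[Any]:
        return batch[self.name]


@dataclass
class UDFCall:
    """Hypothetical node that mirrors the fn / args / kwargs fields."""

    fn: Callable[..., List[Any]]
    args: List[ColumnRef]
    kwargs: Dict[str, ColumnRef] = field(default_factory=dict)

    def evaluate(self, batch: Batch) -> List[Any]:
        # Resolve every argument expression to a column, then call fn once.
        positional = [arg.evaluate(batch) for arg in self.args]
        keyword = {key: expr.evaluate(batch) for key, expr in self.kwargs.items()}
        return self.fn(*positional, **keyword)


def weighted(values: List[float], weight: List[float]) -> List[float]:
    # Element-wise product of two columns of equal length.
    return [v * w for v, w in zip(values, weight)]


batch = {"price": [10.0, 20.0], "qty": [2.0, 3.0]}
node = UDFCall(
    fn=weighted,
    args=[ColumnRef("price")],
    kwargs={"weight": ColumnRef("qty")},
)
result = node.evaluate(batch)
```

Keeping the arguments as expressions rather than values means the call can be described before any data is read and evaluated later, batch by batch.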

Example

>>> from ray.data.expressions import col, udf
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> from ray.data.datatype import DataType
>>>
>>> @udf(return_dtype=DataType.int32())
... def add_one(x: pa.Array) -> pa.Array:
...     return pc.add(x, 1)
>>>
>>> # Use in expressions
>>> expr = add_one(col("value"))
>>>
>>> # Callable class example
>>> @udf(return_dtype=DataType.int32())
... class AddOffset:
...     def __init__(self, offset=1):
...         self.offset = offset
...     def __call__(self, x: pa.Array) -> pa.Array:
...         return pc.add(x, self.offset)
>>>
>>> # Use callable class
>>> add_five = AddOffset(5)
>>> expr = add_five(col("value"))
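
The lazy instantiation mentioned for callable classes can be sketched in plain Python. This is an illustration under stated assumptions, not the actual `_CallableClassUDF` code: the `LazyCallable` wrapper is hypothetical, and the `AddOffset` variant below works on lists so the example runs without Ray or PyArrow. The wrapper stores the class and its constructor arguments, and builds the instance only when the first batch arrives.

```python
from typing import Any, Callable, List, Optional


class LazyCallable:
    """Hypothetical wrapper that defers constructing `cls` until the first call."""

    def __init__(self, cls: type, *ctor_args: Any, **ctor_kwargs: Any) -> None:
        self._cls = cls
        self._ctor_args = ctor_args
        self._ctor_kwargs = ctor_kwargs
        self._instance: Optional[Callable[..., Any]] = None

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Construct the wrapped object once, on first use, then reuse it.
        if self._instance is None:
            self._instance = self._cls(*self._ctor_args, **self._ctor_kwargs)
        return self._instance(*args, **kwargs)


class AddOffset:
    # Counter used only to show when construction happens.
    instances_created = 0

    def __init__(self, offset: int = 1) -> None:
        AddOffset.instances_created += 1
        self.offset = offset

    def __call__(self, values: List[int]) -> List[int]:
        return [v + self.offset for v in values]


add_five = LazyCallable(AddOffset, 5)
created_before_call = AddOffset.instances_created  # no instance built yet

first = add_five([1, 2, 3])  # the instance is constructed here
second = add_five([10])      # the same instance is reused
```

Deferring construction this way is useful when `__init__` is expensive (for example, loading a model), since the cost is paid where the function runs rather than where the expression is defined.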

DeveloperAPI: This API may change across minor Ray releases.

Methods

abs

Compute the absolute value of the expression.

alias

Rename the expression.

ceil

Round values up to the nearest integer.

exp

Compute the natural exponential of the expression.

floor

Round values down to the nearest integer.

is_in

Check if the expression value is in a list of values.

is_not_null

Check if the expression value is not null.

is_null

Check if the expression value is null.

ln

Compute the natural logarithm of the expression.

log10

Compute the base-10 logarithm of the expression.

log2

Compute the base-2 logarithm of the expression.

negate

Compute the negation of the expression.

not_in

Check if the expression value is not in a list of values.

power

Raise the expression to the given power.

round

Round values to the nearest integer using PyArrow semantics.

sign

Compute the sign of the expression.

to_pyarrow

Convert this Ray Data expression to a PyArrow compute expression.

trunc

Truncate fractional values toward zero.

Attributes

callable_class_spec

Return callable_class_spec if fn is a _CallableClassUDF, else None.

dt

Access datetime operations for this expression.

list

Access list operations for this expression.

name

Get the name associated with this expression.

str

Access string operations for this expression.

struct

Access struct operations for this expression.

fn

args

kwargs

data_type