Data types

Class

class ray.data.datatype.DataType(_internal_type: pyarrow.DataType | numpy.dtype | type)

A simplified Ray Data DataType supporting Arrow, NumPy, and Python types.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

to_arrow_dtype(values: List[Any] | None = None) -> pyarrow.DataType

Convert the DataType to a PyArrow DataType.

Parameters:

values – Optional list of values from which to infer the Arrow type. Required if the DataType wraps a Python type.

Returns:

A PyArrow DataType

classmethod from_arrow(arrow_type: pyarrow.DataType) -> DataType

Create a DataType from a PyArrow DataType.

Parameters:

arrow_type – A PyArrow DataType to wrap

Returns:

A DataType wrapping the given PyArrow type

Return type:

DataType

Examples

>>> import pyarrow as pa
>>> from ray.data.datatype import DataType
>>> DataType.from_arrow(pa.timestamp('s'))
DataType(arrow:timestamp[s])
>>> DataType.from_arrow(pa.int64())
DataType(arrow:int64)

classmethod from_numpy(numpy_dtype: numpy.dtype | str) -> DataType

Create a DataType from a NumPy dtype.

Parameters:

numpy_dtype – A NumPy dtype object or string representation

Returns:

A DataType wrapping the given NumPy dtype

Return type:

DataType

Examples

>>> import numpy as np
>>> from ray.data.datatype import DataType
>>> DataType.from_numpy(np.dtype('int32'))
DataType(numpy:int32)
>>> DataType.from_numpy('float64')
DataType(numpy:float64)

classmethod infer_dtype(value: Any) -> DataType

Infer a DataType from a Python value, handling NumPy, Arrow, and Python types.

Parameters:

value – Any Python value to infer the type from

Returns:

The inferred data type

Return type:

DataType

Examples

>>> import numpy as np
>>> from ray.data.datatype import DataType
>>> DataType.infer_dtype(5)
DataType(arrow:int64)
>>> DataType.infer_dtype("hello")
DataType(arrow:string)
>>> DataType.infer_dtype(np.int32(42))
DataType(numpy:int32)

classmethod binary()

Create a DataType representing variable-length binary data.

Returns:

A DataType with PyArrow binary type

Return type:

DataType

classmethod bool()

Create a DataType representing a boolean value.

Returns:

A DataType with PyArrow bool type

Return type:

DataType

classmethod float32()

Create a DataType representing a 32-bit floating point number.

Returns:

A DataType with PyArrow float32 type

Return type:

DataType

classmethod float64()

Create a DataType representing a 64-bit floating point number.

Returns:

A DataType with PyArrow float64 type

Return type:

DataType
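The practical difference between the two floating point types is size and precision. The sketch below uses only Python's standard `struct` module (not Ray or PyArrow) to show what happens when a value is stored in 32 bits versus 64 bits; `roundtrip` is a hypothetical helper introduced for this illustration.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Pack a Python float into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
as_float32 = roundtrip(value, "<f")  # stored in 4 bytes
as_float64 = roundtrip(value, "<d")  # stored in 8 bytes

# 0.1 is rounded when squeezed into 32 bits, so it no longer compares equal.
print(as_float32 == value)
# Python floats are already 64-bit, so the round trip is lossless.
print(as_float64 == value)
```

Choosing float32 halves the storage per value at the cost of this rounding, which is often acceptable for model features but not for values that must be compared exactly.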

classmethod int16()

Create a DataType representing a 16-bit signed integer.

Returns:

A DataType with PyArrow int16 type

Return type:

DataType

classmethod int32()

Create a DataType representing a 32-bit signed integer.

Returns:

A DataType with PyArrow int32 type

Return type:

DataType

classmethod int64()

Create a DataType representing a 64-bit signed integer.

Returns:

A DataType with PyArrow int64 type

Return type:

DataType

classmethod int8()

Create a DataType representing an 8-bit signed integer.

Returns:

A DataType with PyArrow int8 type

Return type:

DataType
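When picking among the signed integer types, the deciding factor is usually the range of values each width can hold. The following standard-library sketch computes those ranges for two's-complement integers; `signed_range` is a hypothetical helper, not part of Ray Data.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Return the (min, max) values of a two's-complement signed integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (8, 16, 32, 64):
    low, high = signed_range(bits)
    print(f"int{bits}: {low} to {high}")
```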

classmethod string()

Create a DataType representing a variable-length string.

Returns:

A DataType with PyArrow string type

Return type:

DataType
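The string type here and the binary type described earlier both hold variable-length data; the difference is that Arrow strings are UTF-8 text, while binary values are arbitrary bytes. The plain-Python sketch below illustrates that distinction only. How Ray Data maps particular Python values to each type is not covered by it.

```python
text = "na\u00efve"        # str: five Unicode characters ("naive" with a diaeresis)
raw = text.encode("utf-8")  # bytes: the UTF-8 encoding of that text

print(len(text))  # number of characters
print(len(raw))   # number of bytes; the accented letter takes two bytes in UTF-8

# Arbitrary bytes need not be valid text, which is why they call for a binary type.
blob = bytes([0xFF, 0xFE, 0x00])
try:
    blob.decode("utf-8")
    is_valid_text = True
except UnicodeDecodeError:
    is_valid_text = False
print(is_valid_text)
```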

classmethod uint16()

Create a DataType representing a 16-bit unsigned integer.

Returns:

A DataType with PyArrow uint16 type

Return type:

DataType

classmethod uint32()

Create a DataType representing a 32-bit unsigned integer.

Returns:

A DataType with PyArrow uint32 type

Return type:

DataType

classmethod uint64()

Create a DataType representing a 64-bit unsigned integer.

Returns:

A DataType with PyArrow uint64 type

Return type:

DataType

classmethod uint8()

Create a DataType representing an 8-bit unsigned integer.

Returns:

A DataType with PyArrow uint8 type

Return type:

DataType