ray.data.extensions.tensor_extension.ArrowVariableShapedTensorArray

class ray.data.extensions.tensor_extension.ArrowVariableShapedTensorArray

Bases: ray.air.util.tensor_extensions.arrow._ArrowTensorScalarIndexingMixin, pyarrow.lib.ExtensionArray

An array of heterogeneous-shaped, homogeneous-typed tensors.

This is the Arrow side of TensorArray for tensor elements that have differing shapes. Note that this extension supports only non-ragged tensor elements: each tensor element, considered in isolation, must have a well-defined shape. It also requires all tensor elements to have the same number of dimensions.
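The constraints above can be illustrated with a minimal NumPy-only sketch. The helper check_variable_shaped is hypothetical and not part of Ray; it only mirrors the documented requirements (one dtype, one number of dimensions, no ragged elements) and assumes that a ragged element shows up as an object-dtype ndarray.

```python
import numpy as np


def check_variable_shaped(tensors):
    """Hypothetical validator mirroring the documented constraints (not a Ray API)."""
    if len(tensors) == 0:
        raise ValueError("expected at least one tensor")
    for t in tensors:
        if not isinstance(t, np.ndarray):
            raise TypeError("each element must be a numpy.ndarray")
        # Assumption: an object-dtype element wraps ragged data with no single shape.
        if t.dtype == object:
            raise ValueError("ragged (object-dtype) elements are not supported")
    first = tensors[0]
    for t in tensors[1:]:
        if t.dtype != first.dtype:
            raise ValueError("all elements must share one dtype")
        if t.ndim != first.ndim:
            raise ValueError("all elements must have the same number of dimensions")
    return first.dtype, first.ndim


# Shapes differ, but dtype and number of dimensions match: accepted.
ok = [np.zeros((2, 2)), np.zeros((3, 5))]
dtype, ndim = check_variable_shaped(ok)
```

Passing a 2-D tensor together with a 1-D tensor, or mixing float64 with int32 elements, makes this sketch raise ValueError.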

See Arrow docs for customizing extension arrays: https://arrow.apache.org/docs/python/extending_types.html#custom-extension-array-class

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

OFFSET_DTYPE

alias of numpy.int32
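To show what an int32 offset type implies, here is a rough sketch of one way variable-shaped tensors could be packed into a single flat buffer. The layout (flat values, offsets, per-element shapes) and the helpers pack and unpack are assumptions made for illustration; the source does not describe the actual storage format.

```python
import numpy as np

OFFSET_DTYPE = np.int32  # same dtype as the documented alias


def pack(tensors):
    """Hypothetical helper: flatten tensors into one buffer plus offsets and shapes."""
    sizes = [t.size for t in tensors]
    offsets = np.zeros(len(tensors) + 1, dtype=OFFSET_DTYPE)
    # offsets[i] is where element i starts in the flat buffer.
    offsets[1:] = np.cumsum(sizes)
    values = np.concatenate([t.ravel() for t in tensors])
    shapes = [t.shape for t in tensors]
    return values, offsets, shapes


def unpack(values, offsets, shapes, i):
    """Hypothetical helper: rebuild element i from the flat buffer."""
    return values[offsets[i]:offsets[i + 1]].reshape(shapes[i])


tensors = [np.arange(6).reshape(2, 3), np.arange(4).reshape(1, 4)]
values, offsets, shapes = pack(tensors)
second = unpack(values, offsets, shapes, 1)
```

In this sketch, int32 offsets limit the flat buffer to 2**31 - 1 addressable elements.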

classmethod from_numpy(arr: Union[numpy.ndarray, List[numpy.ndarray], Tuple[numpy.ndarray]]) → ray.air.util.tensor_extensions.arrow.ArrowVariableShapedTensorArray

Convert an ndarray or an iterable of heterogeneous-shaped ndarrays to an array of heterogeneous-shaped, homogeneous-typed tensors.

Parameters

arr – An ndarray or an iterable of heterogeneous-shaped ndarrays.

Returns

An ArrowVariableShapedTensorArray containing len(arr) tensors of heterogeneous shape.

to_numpy(zero_copy_only: bool = True)

Convert the entire array of tensors into a single ndarray.

Parameters

zero_copy_only – If True, an exception is meant to be raised when the conversion to a NumPy array would require copying the underlying data (e.g. in the presence of nulls, or for non-primitive types). This argument is currently ignored, so zero-copy is not enforced even when it is True.

Returns

A single ndarray representing the entire array of tensors.
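The source does not state the dtype or layout of the returned ndarray. Because tensors of different shapes cannot be stacked into one regular ndarray, one plausible representation, assumed here purely for illustration, is a one-dimensional object-dtype array whose entries are the individual tensors. The helper to_object_ndarray below is hypothetical and uses only NumPy.

```python
import numpy as np


def to_object_ndarray(tensors):
    """Hypothetical helper: hold differently shaped tensors in one 1-D object array."""
    # Preallocate and assign element by element; np.array(tensors, dtype=object)
    # may instead try to combine the inputs along shared leading dimensions.
    out = np.empty(len(tensors), dtype=object)
    for i, t in enumerate(tensors):
        out[i] = t
    return out


tensors = [np.zeros((2, 2)), np.ones((3, 5))]
result = to_object_ndarray(tensors)
```

Each entry keeps its own shape, so result[0].shape and result[1].shape differ even though result itself is a single ndarray of length 2.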