ray.data.extensions.tensor_extension.create_ragged_ndarray

ray.data.extensions.tensor_extension.create_ragged_ndarray(values: Sequence[numpy.ndarray]) → numpy.ndarray

Create an array that contains arrays of different lengths.

If you’re working with variable-length arrays like images, use this function to create ragged arrays instead of np.array.

Note

np.array fails to construct ragged arrays if the input arrays have a uniform first dimension:

>>> values = [np.zeros((3, 1)), np.zeros((3, 2))]
>>> np.array(values, dtype=object)
Traceback (most recent call last):
    ...
ValueError: could not broadcast input array from shape (3,1) into shape (3,)
>>> create_ragged_ndarray(values)
array([array([[0.],
              [0.],
              [0.]]), array([[0., 0.],
                             [0., 0.],
                             [0., 0.]])], dtype=object)

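np.array also fails to produce a ragged array if you’re creating it from a single array. It converts the input into one multidimensional object array, so the element loses its original dtype: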

>>> values = [np.zeros((3, 1))]
>>> np.array(values, dtype=object)[0].dtype
dtype('O')
>>> create_ragged_ndarray(values)[0].dtype
dtype('float64')

create_ragged_ndarray avoids the limitations of np.array by creating an empty array and filling it with pointers to the variable-length arrays.
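
The approach described above can be sketched in plain NumPy. This is a minimal illustration, not Ray's actual implementation: the function name ragged_ndarray_sketch is hypothetical, and the sketch assumes the input is a non-empty sequence of NumPy arrays.

```python
import numpy as np


def ragged_ndarray_sketch(values):
    """Hypothetical stand-in that illustrates the empty-then-fill approach."""
    # Allocate a 1D object array with one slot per input array.
    ragged = np.empty(len(values), dtype=object)
    # Assign slot by slot so NumPy stores a reference to each array
    # instead of trying to broadcast the inputs into a single block.
    for i, value in enumerate(values):
        ragged[i] = value
    return ragged


values = [np.zeros((3, 1)), np.zeros((3, 2))]
ragged = ragged_ndarray_sketch(values)
print(ragged.shape)     # (2,)
print(ragged[0].shape)  # (3, 1)
print(ragged[1].dtype)  # float64
```

Because the outer array is always one-dimensional and is filled one slot at a time, both cases from the note behave as expected: inputs with a uniform first dimension no longer raise an error, and a single input array keeps its original dtype.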

PublicAPI (alpha): This API is in alpha and may change before becoming stable.