# Working with Tensors / NumPy#

N-dimensional arrays (in other words, tensors) are ubiquitous in ML workloads. This guide describes the limitations and best practices of working with such data.

## Tensor data representation#

Ray Data represents tensors as NumPy ndarrays.

```import ray

print(ds)
```
```Dataset(
num_rows=100,
schema={image: numpy.ndarray(shape=(28, 28), dtype=uint8)}
)
```

### Batches of fixed-shape tensors#

If your tensors have a fixed shape, Ray Data represents batches as regular ndarrays.

```>>> import ray
>>> batch = ds.take_batch(batch_size=32)
>>> batch["image"].shape
(32, 28, 28)
>>> batch["image"].dtype
dtype('uint8')
```

### Batches of variable-shape tensors#

If your tensors vary in shape, Ray Data represents batches as arrays of object dtype.

```>>> import ray
>>> batch = ds.take_batch(batch_size=32)
>>> batch["image"].shape
(32,)
>>> batch["image"].dtype
dtype('O')
```

The individual elements of these object arrays are regular ndarrays.

```>>> batch["image"][0].dtype
dtype('uint8')
>>> batch["image"][0].shape
(375, 500, 3)
>>> batch["image"][3].shape
(333, 465, 3)
```

## Transforming tensor data#

Call `map()` or `map_batches()` to transform tensor data.

```from typing import Any, Dict

import ray
import numpy as np

def increase_brightness(row: Dict[str, Any]) -> Dict[str, Any]:
row["image"] = np.clip(row["image"] + 4, 0, 255)
return row

# Increase the brightness, record at a time.
ds.map(increase_brightness)

def batch_increase_brightness(batch: Dict[str, np.ndarray]) -> Dict:
batch["image"] = np.clip(batch["image"] + 4, 0, 255)
return batch

# Increase the brightness, batch at a time.
ds.map_batches(batch_increase_brightness)
```

In addition to NumPy ndarrays, Ray Data also treats returned lists of NumPy ndarrays and objects implementing `__array__` (for example, `torch.Tensor`) as tensor data.

## Saving tensor data#

Save tensor data with formats like Parquet, NumPy, and JSON. To view all supported formats, see the Input/Output reference.

Call `write_parquet()` to save data in Parquet files.

```import ray

ds.write_parquet("/tmp/simple")
```

Call `write_numpy()` to save an ndarray column in NumPy files.

```import ray

To save images in a JSON file, call `write_json()`.
```import ray