ray.data.block.BlockAccessor

class ray.data.block.BlockAccessor

Provides accessor methods for a specific block.

Ideally, we wouldn’t need separate accessor classes for blocks. However, they are needed to support storing pyarrow.Table directly as a top-level Ray object, without a wrapping class (issue #17186).

DeveloperAPI: This API may change across minor Ray releases.

Methods

__init__

aggregate_combined_blocks

Aggregate partially combined and sorted blocks.

batch_to_block

Create a block from user-facing data formats.

builder

Create a builder for this block type.

combine

Combine rows with the same key into an accumulator.

for_block

Create a block accessor for the given block.

get_metadata

Create a metadata object from this block.

iter_rows

Iterate over the rows of this block.

merge_sorted_blocks

Return a sorted block by merging a list of sorted blocks.

num_rows

Return the number of rows contained in this block.

random_shuffle

Randomly shuffle this block.

sample

Return a random sample of items from this block.

schema

Return the Python type or pyarrow schema of this block.

select

Return a new block containing the provided columns.

size_bytes

Return the approximate size in bytes of this block.

slice

Return a slice of this block.

sort_and_partition

Return a list of sorted partitions of this block.

take

Return a new block containing the provided row indices.

to_arrow

Convert this block into an Arrow table.

to_batch_format

Convert this block into the provided batch format.

to_block

Return the base block that this accessor wraps.

to_default

Return the default data format for this accessor.

to_numpy

Convert this block (or selected columns of it) into a NumPy ndarray.

to_pandas

Convert this block into a pandas DataFrame.

zip

Zip this block with another block of the same type and size.