ray.data.block.BlockAccessor

class ray.data.block.BlockAccessor

Provides accessor methods for a specific block.

Ideally, we wouldn’t need separate accessor classes for blocks. However, they are needed to support storing pyarrow.Table directly as a top-level Ray object, without a wrapping class (issue #17186).

DeveloperAPI: This API may change across minor Ray releases.
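
The accessor pattern described above can be illustrated with a minimal sketch. It is not Ray's implementation: `DictBlockAccessor` and the dict-of-lists block format are hypothetical stand-ins chosen so the example runs with the standard library only. The method names mirror a few of those listed under Methods, but the signatures here are assumptions. The point is that the block stays a plain, unwrapped object while all behavior lives in a separate accessor.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical block representation for this sketch: a plain dict mapping
# column names to equal-length lists. Real Ray blocks are Arrow tables or
# pandas DataFrames.
Block = Dict[str, List[Any]]


class DictBlockAccessor:
    """Illustrative accessor for dict-of-lists blocks; not part of Ray."""

    def __init__(self, block: Block) -> None:
        self._block = block

    @staticmethod
    def for_block(block: Block) -> "DictBlockAccessor":
        # Dispatch point: a fuller implementation would inspect the block's
        # type here and return the matching accessor subclass.
        if not isinstance(block, dict):
            raise TypeError(f"unsupported block type: {type(block)!r}")
        return DictBlockAccessor(block)

    def num_rows(self) -> int:
        # All columns have the same length, so any column gives the row count.
        return len(next(iter(self._block.values()), []))

    def iter_rows(self) -> Iterator[Dict[str, Any]]:
        # Yield one dict per row, keyed by column name.
        names = list(self._block)
        for values in zip(*self._block.values()):
            yield dict(zip(names, values))

    def select(self, columns: List[str]) -> Block:
        # Return a new block containing only the requested columns.
        return {name: self._block[name] for name in columns}

    def to_block(self) -> Block:
        # Return the underlying block, unchanged and unwrapped.
        return self._block


block = {"id": [1, 2, 3], "name": ["a", "b", "c"]}
accessor = DictBlockAccessor.for_block(block)
```

Because the accessor holds only a reference, `accessor.to_block()` returns the very same object that was passed in. The data can therefore be stored and shipped in its native form, and an accessor is created only when an operation is needed.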

Methods

__init__

batch_to_arrow_block

Create an Arrow block from user-facing data formats.

batch_to_block

Create a block from user-facing data formats.

batch_to_pandas_block

Create a Pandas block from user-facing data formats.

block_type

Return the block type of this block.

builder

Create a builder for this block type.

count

Return a count of the distinct values in the provided column.

for_block

Create a block accessor for the given block.

get_metadata

Create a metadata object from this block.

iter_rows

Iterate over the rows of this block.

max

Return the maximum of the values in the provided column.

mean

Return the mean of the values in the provided column.

merge_sorted_blocks

Return a sorted block by merging a list of sorted blocks.

min

Return the minimum of the values in the provided column.

num_rows

Return the number of rows contained in this block.

random_shuffle

Randomly shuffle this block.

rename_columns

Return the block reflecting the renamed columns.

sample

Return a random sample of items from this block.

schema

Return the Python type or pyarrow schema of this block.

select

Return a new block containing the provided columns.

size_bytes

Return the approximate size in bytes of this block.

slice

Return a slice of this block.

sort

Return a new block sorted according to the provided sort_key.

sort_and_partition

Return a list of sorted partitions of this block.

sum

Return the sum of the values in the provided column.

sum_of_squared_diffs_from_mean

Return the sum of squared differences from the mean for the provided column.

take

Return a new block containing the provided row indices.

to_arrow

Convert this block into an Arrow table.

to_batch_format

Convert this block into the provided batch format.

to_block

Return the base block that this accessor wraps.

to_default

Return the default data format for this accessor.

to_numpy

Convert this block (or columns of block) into a NumPy ndarray.

to_pandas

Convert this block into a Pandas dataframe.

zip

Zip this block with another block of the same type and size.
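
The column aggregations listed above (sum, min, max, mean, sum_of_squared_diffs_from_mean) can be read as the textbook quantities sketched below. This is a plain-Python illustration, not Ray's code: the `column_stats` helper and the dict-of-lists block are hypothetical, and returning `None` for an empty column is an assumption made for the sketch. For a column with values x_1..x_n and mean m, the sum of squared differences from the mean is the sum over i of (x_i - m)^2.

```python
from typing import Dict, List, Optional

# Hypothetical block format for this sketch: column name -> list of numbers.
Block = Dict[str, List[float]]


def column_stats(block: Block, on: str) -> Optional[Dict[str, float]]:
    """Compute simple aggregates over one column of a dict-of-lists block."""
    values = block[on]
    if not values:
        # Assumption for this sketch: an empty column has no aggregates.
        return None
    n = len(values)
    total = sum(values)
    mean = total / n
    # Sum over all values of (value - mean) squared.
    ssd = sum((v - mean) ** 2 for v in values)
    return {
        "sum": total,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sum_of_squared_diffs_from_mean": ssd,
    }


stats = column_stats({"x": [2.0, 4.0, 4.0, 6.0]}, "x")
# The population variance follows directly from the last aggregate.
population_variance = stats["sum_of_squared_diffs_from_mean"] / 4
```

For the column `[2.0, 4.0, 4.0, 6.0]` the mean is 4.0 and the squared differences are 4, 0, 0, and 4, so the sum of squared differences is 8.0 and the population variance is 2.0. Exposing the unnormalized sum rather than the variance leaves the choice of divisor (n or n - 1) to the caller.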