ray.data.block.BlockAccessor

class ray.data.block.BlockAccessor

Provides accessor methods for a specific block.

Ideally, we wouldn’t need separate accessor classes for blocks. However, they are needed to support storing pyarrow.Table directly as a top-level Ray object, without a wrapping class (issue #17186).

DeveloperAPI: This API may change across minor Ray releases.
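
The design rationale above (blocks stay raw objects, and an accessor is created on demand for whichever block type is at hand) can be illustrated with a small sketch. This is not Ray's implementation: `SketchAccessor`, `ColumnarAccessor`, `RowAccessor`, and the two toy block formats are hypothetical stand-ins chosen for illustration. Only the method names `for_block`, `num_rows`, `iter_rows`, and `to_block` mirror entries in the method list on this page.

```python
from typing import Any, Dict, Iterator, List, Union

# Hypothetical stand-ins for block formats; these are NOT Ray's block types.
ColumnarBlock = Dict[str, List[Any]]  # column name -> list of values
RowBlock = List[Dict[str, Any]]       # list of row dictionaries
Block = Union[ColumnarBlock, RowBlock]


class SketchAccessor:
    """Illustrative accessor interface (hypothetical, not Ray's class)."""

    @staticmethod
    def for_block(block: Block) -> "SketchAccessor":
        # Dispatch on the raw block's type, so the block needs no wrapper class.
        if isinstance(block, dict):
            return ColumnarAccessor(block)
        if isinstance(block, list):
            return RowAccessor(block)
        raise TypeError(f"Unsupported block type: {type(block)!r}")

    def num_rows(self) -> int:
        raise NotImplementedError

    def iter_rows(self) -> Iterator[Dict[str, Any]]:
        raise NotImplementedError

    def to_block(self) -> Block:
        raise NotImplementedError


class ColumnarAccessor(SketchAccessor):
    def __init__(self, block: ColumnarBlock) -> None:
        self._block = block

    def num_rows(self) -> int:
        # Assumes every column has the same length.
        return len(next(iter(self._block.values()), []))

    def iter_rows(self) -> Iterator[Dict[str, Any]]:
        names = list(self._block)
        for values in zip(*self._block.values()):
            yield dict(zip(names, values))

    def to_block(self) -> ColumnarBlock:
        return self._block


class RowAccessor(SketchAccessor):
    def __init__(self, block: RowBlock) -> None:
        self._block = block

    def num_rows(self) -> int:
        return len(self._block)

    def iter_rows(self) -> Iterator[Dict[str, Any]]:
        return iter(self._block)

    def to_block(self) -> RowBlock:
        return self._block


# The raw objects are what would be stored; accessors are created only when needed.
columnar = {"id": [1, 2, 3], "name": ["a", "b", "c"]}
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

col_acc = SketchAccessor.for_block(columnar)
row_acc = SketchAccessor.for_block(rows)
```

Because the accessor only holds a reference to the raw object, `to_block` hands back the same object that was passed in, and callers can treat both block formats uniformly through the shared interface.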

Methods

`__init__`
`batch_to_arrow_block`	Create an Arrow block from user-facing data formats.
`batch_to_block`	Create a block from user-facing data formats.
`batch_to_pandas_block`	Create a Pandas block from user-facing data formats.
`block_type`	Return the block type of this block.
`builder`	Create a builder for this block type.
`count`	Return a count of the distinct values in the provided column.
`for_block`	Create a block accessor for the given block.
`get_metadata`	Create a metadata object from this block.
`iter_rows`	Iterate over the rows of this block.
`max`	Return the maximum of the values in the provided column.
`mean`	Return the mean of the values in the provided column.
`merge_sorted_blocks`	Return a sorted block by merging a list of sorted blocks.
`min`	Return the minimum of the values in the provided column.
`num_rows`	Return the number of rows contained in this block.
`random_shuffle`	Randomly shuffle this block.
`rename_columns`	Return the block reflecting the renamed columns.
`sample`	Return a random sample of items from this block.
`schema`	Return the Python type or pyarrow schema of this block.
`select`	Return a new block containing the provided columns.
`size_bytes`	Return the approximate size in bytes of this block.
`slice`	Return a slice of this block.
`sort`	Return a new block sorted according to the provided `sort_key`.
`sort_and_partition`	Return a list of sorted partitions of this block.
`sum`	Return the sum of the values in the provided column.
`sum_of_squared_diffs_from_mean`	Return the sum of squared differences from the mean for the provided column.
`take`	Return a new block containing the provided row indices.
`to_arrow`	Convert this block into an Arrow table.
`to_batch_format`	Convert this block into the provided batch format.
`to_block`	Return the base block that this accessor wraps.
`to_default`	Return the default data format for this accessor.
`to_numpy`	Convert this block (or columns of block) into a NumPy ndarray.
`to_pandas`	Convert this block into a Pandas dataframe.
`zip`	Zip this block with another block of the same type and size.