NDArray

The multidimensional data array class. Instances may be constructed using the constructor functions listed in the Constructors section below. In addition, all the functions from the Lazy Functions section can be used with NDArray instances.

class blosc2.NDArray(**kwargs)[source]
Attributes:
T

Return the transpose of a 2-dimensional array.

blocks

The block shape of this container.

blocksize

The block size (in bytes) for this container.

cbytes

The number of compressed bytes used by the array.

chunks

Returns the data chunk shape of this container.

chunksize

Returns the data chunk size (in bytes) for this container.

cparams

The compression parameters used by the array.

cratio

The compression ratio of the array.

dparams

The decompression parameters used by the array.

dtype

Data-type of the array’s elements.

ext_chunks

Returns the padded chunk shape which defines the chunksize in the associated schunk.

ext_shape

The padded data shape.

fields

Dictionary with the fields of the structured array.

info

Print information about this array.

info_items

A list of tuples with the information about this array.

keep_last_read

Indicates whether the last read data should be kept in memory.

meta

The metadata of the array.

nbytes

The number of bytes used by the array.

ndim

The number of dimensions of this container.

oindex

Shortcut for orthogonal (outer) indexing, see get_oselection_numpy()

schunk

The SChunk reference of the NDArray.

shape

Returns the data shape of this container.

size

The size (in bytes) for this container.

urlpath

The URL path of the array.

vlmeta

The variable-length metadata of the array.

Methods

all([axis, keepdims])

Test whether all array elements along a given axis evaluate to True.

any([axis, keepdims])

Test whether any array element along a given axis evaluates to True.

copy([dtype])

Create a copy of an array with different parameters.

get_chunk(nchunk)

Shortcut to SChunk.get_chunk.

get_fselection_numpy(key)

Select a slice from the array using a fancy index.

get_oselection_numpy(key)

Select independently from self along axes specified in key.

indices([order])

Return the indices of a sorted array following the specified order.

iterchunks_info()

Iterate over the chunks of the array, providing information on index and special values.

max([axis, keepdims])

Return the maximum along a given axis.

mean([axis, dtype, keepdims])

Return the arithmetic mean along the specified axis.

min([axis, keepdims])

Return the minimum along a given axis.

prod([axis, dtype, keepdims])

Return the product of array elements over a given axis.

reshape(shape, **kwargs)

Return a new array with the specified shape.

resize(newshape)

Change the shape of the array by growing or shrinking one or more dimensions.

save(urlpath[, contiguous])

Save the array to a file.

set_oselection_numpy(key, arr)

Select independently from self along axes specified in key and set to entries in arr.

slice(key, **kwargs)

Get a (multidimensional) slice as a new NDArray.

sort([order])

Return a sorted array following the specified order, or the order of the fields.

squeeze([mask])

Remove single-dimensional entries from the shape of the array.

std([axis, dtype, ddof, keepdims])

Return the standard deviation along the specified axis.

sum([axis, dtype, keepdims])

Return the sum of array elements over a given axis.

to_cframe()

Get a bytes object containing the serialized NDArray instance.

tobytes()

Returns a buffer containing the data of the entire array.

var([axis, dtype, ddof, keepdims])

Return the variance along the specified axis.

Special Methods:

__iter__()

Iterate over the (outer) elements of the array.

__len__()

Returns the length of the first dimension of the array.

__getitem__(key)

Retrieve a (multidimensional) slice as specified by the key.

__setitem__(key, value)

Set a slice of the array.

Utility Methods

__iter__()[source]

Iterate over the (outer) elements of the array.

Returns:

out

Return type:

iterator

__len__() → int[source]

Returns the length of the first dimension of the array. This is equivalent to self.shape[0].

__getitem__(key: int | slice | Sequence[slice | int] | np.ndarray[np.bool_] | NDArray | blosc2.LazyExpr | str) → np.ndarray | blosc2.LazyExpr[source]

Retrieve a (multidimensional) slice as specified by the key.

Note that this __getitem__ closely matches NumPy fancy indexing behaviour, except in some edge cases which are not supported by ndindex. Array indices separated by a slice object (e.g. arr[0, :10, [0,1]]) are NOT supported. See https://www.blosc.org/posts/blosc2-fancy-indexing for more details.

Parameters:

key (int, slice, sequence of (slices, int), array of bools, LazyExpr or str) – The slice(s) to be retrieved. Note that the step parameter is not yet honored in slices. If a LazyExpr is provided, the expression is expected to be of boolean type, and the result will be another LazyExpr returning the values of this array where the expression is True. When key is an (n-dim) array of bools, the result will be the values of self where the bool values are True (similar to NumPy). If key is a 1-dim sequence of integers, the result will be the values of this array at the specified indices; n-dim indices are not yet supported. If key is a string and it is a field name of self, an NDField accessor will be returned; otherwise, an attempt is made to convert it to a LazyExpr, whose operands are looked up in the fields of self.

Returns:

out – The requested data as a NumPy array or a LazyExpr.

Return type:

np.ndarray | blosc2.LazyExpr

Examples

>>> import blosc2
>>> shape = [25, 10]
>>> # Create an array
>>> a = blosc2.full(shape, 3.3333)
>>> # Get slice as a NumPy array
>>> a[:5, :5]
array([[3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333]])
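The boolean-mask and integer-sequence cases described for key can be pictured with plain Python lists. The following is only a sketch of the selection semantics and does not call blosc2; the helper names select_by_mask and select_by_indices are hypothetical.

```python
# Plain-Python illustration of two kinds of key accepted by __getitem__.
# Hypothetical helpers; they do not call blosc2.

def select_by_mask(values, mask):
    """Keep the values whose mask entry is True (boolean-array key)."""
    if len(values) != len(mask):
        raise ValueError("mask must have the same length as values")
    return [v for v, keep in zip(values, mask) if keep]


def select_by_indices(values, indices):
    """Pick the values at the given positions (1-dim integer key)."""
    return [values[i] for i in indices]


data = [10, 20, 30, 40, 50]
masked = select_by_mask(data, [True, False, True, False, True])
picked = select_by_indices(data, [4, 0, 0])
print(masked)  # [10, 30, 50]
print(picked)  # [50, 10, 10]
```

An integer key may repeat or reorder positions, whereas a boolean mask always preserves the original order.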
__setitem__(key: int | slice | Sequence[slice], value: object)[source]

Set a slice of the array.

Parameters:
  • key (int, slice or sequence of slices) – The index or indices specifying the slice(s) to be updated. Note that the step parameter is not yet supported.

  • value (Py_Object Supporting the Buffer Protocol) – An object supporting the Buffer Protocol which will be used to overwrite the specified slice(s).

Examples

>>> import blosc2
>>> # Create an array
>>> a = blosc2.full([8, 8], 3.3333)
>>> # Set a slice to 0
>>> a[:5, :5] = 0
>>> a[:]
array([[0.    , 0.    , 0.    , 0.    , 0.    , 3.3333, 3.3333, 3.3333],
       [0.    , 0.    , 0.    , 0.    , 0.    , 3.3333, 3.3333, 3.3333],
       [0.    , 0.    , 0.    , 0.    , 0.    , 3.3333, 3.3333, 3.3333],
       [0.    , 0.    , 0.    , 0.    , 0.    , 3.3333, 3.3333, 3.3333],
       [0.    , 0.    , 0.    , 0.    , 0.    , 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333],
       [3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333, 3.3333]])
all(axis=None, keepdims=False, **kwargs)[source]

Test whether all array elements along a given axis evaluate to True.

The parameters are documented in min().

Returns:

all_along_axis – The result of the evaluation along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.all

Examples

>>> import numpy as np
>>> import blosc2
>>> data = np.array([True, True, False, True, True, True])
>>> ndarray = blosc2.asarray(data)
>>> # Test if all elements are True along the default axis (flattened array)
>>> result_flat = blosc2.all(ndarray)
>>> print("All elements are True (flattened):", result_flat)
All elements are True (flattened): False
any(axis=None, keepdims=False, **kwargs)[source]

Test whether any array element along a given axis evaluates to True.

The parameters are documented in min().

Returns:

any_along_axis – The result of the evaluation along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.any

Examples

>>> import blosc2
>>> import numpy as np
>>> data = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
>>> # Convert the NumPy array to a Blosc2 NDArray
>>> ndarray = blosc2.asarray(data)
>>> print("NDArray data:", ndarray[:])
NDArray data: [[1 0 0]
                [0 1 0]
                [0 0 0]]
>>> any_along_axis_0 = blosc2.any(ndarray, axis=0)
>>> print("Any along axis 0:", any_along_axis_0)
Any along axis 0: [ True  True False]
>>> any_flattened = blosc2.any(ndarray)
>>> print("Any in the flattened array:", any_flattened)
Any in the flattened array: True
copy(dtype: dtype | str = None, **kwargs: Any) → NDArray[source]

Create a copy of an array with different parameters.

Parameters:
  • dtype (np.dtype or list str) – The new array dtype. Default is self.dtype.

  • kwargs (dict, optional) – Additional keyword arguments supported by the empty() constructor. If not specified, the defaults will be taken from the original array (except for the urlpath).

Returns:

out – A NDArray with a copy of the data.

Return type:

NDArray

See also

copy()

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = (10, 10)
>>> blocks = (10, 10)
>>> dtype = np.bool_
>>> # Create a NDArray with default chunks
>>> a = blosc2.zeros(shape, blocks=blocks, dtype=dtype)
>>> # Get a copy with default chunks and blocks
>>> b = a.copy(chunks=None, blocks=None)
>>> np.array_equal(b[...], a[...])
True
get_chunk(nchunk: int) → bytes[source]

Shortcut to SChunk.get_chunk. This can be accessed through the schunk attribute as well.

Parameters:

nchunk (int) – The index of the chunk to retrieve.

Returns:

chunk – The chunk data at the specified index.

Return type:

bytes

See also

schunk

The attribute that provides access to the underlying SChunk object.

Examples

>>> import blosc2
>>> import numpy as np
>>> # Create an NDArray with some data
>>> array = np.arange(10)
>>> ndarray = blosc2.asarray(array)
>>> chunk = ndarray.get_chunk(0)
>>> # Decompress the chunk to convert it into a numpy array
>>> decompressed_chunk = blosc2.decompress(chunk)
>>> np_array_chunk = np.frombuffer(decompressed_chunk, dtype=np.int64)
>>> # Verify the content of the chunk
>>> if isinstance(np_array_chunk, np.ndarray):
...     print(np_array_chunk)
...     print(np_array_chunk.shape)
[0 1 2 3 4 5 6 7 8 9]
(10,)
get_fselection_numpy(key: list | ndarray) → ndarray[source]

Select a slice from the array using a fancy index. Closely matches NumPy fancy indexing behaviour, except in some edge cases which are not supported by ndindex. Array indices separated by a slice object (e.g. arr[0, :10, [0,1]]) are NOT supported. See https://www.blosc.org/posts/blosc2-fancy-indexing for more details.

Parameters:

key (list or np.ndarray)

Returns:

out

Return type:

np.ndarray

get_oselection_numpy(key: list | ndarray) → ndarray[source]

Select independently from self along axes specified in key. Key must be same length as self shape. See Zarr https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#orthogonal-indexing.
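Orthogonal selection applies one index list per axis and returns every combination of them. The sketch below pictures this on a nested Python list; it does not call blosc2, and the helper name orthogonal_select is hypothetical.

```python
# Plain-Python picture of orthogonal (outer) selection on a 2-D nested list.
# Hypothetical helper; it does not call blosc2.

def orthogonal_select(rows_of_values, row_idx, col_idx):
    """Return the sub-grid formed by every (row, col) pair in row_idx x col_idx."""
    return [[rows_of_values[r][c] for c in col_idx] for r in row_idx]


grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
# One index list per axis, so the key has the same length as the shape.
sub = orthogonal_select(grid, [0, 2], [1, 3])
print(sub)  # [[1, 3], [9, 11]]
```

By contrast, NumPy-style fancy indexing with the same two lists pairs them elementwise and yields only the two values at (0, 1) and (2, 3), that is 1 and 11.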

indices(order: str | list[str] | None = None, **kwargs: Any) → NDArray[source]

Return the indices of a sorted array following the specified order.

This is only valid for 1-dim structured arrays.

See full documentation in indices().

iterchunks_info() → Iterator[NamedTuple('info', nchunk=int, coords=tuple, cratio=float, special=blosc2.SpecialValue, repeated_value=bytes | None, lazychunk=bytes)][source]

Iterate over the chunks of the array, providing information on index and special values.

Yields:

info (namedtuple) –

A namedtuple with the following fields:

nchunk: int

The index of the chunk.

coords: tuple

The coordinates of the chunk, in chunk units.

cratio: float

The compression ratio of the chunk.

special: SpecialValue

The special value enum of the chunk; if 0, the chunk is not special.

repeated_value: self.dtype or None

The repeated value for the chunk; if not SpecialValue.VALUE, it is None.

lazychunk: bytes

A buffer containing the complete lazy chunk.

Examples

>>> import blosc2
>>> a = blosc2.full(shape=(1000, ) * 3, fill_value=9, chunks=(500, ) * 3, dtype="f4")
>>> for info in a.iterchunks_info():
...     print(info.coords)
(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(0, 1, 1)
(1, 0, 0)
(1, 0, 1)
(1, 1, 0)
(1, 1, 1)
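The coordinates printed above can be reproduced without blosc2 by enumerating chunk positions in row-major order. This is a sketch that assumes chunks are visited in C order, as the output above suggests; the helper name chunk_coords is hypothetical.

```python
import itertools
import math


def chunk_coords(shape, chunks):
    """Return chunk coordinates (in chunk units) in C (row-major) order."""
    # Number of chunks needed along each axis, counting a partial last chunk.
    per_axis = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    return list(itertools.product(*(range(n) for n in per_axis)))


coords = chunk_coords((1000, 1000, 1000), (500, 500, 500))
for c in coords:
    print(c)
```

With a shape of (1000, 1000, 1000) and chunks of (500, 500, 500) there are two chunks per axis, hence eight coordinate tuples from (0, 0, 0) to (1, 1, 1).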
max(axis=None, keepdims=False, **kwargs)[source]

Return the maximum along a given axis.

The parameters are documented in min().

Returns:

max_along_axis – The maximum of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.max

Examples

>>> import blosc2
>>> import numpy as np
>>> data = np.array([[11, 2, 36, 24, 5, 69], [73, 81, 49, 6, 73, 0]])
>>> ndarray = blosc2.asarray(data)
>>> print("NDArray data:", ndarray[:])
NDArray data:  [[11  2 36 24  5 69]
                [73 81 49  6 73  0]]
>>> # Compute the maximum along axis 0 and 1
>>> max_along_axis_0 = blosc2.max(ndarray, axis=0)
>>> print("Maximum along axis 0:", max_along_axis_0)
Maximum along axis 0: [73 81 49 24 73 69]
>>> max_along_axis_1 = blosc2.max(ndarray, axis=1)
>>> print("Maximum along axis 1:", max_along_axis_1)
Maximum along axis 1: [69 81]
>>> max_flattened = blosc2.max(ndarray)
>>> print("Maximum of the flattened array:", max_flattened)
Maximum of the flattened array: 81
mean(axis=None, dtype=None, keepdims=False, **kwargs)[source]

Return the arithmetic mean along the specified axis.

The parameters are documented in sum().

Returns:

mean_along_axis – The mean of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.mean

Examples

>>> import numpy as np
>>> import blosc2
>>> # Example array
>>> array = np.array([[1, 2, 3], [4, 5, 6]])
>>> nd_array = blosc2.asarray(array)
>>> # Compute the mean of all elements in the array (axis=None)
>>> overall_mean = blosc2.mean(nd_array)
>>> print("Mean of all elements:", overall_mean)
Mean of all elements: 3.5
min(axis=None, keepdims=False, **kwargs)[source]

Return the minimum along a given axis.

Parameters:
  • ndarr (NDArray or NDField or C2Array or LazyExpr) – The input array or expression.

  • axis (int or tuple of ints, optional) – Axis or axes along which to operate. By default, flattened input is used.

  • keepdims (bool, optional) – If set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

min_along_axis – The minimum of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.min

Examples

>>> import numpy as np
>>> import blosc2
>>> array = np.array([1, 3, 7, 8, 9, 31])
>>> nd_array = blosc2.asarray(array)
>>> min_all = blosc2.min(nd_array)
>>> print("Minimum of all elements in the array:", min_all)
Minimum of all elements in the array: 1
>>> # Compute the minimum along axis 0 with keepdims=True
>>> min_keepdims = blosc2.min(nd_array, axis=0, keepdims=True)
>>> print("Minimum along axis 0 with keepdims=True:", min_keepdims)
Minimum along axis 0 with keepdims=True:  [1]
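The effect of axis and keepdims on the result shape can be worked out independently of the data. The sketch below assumes NumPy-style reduction semantics, which the parameter descriptions above follow; the helper name reduced_shape is hypothetical and nothing here calls blosc2.

```python
def reduced_shape(shape, axis=None, keepdims=False):
    """Shape of a reduction result over `axis` (None means all axes)."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))
    elif isinstance(axis, int):
        axes = {axis % ndim}  # the modulo maps negative axes to positive ones
    else:
        axes = {a % ndim for a in axis}
    if keepdims:
        # Reduced axes stay in place with size one.
        return tuple(1 if i in axes else n for i, n in enumerate(shape))
    # Reduced axes are dropped.
    return tuple(n for i, n in enumerate(shape) if i not in axes)


print(reduced_shape((2, 6), axis=0))                 # (6,)
print(reduced_shape((2, 6), axis=0, keepdims=True))  # (1, 6)
print(reduced_shape((2, 6)))                         # ()
print(reduced_shape((2, 6), keepdims=True))          # (1, 1)
```

A result of shape (1, 6) broadcasts against the original (2, 6) input, which is the purpose of keepdims.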
prod(axis=None, dtype=None, keepdims=False, **kwargs)[source]

Return the product of array elements over a given axis.

The parameters are documented in sum().

Returns:

product_along_axis – The product of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.prod

Examples

>>> import numpy as np
>>> import blosc2
>>> # Create an instance of NDArray with some data
>>> array = np.array([[11, 22, 33], [4, 15, 36]])
>>> nd_array = blosc2.asarray(array)
>>> # Compute the product of all elements in the array
>>> prod_all = blosc2.prod(nd_array)
>>> print("Product of all elements in the array:", prod_all)
Product of all elements in the array: 17249760
>>> # Compute the product along axis 1 (rows)
>>> prod_axis1 = blosc2.prod(nd_array, axis=1)
>>> print("Product along axis 1:", prod_axis1)
Product along axis 1: [7986 2160]
reshape(shape: tuple[int], **kwargs: Any) → NDArray[source]

Return a new array with the specified shape.

See full documentation in reshape().

See also

reshape()

resize(newshape: tuple | list) → None[source]

Change the shape of the array by growing or shrinking one or more dimensions.

Parameters:

newshape (tuple or list) – The new shape of the array. It should have the same number of dimensions as the current shape of self.

Returns:

out

Return type:

None

Notes

The array values in the newly added positions are not initialized. The user is responsible for initializing them.

Examples

>>> import blosc2
>>> import numpy as np
>>> import math
>>> dtype = np.dtype(np.float32)
>>> shape = [23, 11]
>>> a = np.linspace(1, 3, num=math.prod(shape)).reshape(shape)
>>> # Create an array
>>> b = blosc2.asarray(a)
>>> newshape = [50, 10]
>>> # Extend first dimension, shrink second dimension
>>> b.resize(newshape)
>>> b.shape
(50, 10)
save(urlpath: str, contiguous=True, **kwargs: Any) → None[source]

Save the array to a file.

This is a convenience function that calls the copy() method with the urlpath parameter and the additional keyword arguments provided.

See save() for more information.

Parameters:
  • urlpath (str) – The path where the array will be saved.

  • contiguous (bool, optional) – Whether to save the array contiguously.

  • kwargs (dict, optional) – Additional keyword arguments supported by the save() method.

Returns:

out

Return type:

None

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = (10, 10)
>>> blocks = (10, 10)
>>> dtype = np.bool_
>>> # Create a NDArray with default chunks
>>> a = blosc2.zeros(shape, blocks=blocks, dtype=dtype)
>>> # Save the array to a file
>>> a.save("array.b2frame")
set_oselection_numpy(key: list | ndarray, arr: NDArray) → ndarray[source]

Select independently from self along axes specified in key and set to entries in arr. Key must be same length as self shape. See Zarr https://zarr.readthedocs.io/en/stable/user-guide/arrays.html#orthogonal-indexing.
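Orthogonal assignment writes one value for every combination of the per-axis index lists. The sketch below illustrates this on a nested Python list; it does not call blosc2, and the helper name orthogonal_assign is hypothetical.

```python
# Plain-Python picture of orthogonal (outer) assignment on a 2-D nested list.
# Hypothetical helper; it does not call blosc2.

def orthogonal_assign(target, row_idx, col_idx, new_values):
    """Write new_values[i][j] into target[row_idx[i]][col_idx[j]]."""
    for i, r in enumerate(row_idx):
        for j, c in enumerate(col_idx):
            target[r][c] = new_values[i][j]


target = [[0] * 4 for _ in range(3)]
orthogonal_assign(target, [0, 2], [1, 3], [[1, 2], [3, 4]])
for row in target:
    print(row)
# [0, 1, 0, 2]
# [0, 0, 0, 0]
# [0, 3, 0, 4]
```

The block of new values has one row per selected row index and one column per selected column index.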

slice(key: int | slice | Sequence[slice], **kwargs: Any) → NDArray[source]

Get a (multidimensional) slice as a new NDArray.

Parameters:
  • key (int, slice or sequence of slices) – The index for the slices to be retrieved. Note that the step parameter is not yet supported in slices.

  • kwargs (dict, optional) – Additional keyword arguments supported by the empty() constructor.

Returns:

out – An array containing the requested data. The dtype will match that of self.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [23, 11]
>>> a = np.arange(np.prod(shape)).reshape(shape)
>>> # Create an array
>>> b = blosc2.asarray(a)
>>> slices = (slice(3, 7), slice(1, 11))
>>> # Get a slice as a new NDArray
>>> c = b.slice(slices)
>>> print(c.shape)
(4, 10)
>>> print(type(c))
<class 'blosc2.ndarray.NDArray'>

Notes

There is a fast path for slices that are aligned with underlying chunks. Aligned means that the slices are made entirely with complete chunks.
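One way to picture alignment is to check that every slice starts and stops on a chunk boundary. The sketch below is a conservative reading of the note above, not the library's actual test, which may differ (for example in how it treats a partial last chunk); the helper name is_chunk_aligned is hypothetical.

```python
def is_chunk_aligned(key, shape, chunks):
    """True if every slice in `key` starts and stops on a chunk boundary."""
    for sl, extent, chunk in zip(key, shape, chunks):
        # Normalise None and negative bounds against the axis length.
        start, stop, step = sl.indices(extent)
        if step != 1:
            return False
        if start % chunk != 0 or stop % chunk != 0:
            return False
    return True


shape, chunks = (24, 11), (8, 11)
aligned = is_chunk_aligned((slice(8, 16), slice(0, 11)), shape, chunks)
unaligned = is_chunk_aligned((slice(3, 7), slice(1, 11)), shape, chunks)
print(aligned)    # True
print(unaligned)  # False
```

Here the first key covers exactly one whole chunk, while the second cuts through a chunk along both axes.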

sort(order: str | list[str] | None = None, **kwargs: Any) → NDArray[source]

Return a sorted array following the specified order, or the order of the fields.

This is only valid for 1-dim structured arrays.

See full documentation in sort().

squeeze(mask=None) → NDArray[source]

Remove single-dimensional entries from the shape of the array.

This method modifies the array in-place. If mask is None, all dimensions of size 1 are removed. If mask is provided, it should be a boolean array of the same shape as the array, and the corresponding dimensions (of size 1) will be removed.

Returns:

out

Return type:

NDArray

Examples

>>> import blosc2
>>> shape = [1, 23, 1, 11, 1]
>>> # Create an array
>>> a = blosc2.full(shape, 2**30)
>>> a.shape
(1, 23, 1, 11, 1)
>>> # Squeeze the array
>>> a.squeeze()
>>> a.shape
(23, 11)
std(axis=None, dtype=None, ddof=0, keepdims=False, **kwargs)[source]

Return the standard deviation along the specified axis.

Parameters:
  • ndarr (NDArray or NDField or C2Array or LazyExpr) – The input array or expression.

  • axis (int or tuple of ints, optional) – Axis or axes along which the standard deviation is computed. By default, axis=None computes the standard deviation of the flattened array.

  • dtype (np.dtype or list str, optional) – Type to use in computing the standard deviation. For integer inputs, the default is float32; for floating point inputs, it is the same as the input dtype.

  • ddof (int, optional) – Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default, ddof is zero.

  • keepdims (bool, optional) – If set to True, the reduced axes are left in the result as dimensions with size one. This ensures that the result will broadcast correctly against the input array.

  • kwargs (dict, optional) – Additional keyword arguments that are supported by the empty() constructor.

Returns:

std_along_axis – The standard deviation of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.std

Examples

>>> import numpy as np
>>> import blosc2
>>> # Create an instance of NDArray with some data
>>> array = np.array([[1, 2, 3], [4, 5, 6]])
>>> nd_array = blosc2.asarray(array)
>>> # Compute the standard deviation of the entire array
>>> std_all = blosc2.std(nd_array)
>>> print("Standard deviation of the entire array:", std_all)
Standard deviation of the entire array: 1.707825127659933
>>> # Compute the standard deviation along axis 0 (columns)
>>> std_axis0 = blosc2.std(nd_array, axis=0)
>>> print("Standard deviation along axis 0:", std_axis0)
Standard deviation along axis 0: [1.5 1.5 1.5]
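The role of ddof as described in the parameters (divisor N - ddof) can be checked with a few lines of plain Python. This sketch does not call blosc2; the helper name std_with_ddof is hypothetical.

```python
import math


def std_with_ddof(values, ddof=0):
    """Standard deviation using the divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_dev / (n - ddof))


data = [1, 2, 3, 4, 5, 6]
population_std = std_with_ddof(data)      # divisor N
sample_std = std_with_ddof(data, ddof=1)  # divisor N - 1
print(population_std)
print(sample_std)
```

With the default ddof=0 the result agrees with the value shown above for the entire array; ddof=1 gives the slightly larger sample estimate.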
sum(axis=None, dtype=None, keepdims=False, **kwargs)[source]

Return the sum of array elements over a given axis.

Parameters:
  • ndarr (NDArray or NDField or C2Array or LazyExpr) – The input array or expression.

  • axis (int or tuple of ints, optional) – Axis or axes along which a sum is performed. By default, axis=None, sums all the elements of the input array. If axis is negative, it counts from the last to the first axis.

  • dtype (np.dtype or list str, optional) – The type of the returned array and of the accumulator in which the elements are summed. The dtype of ndarr is used by default unless it has an integer dtype of less precision than the default platform integer.

  • keepdims (bool, optional) – If set to True, the reduced axes are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

  • kwargs (dict, optional) – Additional keyword arguments supported by the empty() constructor.

Returns:

sum_along_axis – The sum of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.sum

Examples

>>> import numpy as np
>>> import blosc2
>>> # Example array
>>> array = np.array([[1, 2, 3], [4, 5, 6]])
>>> nd_array = blosc2.asarray(array)
>>> # Sum all elements in the array (axis=None)
>>> total_sum = blosc2.sum(nd_array)
>>> print("Sum of all elements:", total_sum)
Sum of all elements: 21
>>> # Sum along axis 0 (columns)
>>> sum_axis_0 = blosc2.sum(nd_array, axis=0)
>>> print("Sum along axis 0 (columns):", sum_axis_0)
Sum along axis 0 (columns): [5 7 9]
to_cframe() → bytes[source]

Get a bytes object containing the serialized NDArray instance.

Returns:

out – The buffer containing the serialized NDArray instance.

Return type:

bytes

See also

ndarray_from_cframe()

This function can be used to reconstruct a NDArray from the serialized bytes.

Examples

>>> import blosc2
>>> a = blosc2.full(shape=(1000, 1000), fill_value=9, dtype='i4')
>>> # Get the bytes object containing the serialized instance
>>> cframe_bytes = a.to_cframe()
>>> blosc_array = blosc2.ndarray_from_cframe(cframe_bytes)
>>> print("Shape of the NDArray:", blosc_array.shape)
Shape of the NDArray: (1000, 1000)
>>> print("Data type of the NDArray:", blosc_array.dtype)
Data type of the NDArray: int32
tobytes() → bytes[source]

Returns a buffer containing the data of the entire array.

Returns:

out – The buffer with the data of the whole array.

Return type:

bytes

Examples

>>> import blosc2
>>> import numpy as np
>>> dtype = np.dtype("i4")
>>> shape = [23, 11]
>>> a = np.arange(0, int(np.prod(shape)), dtype=dtype).reshape(shape)
>>> # Create an array
>>> b = blosc2.asarray(a)
>>> b.tobytes() == bytes(a[...])
True
var(axis=None, dtype=None, ddof=0, keepdims=False, **kwargs)[source]

Return the variance along the specified axis.

The parameters are documented in std().

Returns:

var_along_axis – The variance of the elements along the axis.

Return type:

np.ndarray or NDArray or scalar

References

np.var

Examples

>>> import numpy as np
>>> import blosc2
>>> # Create an instance of NDArray with some data
>>> array = np.array([[1, 2, 3], [4, 5, 6]])
>>> nd_array = blosc2.asarray(array)
>>> # Compute the variance of the entire array
>>> var_all = blosc2.var(nd_array)
>>> print("Variance of the entire array:", var_all)
Variance of the entire array: 2.9166666666666665
>>> # Compute the variance along axis 0 (columns)
>>> var_axis0 = blosc2.var(nd_array, axis=0)
>>> print("Variance along axis 0:", var_axis0)
Variance along axis 0: [2.25 2.25 2.25]
property T

Return the transpose of a 2-dimensional array.

property blocks: tuple[int]

The block shape of this container.

property blocksize: int

The block size (in bytes) for this container.

This is a shortcut to SChunk.blocksize and can be accessed through the schunk attribute as well.

See also

schunk

Examples

>>> import blosc2
>>> import numpy as np
>>> array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> ndarray = blosc2.asarray(array)
>>> print("Block size:", ndarray.blocksize)
Block size: 80
property cbytes: int

The number of compressed bytes used by the array.

property chunks: tuple[int]

Returns the data chunk shape of this container.

If chunks is a multiple of blocks in every dimension, it will be the same as ext_chunks.

See also

ext_chunks

property chunksize: int

Returns the data chunk size (in bytes) for this container.

This differs from SChunk.chunksize when chunks is not a multiple of blocks in every dimension (or, equivalently, when chunks is not the same as ext_chunks).

See also

chunks, ext_chunks
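The difference between the logical and the stored chunk size can be worked through by hand. The sketch below assumes that padding rounds each chunk dimension up to the next multiple of the block dimension and that sizes are the element count times the item size; the helper names are hypothetical and nothing here calls blosc2.

```python
import math


def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return -(-n // multiple) * multiple


def chunk_sizes(chunks, blocks, itemsize):
    """Return (logical, padded) chunk sizes in bytes."""
    logical = math.prod(chunks) * itemsize
    padded_shape = [round_up(c, b) for c, b in zip(chunks, blocks)]
    padded = math.prod(padded_shape) * itemsize
    return logical, padded


unaligned_sizes = chunk_sizes((10, 10), (4, 4), itemsize=8)
aligned_sizes = chunk_sizes((8, 8), (4, 4), itemsize=8)
print(unaligned_sizes)  # (800, 1152)
print(aligned_sizes)    # (512, 512)
```

The two sizes coincide only when chunks is a whole number of blocks along every axis.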

property cparams: CParams

The compression parameters used by the array.

property cratio: float

The compression ratio of the array.
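As a rough guide, the ratio can be read as uncompressed bytes divided by compressed bytes. The sketch below assumes that definition (nbytes / cbytes); the helper name and the byte counts are made up for illustration and nothing here calls blosc2.

```python
def compression_ratio(nbytes, cbytes):
    """Uncompressed size divided by compressed size."""
    return nbytes / cbytes


# Illustrative byte counts only, not measured values.
large_ratio = compression_ratio(4_000_000, 50_000)
small_ratio = compression_ratio(80, 110)
print(large_ratio)            # 80.0
print(round(small_ratio, 2))  # 0.73
```

A ratio below 1 means the compressed form is larger than the raw data, which can happen for very small arrays because of per-chunk overhead.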

property dparams: DParams

The decompression parameters used by the array.

property dtype: dtype

Data-type of the array’s elements.

property ext_chunks: tuple[int]

Returns the padded chunk shape which defines the chunksize in the associated schunk.

This will be the chunk shape used to store each chunk, filling the extra positions with zeros (padding). If chunks is a multiple of blocks in every dimension, it will be the same as chunks.

See also

chunks

property ext_shape: tuple[int]

The padded data shape.

The padded data is filled with zeros to make the real data fit into blocks and chunks, but it will never be retrieved as actual data (so the user can ignore this). If shape is a multiple of chunks in every dimension, it will be the same as shape.

See also

shape, chunks
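The padded shapes can be computed by hand. The sketch below assumes that padding rounds each dimension up to the next multiple of the enclosing unit (chunks for the shape, blocks for the chunks); the helper name pad_shape is hypothetical and nothing here calls blosc2.

```python
def pad_shape(shape, unit):
    """Round each dimension of `shape` up to a multiple of `unit`."""
    return tuple(-(-s // u) * u for s, u in zip(shape, unit))


shape, chunks, blocks = (23, 11), (10, 10), (4, 4)
ext_chunks = pad_shape(chunks, blocks)
ext_shape = pad_shape(shape, chunks)
print(ext_chunks)  # (12, 12)
print(ext_shape)   # (30, 20)
```

When a dimension is already an exact multiple, it is left unchanged, which matches the statement that the padded and unpadded shapes then coincide.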

property fields: dict

Dictionary with the fields of the structured array.

Returns:

fields – A dictionary with the fields of the structured array.

Return type:

dict

See also

NDField

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = (10,)
>>> dtype = np.dtype([('a', np.int32), ('b', np.float64)])
>>> # Create a structured array
>>> sa = blosc2.zeros(shape, dtype=dtype)
>>> # Check that fields are equal
>>> assert sa.fields['a'] == sa.fields['b']
property info: InfoReporter

Print information about this array.

Examples

>>> import numpy as np
>>> import blosc2
>>> my_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> array = blosc2.asarray(my_array)
>>> print(array.info)
type    : NDArray
shape   : (10,)
chunks  : (10,)
blocks  : (10,)
dtype   : int64
cratio  : 0.73
cparams : {'blocksize': 80,
'clevel': 1,
'codec': <Codec.ZSTD: 5>,
'codec_meta': 0,
'filters': [<Filter.NOFILTER: 0>,
        <Filter.NOFILTER: 0>,
        <Filter.NOFILTER: 0>,
        <Filter.NOFILTER: 0>,
        <Filter.NOFILTER: 0>,
        <Filter.SHUFFLE: 1>],
'filters_meta': [0, 0, 0, 0, 0, 0],
'nthreads': 4,
'splitmode': <SplitMode.ALWAYS_SPLIT: 1>,
'typesize': 8,
'use_dict': 0}
dparams : {'nthreads': 4}
property info_items: list

A list of tuples with the information about this array. Each tuple contains the name of the attribute and its value.

property keep_last_read: bool

Indicates whether the last read data should be kept in memory.

property meta: dict

The metadata of the array.

property nbytes: int

The number of bytes used by the array.

property ndim: int

The number of dimensions of this container.

property oindex: OIndex

Shortcut for orthogonal (outer) indexing, see get_oselection_numpy()

property schunk: SChunk

The SChunk reference of the NDArray. All the attributes from the SChunk can be accessed through this instance as self.schunk.

See also

SChunk Attributes

property shape: tuple[int]

Returns the data shape of this container.

If shape is a multiple of chunks in every dimension, it will be the same as ext_shape.

See also

ext_shape

property size: int

The size (in bytes) for this container.

property urlpath: str

The URL path of the array.

property vlmeta: dict

The variable-length metadata of the array.

Constructors

arange([start, stop, step, dtype, shape, ...])

Return evenly spaced values within a given interval.

asarray(array, **kwargs)

Convert the array to an NDArray.

concat(arrays, /[, axis])

Concatenate a list of arrays along a specified axis.

copy(array[, dtype])

This is equivalent to NDArray.copy()

empty(shape[, dtype])

Create an empty array.

expand_dims(array[, axis])

Expand the shape of an array by adding new axes at the specified positions.

eye(N[, M, k, dtype])

Return a 2-D array with ones on the diagonal and zeros elsewhere.

frombuffer(buffer, shape[, dtype])

Create an array out of a buffer.

fromiter(iterable, shape, dtype[, c_order])

Create a new array from an iterable object.

full(shape, fill_value[, dtype])

Create an array, with fill_value being used as the default value for uninitialized portions of the array.

linspace(start, stop[, num, endpoint, ...])

Return evenly spaced numbers over a specified interval.

nans(shape[, dtype])

Create an array filled with NaN values.

ndarray_from_cframe(cframe[, copy])

Create a NDArray instance from a contiguous frame buffer.

ones(shape[, dtype])

Create an array filled with ones.

reshape(src, shape[, c_order])

Returns an array containing the same data with a new shape.

stack(arrays[, axis])

Stack multiple arrays, creating a new axis.

uninit(shape[, dtype])

Create an array with uninitialized values.

zeros(shape[, dtype])

Create an array with zero as the default value for uninitialized portions of the array.

blosc2.arange(start: int | float = 0, stop: int | float | None = None, step: int | float | None = 1, dtype: np.dtype | str = np.int64, shape: int | tuple | list | None = None, c_order: bool = True, **kwargs: Any) → NDArray[source]

Return evenly spaced values within a given interval.

Parameters:
  • start (int, float, complex or np.number) – The starting value of the sequence.

  • stop (int, float, complex or np.number) – The end value of the sequence.

  • step (int, float, complex or np.number) – Spacing between values.

  • dtype (np.dtype or list str) – The data type of the array elements in NumPy format. Default is np.int64, as shown in the signature. This will override the typesize in the compression parameters if they are provided.

  • shape (int, tuple or list) – The shape of the final array. If None, the shape will be computed.

  • c_order (bool) – Whether to store the array in C order (row-major) or insertion order. Insertion order means that values will be stored in the array following the order of chunks in the array; this is more memory efficient, as it does not require an intermediate copy of the array. Default is C order.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> # Create an array with values from 0 to 9
>>> array = blosc2.arange(0, 10, 1)
>>> print(array[:])
[0 1 2 3 4 5 6 7 8 9]

blosc2.asarray(array: ndarray | C2Array, **kwargs: Any) NDArray[source]

Convert the array to an NDArray.

Parameters:
  • array (array_like) – An array supporting the NumPy array interface.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – A new NDArray made from array.

Return type:

NDArray

Notes

This will create the NDArray chunk-by-chunk directly from the input array, without the need to create a contiguous NumPy array internally. This can be used for ingesting e.g. disk or network based arrays very effectively and without consuming lots of memory.

Examples

>>> import blosc2
>>> import numpy as np
>>> # Create some data
>>> shape = [25, 10]
>>> a = np.arange(0, np.prod(shape), dtype=np.int64).reshape(shape)
>>> # Create a NDArray from a NumPy array
>>> nda = blosc2.asarray(a)

blosc2.concat(arrays: list[NDArray], /, axis=0, **kwargs: Any) NDArray[source]

Concatenate a list of arrays along a specified axis.

Parameters:
  • arrays (list of NDArray) – A list containing two or more NDArray instances to be concatenated.

  • axis (int, optional) – The axis along which the arrays will be concatenated. Default is 0.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – A new NDArray containing the concatenated data.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> arr1 = blosc2.arange(0, 5, dtype=np.int32)
>>> arr2 = blosc2.arange(5, 10, dtype=np.int32)
>>> result = blosc2.concat([arr1, arr2])
>>> print(result[:])
[0 1 2 3 4 5 6 7 8 9]

blosc2.copy(array: NDArray, dtype: dtype | str = None, **kwargs: Any) NDArray[source]

This is equivalent to NDArray.copy()

Examples

>>> import numpy as np
>>> import blosc2
>>> # Create an instance of NDArray with some data
>>> original_array = blosc2.asarray(np.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]]))
>>> # Create a copy of the array without changing dtype
>>> copied_array = blosc2.copy(original_array)
>>> print("Copied array (default dtype):")
Copied array (default dtype):
>>> print(copied_array[:])
[[1.1 2.2 3.3]
 [4.4 5.5 6.6]]

blosc2.empty(shape: int | tuple | list, dtype: ~numpy.dtype | str | None = <class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Create an empty array.

Parameters:
  • shape (int, tuple or list) – The shape for the final array.

  • dtype (np.dtype or str) – The data type of the array elements in NumPy format. Default is np.float64. This will override the typesize in the compression parameters if they are provided.

  • kwargs (dict, optional) –

    Keyword arguments supported:
    chunks: tuple or list

    The chunk shape. If None (default), Blosc2 will compute an efficient chunk shape.

    blocks: tuple or list

    The block shape. If None (default), Blosc2 will compute an efficient block shape. This will override the blocksize in the cparams if they are provided.

    The other keyword arguments supported are the same as for the SChunk.__init__ constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [20, 20]
>>> dtype = np.int32
>>> # Create empty array with default chunks and blocks
>>> array = blosc2.empty(shape, dtype=dtype)
>>> array.shape
(20, 20)
>>> array.dtype
dtype('int32')

blosc2.expand_dims(array: NDArray, axis=0) NDArray[source]

Expand the shape of an array by adding new axes at the specified positions.

Parameters:
  • array (NDArray) – The array to be expanded.

  • axis (int or list of int, optional) – Position in the expanded axes where the new axis (or axes) is placed. Default is 0.

Returns:

out – A new NDArray with the expanded shape.

Return type:

NDArray

blosc2.eye(N, M=None, k=0, dtype=<class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Return a 2-D array with ones on the diagonal and zeros elsewhere.

Parameters:
  • N (int) – Number of rows in the output.

  • M (int, optional) – Number of columns in the output. If None, defaults to N.

  • k (int, optional) – Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.

  • dtype (np.dtype or str) – The data type of the array elements in NumPy format. Default is np.float64.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> array = blosc2.eye(2, 3, dtype=np.int32)
>>> print(array[:])
[[1 0 0]
 [0 1 0]]

blosc2.frombuffer(buffer: bytes, shape: int | tuple | list, dtype: ~numpy.dtype | str = <class 'numpy.uint8'>, **kwargs: ~typing.Any) NDArray[source]

Create an array out of a buffer.

Parameters:
  • buffer (bytes) – The buffer of the data to populate the container.

  • shape (int, tuple or list) – The shape for the final container.

  • dtype (np.dtype or str) – The ndarray dtype in NumPy format. Default is np.uint8. This will override the typesize in the cparams if they are passed.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [25, 10]
>>> chunks = (49, 49)
>>> dtype = np.dtype("|S8")
>>> typesize = dtype.itemsize
>>> # Create a buffer
>>> buffer = bytes(np.random.normal(0, 1, np.prod(shape)) * typesize)
>>> # Create a NDArray from a buffer with default blocks
>>> a = blosc2.frombuffer(buffer, shape, chunks=chunks, dtype=dtype)

blosc2.fromiter(iterable, shape, dtype, c_order=True, **kwargs) NDArray[source]

Create a new array from an iterable object.

Parameters:
  • iterable (iterable) – An iterable object providing data for the array.

  • shape (int, tuple or list) – The shape of the final array.

  • dtype (np.dtype or str) – The data type of the array elements in NumPy format.

  • c_order (bool) – Whether to store the array in C order (row-major) or insertion order. Insertion order means that iterable values will be stored in the array following the order of chunks in the array; this is more memory efficient, as it does not require an intermediate copy of the array. Default is C order.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> # Create an array from an iterable
>>> array = blosc2.fromiter(range(10), shape=(10,), dtype=np.int64)
>>> print(array[:])
[0 1 2 3 4 5 6 7 8 9]

blosc2.full(shape: int | tuple | list, fill_value: bytes | int | float | bool, dtype: dtype | str = None, **kwargs: Any) NDArray[source]

Create an array, with fill_value being used as the default value for uninitialized portions of the array.

Parameters:
  • shape (int, tuple or list) – The shape of the final array.

  • fill_value (bytes, int, float or bool) – Default value to use for uninitialized portions of the array. Its size will override the typesize in the cparams if they are passed.

  • dtype (np.dtype or str) – The ndarray dtype in NumPy format. By default, this will be taken from the fill_value. This will override the typesize in the cparams if they are passed.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [25, 10]
>>> # Create array filled with True
>>> array = blosc2.full(shape, True)
>>> array.shape
(25, 10)
>>> array.dtype
dtype('bool')

blosc2.linspace(start, stop, num=None, endpoint=True, dtype=<class 'numpy.float64'>, shape=None, c_order=True, **kwargs: ~typing.Any) NDArray[source]

Return evenly spaced numbers over a specified interval.

This is similar to numpy.linspace, but it returns an NDArray instead of a NumPy array. It also supports a shape parameter for returning a multidimensional array.

Parameters:
  • start (int, float, complex or np.number) – The starting value of the sequence.

  • stop (int, float, complex or np.number) – The end value of the sequence.

  • num (int) – Number of samples to generate.

  • endpoint (bool) – If True, stop is the last sample. Otherwise, it is not included.

  • dtype (np.dtype or str) – The data type of the array elements in NumPy format. Default is np.float64.

  • shape (int, tuple or list) – The shape of the final array. If None, the shape will be guessed from num.

  • c_order (bool) – Whether to store the array in C order (row-major) or insertion order. Insertion order means that values will be stored in the array following the order of chunks in the array; this is more memory efficient, as it does not require an intermediate copy of the array. Default is C order.

Returns:

out – An NDArray is returned.

Return type:

NDArray

blosc2.nans(shape: int | tuple | list, dtype: ~numpy.dtype | str = <class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Create an array with NaN values.

The parameters and keyword arguments are the same as for the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> shape = [8, 8]
>>> chunks = [6, 5]
>>> # Create an array of NaNs
>>> array = blosc2.nans(shape, dtype='f8', chunks=chunks)
>>> array.shape
(8, 8)
>>> array.chunks
(6, 5)
>>> array.dtype
dtype('float64')

blosc2.ndarray_from_cframe(cframe: bytes | str, copy: bool = False) NDArray[source]

Create an NDArray instance from a contiguous frame buffer.

Parameters:
  • cframe (bytes or str) – The bytes object containing the in-memory cframe.

  • copy (bool) – Whether to internally make a copy. If False, the user is responsible for keeping a reference to cframe. Default is False.

Returns:

out – A new NDArray containing the data passed.

Return type:

NDArray

See also

to_cframe()

blosc2.ones(shape: int | tuple | list, dtype: ~numpy.dtype | str = <class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Create an array filled with ones.

The parameters and keyword arguments are the same as for the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [8, 8]
>>> chunks = [6, 5]
>>> blocks = [5, 5]
>>> dtype = np.float64
>>> # Create ones array
>>> array = blosc2.ones(shape, dtype=dtype, chunks=chunks, blocks=blocks)
>>> array.shape
(8, 8)
>>> array.chunks
(6, 5)
>>> array.blocks
(5, 5)
>>> array.dtype
dtype('float64')

blosc2.reshape(src: NDArray | NDField | LazyArray | C2Array, shape: tuple | list, c_order: bool = True, **kwargs: Any) NDArray[source]

Returns an array containing the same data with a new shape.

This only works when src is 1-dimensional. Reshaping a multidimensional src is not supported yet.

Parameters:
  • src (NDArray or NDField or LazyArray or C2Array) – The input array.

  • shape (tuple or list) – The new shape of the array. It should have the same number of elements as the current shape.

  • c_order (bool) – Whether to reshape the array in C order (row-major) or insertion order. Insertion order means that values will be stored in the array following the order of chunks in the source array. Default is C order.

  • kwargs (dict, optional) – Additional keyword arguments supported by the empty() constructor.

Returns:

out – A new array with the requested shape.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [23 * 11]
>>> a = np.arange(np.prod(shape))
>>> # Create an array
>>> b = blosc2.asarray(a)
>>> # Reshape the array
>>> c = blosc2.reshape(b, (11, 23))
>>> print(c.shape)
(11, 23)

blosc2.stack(arrays: list[NDArray], axis=0, **kwargs: Any) NDArray[source]

Stack multiple arrays, creating a new axis.

Parameters:
  • arrays (list of NDArray) – A list containing two or more NDArray instances to be stacked.

  • axis (int, optional) – The new axis along which the arrays will be stacked. Default is 0.

  • kwargs (dict, optional) – Keyword arguments that are supported by the empty() constructor.

Returns:

out – A new NDArray containing the stacked data.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> arr1 = blosc2.arange(0, 6, dtype=np.int32, shape=(2,3))
>>> arr2 = blosc2.arange(6, 12, dtype=np.int32, shape=(2,3))
>>> result = blosc2.stack([arr1, arr2])
>>> print(result.shape)
(2, 2, 3)

blosc2.uninit(shape: int | tuple | list, dtype: ~numpy.dtype | str = <class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Create an array with uninitialized values.

The parameters and keyword arguments are the same as for the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> shape = [8, 8]
>>> chunks = [6, 5]
>>> # Create uninitialized array
>>> array = blosc2.uninit(shape, dtype='f8', chunks=chunks)
>>> array.shape
(8, 8)
>>> array.chunks
(6, 5)
>>> array.dtype
dtype('float64')

blosc2.zeros(shape: int | tuple | list, dtype: ~numpy.dtype | str = <class 'numpy.float64'>, **kwargs: ~typing.Any) NDArray[source]

Create an array with zero as the default value for uninitialized portions of the array.

The parameters and keyword arguments are the same as for the empty() constructor.

Returns:

out – An NDArray is returned.

Return type:

NDArray

Examples

>>> import blosc2
>>> import numpy as np
>>> shape = [8, 8]
>>> chunks = [6, 5]
>>> blocks = [5, 5]
>>> dtype = np.float64
>>> # Create zeros array
>>> array = blosc2.zeros(shape, dtype=dtype, chunks=chunks, blocks=blocks)
>>> array.shape
(8, 8)
>>> array.chunks
(6, 5)
>>> array.blocks
(5, 5)
>>> array.dtype
dtype('float64')