blosc2.lazyudf

blosc2.lazyudf(func: Callable[[tuple, np.ndarray, tuple[int]], None], inputs: tuple | list | None, dtype: np.dtype, shape: tuple | list | None = None, chunked_eval: bool = True, **kwargs: Any) → LazyUDF

Get a LazyUDF from a Python user-defined function.

Parameters:
  • func (Python function) – The user-defined function to apply to each block. This function will always receive the following parameters:
      - inputs_tuple: A tuple containing the corresponding slice for the block of each input in inputs.
      - output: The buffer to be filled, as a multidimensional numpy.ndarray.
      - offset: The multidimensional offset corresponding to the start of the block being computed.

  • inputs (tuple or list or None) – The sequence of inputs. Supported inputs are: numpy.ndarray, NDArray, NDField, C2Array. Any other object is also accepted and will be passed as is to the user-defined function. If no inputs are needed, this can be empty, but then shape must be provided.

  • dtype (np.dtype) – The resulting ndarray dtype in NumPy format.

  • shape (tuple, optional) – The shape of the resulting array. If None, the shape will be guessed from inputs.

  • chunked_eval (bool, optional) – Whether to evaluate the function in chunks (True) or in blocks (False).

  • kwargs (Any, optional) – Keyword arguments that are supported by the empty() constructor. These arguments will be used by the LazyArray.__getitem__() and LazyArray.compute() methods. The latter will ignore the urlpath parameter passed to this function.

Returns:

out – A LazyUDF is returned.

Return type:

LazyUDF

Examples

>>> import blosc2
>>> import numpy as np
>>> dtype = np.float64
>>> shape = [3, 3]
>>> size = shape[0] * shape[1]
>>> a = np.linspace(0, 10, num=size, dtype=dtype).reshape(shape)
>>> b = np.linspace(10, 20, num=size, dtype=dtype).reshape(shape)
>>> a1 = blosc2.asarray(a)
>>> b1 = blosc2.asarray(b)
>>> # Define a user-defined function that will be applied to each block of data
>>> def my_function(inputs_tuple, output, offset):
...     a, b = inputs_tuple
...     output[:] = a + b
>>> # Create a LazyUDF object using the user-defined function
>>> lazy_udf = blosc2.lazyudf(my_function, [a1, b1], dtype)
>>> type(lazy_udf)
<class 'blosc2.lazyexpr.LazyUDF'>
>>> print(f"Result of LazyUDF evaluation:\n{lazy_udf[:]}")
Result of LazyUDF evaluation:
[[10.  12.5 15. ]
 [17.5 20.  22.5]
 [25.  27.5 30. ]]
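
The role of the offset argument described under func may be easier to see in a plain NumPy emulation. The sketch below is only an illustration of the calling convention (inputs_tuple, output, offset): apply_blockwise and its block_shape argument are hypothetical helpers invented here, not part of blosc2, and the library chooses its own block boundaries. It also assumes every input is an array with the same shape as the output.

```python
import itertools

import numpy as np


def apply_blockwise(func, inputs, dtype, shape, block_shape):
    """Hypothetical driver (not blosc2): call `func` once per block."""
    out = np.empty(shape, dtype=dtype)
    # Start index of every block along each dimension
    starts = [range(0, dim, step) for dim, step in zip(shape, block_shape)]
    for offset in itertools.product(*starts):
        # Slices covering this block; edge blocks may be smaller
        block = tuple(
            slice(start, min(start + step, dim))
            for start, step, dim in zip(offset, block_shape, shape)
        )
        inputs_tuple = tuple(x[block] for x in inputs)
        # out[block] is a view, so whatever func writes lands in out
        func(inputs_tuple, out[block], offset)
    return out


def add_row_index(inputs_tuple, output, offset):
    (a,) = inputs_tuple
    # Global row index = block start along axis 0 + local row index
    rows = offset[0] + np.arange(output.shape[0])
    output[:] = a + rows[:, np.newaxis]


a = np.zeros((4, 6))
result = apply_blockwise(add_row_index, (a,), np.float64, a.shape, (2, 3))
```

Here add_row_index never sees the full array, yet each row of result ends up holding its global row index, because offset tells the function where the current block starts. A function with the same three-parameter signature could be passed as func to blosc2.lazyudf.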