LazyArray UDF DSL Kernels¶
@blosc2.dsl_kernel lets you write kernels with Python function syntax while executing through the miniexpr DSL path.
Use DSL kernels when you want:
A vectorized UDF model (operate over NDArray chunks/blocks, not Python scalar loops)
Optional JIT compilation via miniexpr backends (for example tcc/cc) without requiring Numba
Early syntax validation and actionable diagnostics for unsupported constructs
This tutorial complements 03.lazyarray-udf.ipynb (generic Python UDFs).
For the canonical DSL syntax contract, see the DSL syntax reference.
Choosing the Right Interface¶
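| Goal | Recommended API |
|---|---|
| Elementwise formulas using built-in functions/operators | blosc2.lazyexpr(...) |
| Arbitrary Python logic over blocks/chunks | blosc2.lazyudf(...) with a regular Python function |
| DSL subset with early syntax checks and optional miniexpr JIT | @blosc2.dsl_kernel + blosc2.lazyudf(...) |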
[1]:
import numpy as np
import blosc2
1. Define a DSL Kernel¶
A valid DSL kernel can be used with blosc2.lazyudf(...) like a regular UDF.
[2]:
@blosc2.dsl_kernel
def kernel_index_ramp(x):
    # _i* and _n* are reserved DSL index/shape symbols, so disable linter warnings
    return x + _i0 * _n1 + _i1  # noqa: F821
[3]:
shape = (5, 10)
x = blosc2.ones(shape, dtype=np.float32)
expr = blosc2.lazyudf(kernel_index_ramp, (x,), dtype=np.float32)
res = expr[:]
res
[3]:
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[11., 12., 13., 14., 15., 16., 17., 18., 19., 20.],
[21., 22., 23., 24., 25., 26., 27., 28., 29., 30.],
[31., 32., 33., 34., 35., 36., 37., 38., 39., 40.],
[41., 42., 43., 44., 45., 46., 47., 48., 49., 50.]], dtype=float32)
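As a cross-check on what the reserved symbols mean, the same values can be reproduced in plain NumPy. This is a minimal sketch, assuming `_i0` and `_i1` are the global row and column index of each element and `_n1` is the length of the second dimension, which is consistent with the output above. `np.indices` is only a stand-in for the DSL symbols and is not part of the blosc2 API.

```python
import numpy as np

shape = (5, 10)
x = np.ones(shape, dtype=np.float32)

# i0[r, c] == r and i1[r, c] == c: NumPy stand-ins for _i0 and _i1 (assumption)
i0, i1 = np.indices(shape)
n1 = shape[1]  # stand-in for _n1

# i0 * n1 + i1 is the row-major flat position of each element
reference = (x + i0 * n1 + i1).astype(np.float32)
print(reference[0, :3], reference[-1, -1])
```

Read this way, the kernel adds each element's row-major flat position to the input value.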
[4]:
# Optional: request miniexpr JIT backend for this DSL kernel
try:
    expr_jit = blosc2.lazyudf(
        kernel_index_ramp,
        (x,),
        dtype=x.dtype,
        jit=True,
        jit_backend="tcc",
    )
    res_jit = expr_jit.compute()
    res_jit[:2, :5]
except Exception as e:
    print(f"JIT backend unavailable in this environment: {e}")
1.a Zero-Parameter DSL Kernel¶
Kernels with no parameters are also valid. When inputs is empty, you must pass an explicit output shape to lazyudf(...).
[5]:
@blosc2.dsl_kernel
def kernel_no_inputs():
    return _i0 + 10 * _i1  # noqa: F821

expr0 = blosc2.lazyudf(kernel_no_inputs, (), dtype=np.int32, shape=(3, 4))
res0 = expr0[:]
res0
[5]:
array([[ 0, 10, 20, 30],
[ 1, 11, 21, 31],
[ 2, 12, 22, 32]], dtype=int32)
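With no input arrays there is nothing to infer the output shape from, which is why `shape` has to be given explicitly. A rough NumPy analogue is `np.fromfunction`, which also builds an array purely from index positions and a shape. The sketch below assumes `_i0` and `_i1` correspond to the two index arrays that `np.fromfunction` passes in; the name `reference0` is illustrative only.

```python
import numpy as np

shape = (3, 4)

# np.fromfunction calls the lambda once with full index arrays (i0 = rows, i1 = columns)
reference0 = np.fromfunction(lambda i0, i1: i0 + 10 * i1, shape, dtype=np.int32)
print(reference0)
```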
1.b DSL Kernel with Multiple Parameters¶
Kernels with more than one parameter work the same way; all inputs are passed through lazyudf(...) in a tuple.
[6]:
@blosc2.dsl_kernel
def kernel_weighted_mix(x, y, b):
    return 0.25 * x + 2.0 * y + b

xw = blosc2.asarray(np.arange(12, dtype=np.float32).reshape(3, 4))
yw = blosc2.ones((3, 4), dtype=np.float32)
bw = 32.4
resw = blosc2.lazyudf(kernel_weighted_mix, (xw, yw, bw), dtype=np.float32)[:]
resw[:2, :3]
[6]:
array([[34.4 , 34.65, 34.9 ],
[35.4 , 35.65, 35.9 ]], dtype=float32)
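Note that the third input, `bw`, is a plain Python scalar rather than an array. The output above is consistent with the scalar being applied to every element, as in NumPy broadcasting. A small NumPy-only sketch of the same arithmetic (the names `x`, `y`, `b`, and `reference_w` are illustrative):

```python
import numpy as np

x = np.arange(12, dtype=np.float32).reshape(3, 4)
y = np.ones((3, 4), dtype=np.float32)
b = 32.4  # Python scalar, combined with every element

# Same formula as kernel_weighted_mix, evaluated on whole arrays
reference_w = (0.25 * x + 2.0 * y + b).astype(np.float32)
print(reference_w[:2, :3])
```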
2. Preflight Validation (validate_dsl)¶
You can validate a kernel and inspect diagnostics without executing it.
Common Diagnostics Cheat Sheet¶
Ternary expression (a if cond else b) is unsupported: use where(cond, a, b).
Reserved names (int, float, bool, print, _ndim, _i*, _n*) cannot be reused.
Missing return on an executed path can fail at runtime, even if compilation succeeds.
[7]:
report_ok = blosc2.validate_dsl(kernel_index_ramp)
report_ok
[7]:
{'valid': True,
'dsl_source': 'def kernel_index_ramp(x):\n # _i* and _n* are reserved DSL index/shape symbols, so disable linter warnings\n return x + _i0 * _n1 + _i1 # noqa: F821',
'input_names': ['x'],
'error': None}
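Because the report is a plain dictionary, it is easy to gate execution on it or log a short summary. The helper below is a hypothetical convenience function (it is not part of blosc2) that relies only on the `valid`, `input_names`, and `error` keys shown in the output above; the two sample dictionaries are hand-written stand-ins for real reports.

```python
def summarize_report(report):
    """Return a one-line summary of a validate_dsl-style report dict."""
    if report["valid"]:
        names = ", ".join(report["input_names"]) or "<none>"
        return f"valid DSL kernel (inputs: {names})"
    # Keep only the first line of a possibly multi-line diagnostic
    lines = str(report["error"]).splitlines()
    first_line = lines[0] if lines else "unknown error"
    return f"invalid DSL kernel: {first_line}"


# Hand-written samples mirroring the keys of a real report (assumption)
ok = {"valid": True, "dsl_source": "...", "input_names": ["x"], "error": None}
bad = {
    "valid": False,
    "dsl_source": "...",
    "input_names": ["x"],
    "error": "Ternary expressions are not supported in DSL\nDSL kernel source: ...",
}
print(summarize_report(ok))
print(summarize_report(bad))
```

In a real session, the dictionary returned by `blosc2.validate_dsl(...)` would be passed in place of the samples.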
3. Invalid Syntax Examples¶
validate_dsl helps catch unsupported constructs early, before running lazyudf(...).
3.a Ternary Expressions Are Not Supported¶
[8]:
@blosc2.dsl_kernel
def kernel_invalid_ternary(x):
    return 1 if x else 0
[9]:
report_bad_ternary = blosc2.validate_dsl(kernel_invalid_ternary)
print(report_bad_ternary["valid"])
print(report_bad_ternary["error"])
False
Ternary expressions are not supported in DSL; use where(cond, a, b) at line 2, column 14
DSL kernel source:
1 | def kernel_invalid_ternary(x):
2 | return 1 if x else 0
| ^
See: https://github.com/Blosc/miniexpr/blob/main/doc/dsl-usage.md
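The diagnostic points to `where(cond, a, b)` as the replacement. To illustrate the idea of an elementwise select, here is a NumPy-only sketch. It assumes the DSL's `where` behaves like `np.where`, which this tutorial does not spell out, and it treats the truthiness test `if x` as `x != 0`.

```python
import numpy as np

x = np.array([-1.0, 0.0, 2.5], dtype=np.float32)

# A Python ternary picks one branch for a single scalar value;
# np.where makes the same choice independently for every element.
selected = np.where(x != 0, 1, 0)
print(selected)
```

Under that assumption, the invalid kernel body would presumably be rewritten along the lines of `return where(x != 0, 1, 0)`.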
3.b Reserved Names Cannot Be Reused¶
[15]:
@blosc2.dsl_kernel
def kernel_invalid_reserved_name(x):
    int = x + 1
    return int + 2
[11]:
report_bad_reserved = blosc2.validate_dsl(kernel_invalid_reserved_name)
print(report_bad_reserved["valid"])
print(report_bad_reserved["error"])
True
None
4. Control Flow and Casts¶
The DSL supports if/else blocks and cast intrinsics such as float(...).
[12]:
@blosc2.dsl_kernel
def kernel_clip_and_scale(x):
    if x < 0:
        y = 0
    else:
        y = x
    return float(y) * 0.5

x2_np = np.linspace(-2.0, 2.0, num=10, dtype=np.float32).reshape(2, 5)
x2 = blosc2.asarray(x2_np)
res2 = blosc2.lazyudf(kernel_clip_and_scale, (x2,), dtype=np.float32)[:]
res2
[12]:
array([[0. , 0. , 0. , 0. , 0. ],
[0.11111111, 0.33333334, 0.5555556 , 0.7777778 , 1. ]],
dtype=float32)
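The output above suggests that the `if`/`else` block is applied per element: negative inputs become 0 and the rest are halved. A NumPy sketch of the same logic, assuming that per-element reading, expresses the branch as a select (the names `y` and `reference2` are illustrative):

```python
import numpy as np

x2_np = np.linspace(-2.0, 2.0, num=10, dtype=np.float32).reshape(2, 5)

# "if x < 0: y = 0 else: y = x" written as an elementwise select
y = np.where(x2_np < 0, 0, x2_np)

# float(y) * 0.5 in the kernel corresponds to a cast followed by a scale
reference2 = (y.astype(np.float32) * 0.5).astype(np.float32)
print(reference2)
```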
5. Loops and Reserved ND Symbols¶
You can use for ... in range(...) together with reserved symbols like _i0, _i1, _n0, _n1 and _flat_idx.
[13]:
@blosc2.dsl_kernel
def kernel_add_triangular_col_index(x):
    acc = 0
    for j in range(_i1 + 1):  # noqa: F821
        acc += j
    return x + acc

x3 = blosc2.zeros((2, 5), dtype=np.float32)
res3 = blosc2.lazyudf(kernel_add_triangular_col_index, (x3,), dtype=np.float32)[:]
res3
[13]:
array([[ 0., 1., 3., 6., 10.],
[ 0., 1., 3., 6., 10.]], dtype=float32)
[14]:
expected = np.array([0, 1, 3, 6, 10], dtype=np.float32)
np.allclose(res3[0], expected), res3[0]
[14]:
(True, array([ 0., 1., 3., 6., 10.], dtype=float32))
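The expected values are the triangular numbers: for column index c, the loop sums 0 + 1 + ... + c, which equals c(c + 1)/2. A short NumPy-free loop and the closed form can be compared directly. This is a sketch in which `col` stands in for `_i1`, and the helper name `triangular_by_loop` is illustrative.

```python
import numpy as np


def triangular_by_loop(col):
    """Mirror the kernel's loop for one column index (col stands in for _i1)."""
    acc = 0
    for j in range(col + 1):
        acc += j
    return acc


cols = np.arange(5)
loop_values = np.array([triangular_by_loop(int(c)) for c in cols])

# Closed form of the same sum: c * (c + 1) / 2
closed_form = cols * (cols + 1) // 2
print(loop_values, closed_form)
```

Since the sum depends only on the column index, every row of the result is identical, as seen in `res3`.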
6. Advanced Examples¶
For more advanced real-world DSL kernels, see:
examples/ndarray/mandelbrot-dsl.ipynb
examples/ndarray/black-scholes_hist-dsl.ipynb
GitHub links: