Postfilters: compute after the filter pipeline

As with prefilters, python-blosc2 also lets you set a Python function as a postfilter, which is executed after the data has been decompressed. Let’s see how it works with a simple example!

Setting a postfilter

As with prefilters, to set a postfilter on a schunk, the number of threads used for decompression has to be 1:

import blosc2
import numpy as np

cparams = {
    "typesize": 8,  # bytes per item (np.int64)
}

dparams = {
    "nthreads": 1,  # a postfilter requires single-threaded decompression
}

storage = {
    "cparams": cparams,
    "dparams": dparams,
}

chunk_len = 10_000
data = np.zeros(chunk_len * 3, dtype=np.int64)
schunk = blosc2.SChunk(chunksize=chunk_len * 8, data=data, **storage)

Great! Now we can create our postfilter with its decorator. First, write a function that receives three parameters: input, output and the offset in the schunk where the block starts. Then apply the decorator, passing it the input data type that the postfilter will receive and, optionally, the output data type that it will fill:

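input_dtype = np.int64

# Register the function below as the schunk's postfilter
@schunk.postfilter(input_dtype)
def postfilter(input, output, offset):
    # The stored data is all zeros, so each item ends up equal to its global index
    output[:] = input + np.arange(input.size) + offset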
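
The two data types named for the decorator describe the arrays that are handed to the function. As a rough illustration, the sketch below calls a postfilter-style function by hand on plain NumPy arrays, reading np.int64 and filling np.float64. The name `halve` and the direct call are illustrative only (in python-blosc2 the library makes the call), and the item-size check reflects an assumption that both data types must have the same item size, which is worth confirming in the SChunk.postfilter reference.

```python
import numpy as np

input_dtype = np.dtype(np.int64)
output_dtype = np.dtype(np.float64)

# Assumption: input and output items occupy the same number of bytes.
assert input_dtype.itemsize == output_dtype.itemsize


def halve(input, output, offset):
    # Read integers, write floats; offset is unused in this sketch.
    output[:] = input / 2


block = np.arange(6, dtype=input_dtype)            # stands in for a decompressed block
filled = np.empty(block.size, dtype=output_dtype)  # buffer the function must fill
halve(block, filled, 0)
```

On a real schunk the function would be registered through the decorator with both data types instead of being called directly.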

Let’s check that the postfilter is being executed when reading data:

out = np.empty(data.size, dtype=input_dtype)
schunk.get_slice(out=out)  # decompress into out; the postfilter runs on each block
out
array([    0,     1,     2, ..., 29997, 29998, 29999])

Perfect, we have implemented an arange with a postfilter!
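
To see why the result is an arange, it can help to imitate the read path with plain NumPy. The sketch below is not how Blosc2 is implemented; it only assumes that the postfilter is called once per decompressed block and that offset counts items from the start of the schunk, which is consistent with the output shown above. The helper `apply_blockwise` and the block length of 2_500 are hypothetical (Blosc2 chooses the real block size itself).

```python
import numpy as np


def postfilter(input, output, offset):
    # Same body as the postfilter in this tutorial.
    output[:] = input + np.arange(input.size) + offset


def apply_blockwise(func, stored, block_len):
    """Hypothetical stand-in for the read path: call func once per block."""
    result = np.empty_like(stored)
    for start in range(0, stored.size, block_len):
        stop = min(start + block_len, stored.size)
        # input is the decompressed block, output is the slot to fill,
        # and offset is the position (in items) where the block starts.
        func(stored[start:stop], result[start:stop], start)
    return result


stored = np.zeros(30_000, dtype=np.int64)  # what the schunk actually holds
simulated = apply_blockwise(postfilter, stored, block_len=2_500)
```

Each call fills its block with offset, offset + 1, and so on, so the blocks join up into the values 0 to 29999 whatever block length is used, while the stored zeros are left untouched.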

Removing a postfilter

If we do not want the postfilter to be executed anymore, we can remove it from the schunk by passing the function name:

schunk.remove_postfilter("postfilter")


Re-enabling parallelism

Now that we do not have a postfilter, it is safe to activate multi-threading:

schunk.dparams = {"nthreads": 8}

Finally, let’s check that the data stored in the schunk is still the original data passed to the SChunk constructor:

schunk.get_slice(out=out)  # no postfilter now, so out receives the stored zeros
out
array([0, 0, 0, ..., 0, 0, 0])

Postfilters can also be applied to the data of an NDArray through its underlying one-dimensional SChunk (NDArray.schunk).

That’s all for now. There are more examples in the examples directory for you to explore. Enjoy!