Super-chunk#

This API describes the new Blosc 2 container, the super-chunk (or schunk for short).

struct blosc2_storage#

This struct holds the storage parameters for a Blosc 2 container. It specifies, for example, how to interpret the contents included in the schunk.

Public Members

bool contiguous#

Whether the chunks are contiguous or sparse.

char *urlpath#

The path for persistent storage. If NULL, that means in-memory.

blosc2_cparams *cparams#

The compression params when creating a schunk.

If NULL, sensible defaults are used depending on the context.

blosc2_dparams *dparams#

The decompression params when creating a schunk.

If NULL, sensible defaults are used depending on the context.

blosc2_io *io#

Input/output backend.

struct blosc2_schunk#

This struct is the standard container for Blosc 2 compressed data.

This is essentially a container for Blosc 1 chunks of compressed data, and it overcomes the 32-bit limitation of Blosc 1. Optionally, a blosc2_frame can be attached to store the compressed chunks contiguously.

Public Members

uint8_t compcode#

The default compressor. Each chunk can override this.

uint8_t compcode_meta#

The default compressor metadata. Each chunk can override this.

uint8_t clevel#

The compression level and other compress params.

uint8_t splitmode#

The split mode.

int32_t typesize#

The type size.

int32_t blocksize#

The requested size of the compressed blocks (0 means automatic).

int32_t chunksize#

Size of each chunk. 0 if not a fixed chunksize.

uint8_t filters[BLOSC2_MAX_FILTERS]#

The (sequence of) filters, 8 bits per filter.

uint8_t filters_meta[BLOSC2_MAX_FILTERS]#

Metadata for filters, 8 bits per meta-slot.

int64_t nchunks#

Number of chunks in super-chunk.

int64_t current_nchunk#

The current chunk that is being accessed.

int64_t nbytes#

The data size (uncompressed).

int64_t cbytes#

The data size + chunks header size (compressed).

uint8_t **data#

Pointer to chunk data pointers buffer.

size_t data_len#

Length of the chunk data pointers buffer.

blosc2_storage *storage#

Pointer to storage info.

blosc2_frame *frame#

Pointer to frame used as store for chunks.

uint8_t* ctx; Context for the thread holder. NULL if not acquired.

blosc2_context *cctx#

Context for compression.

blosc2_context *dctx#

Context for decompression.

struct blosc2_metalayer *metalayers[16]#

The array of metalayers.

uint16_t nmetalayers#

The number of metalayers in the super-chunk.

int16_t nvlmetalayers#

The number of variable-length metalayers.

void *tuner_params#

Tune configuration.

blosc2_schunk *blosc2_schunk_new(blosc2_storage *storage)#

Create a new super-chunk.

Remark

If storage.urlpath is not NULL, the data is stored on disk. If the data file(s) already exist, they are overwritten.

Parameters
  • storage – The storage properties.

Returns

The new super-chunk.

int blosc2_schunk_free(blosc2_schunk *schunk)#

Release resources from a super-chunk.

Remark

All the memory resources attached to the super-chunk are freed. If the super-chunk is on disk, the data remains there so that it can be reopened later.

Parameters
  • schunk – The super-chunk to be freed.

Returns

0 on success.

blosc2_schunk *blosc2_schunk_open(const char *urlpath)#

Open an existing super-chunk that is on-disk (frame).

No in-memory copy is made.

Parameters
  • urlpath – The file name.

Returns

The new super-chunk. NULL if not found or not in frame format.

blosc2_schunk *blosc2_schunk_open_offset(const char *urlpath, int64_t offset)#

Open an existing super-chunk that is on-disk (frame).

No in-memory copy is made.

Parameters
  • urlpath – The file name.

  • offset – The frame offset.

Returns

The new super-chunk. NULL if not found or not in frame format.

blosc2_schunk *blosc2_schunk_open_udio(const char *urlpath, const blosc2_io *udio)#

Open an existing super-chunk (no copy is made) using a user-defined I/O interface.

Parameters
  • urlpath – The file name.

  • udio – The user-defined I/O interface.

Returns

The new super-chunk.

blosc2_schunk *blosc2_schunk_copy(blosc2_schunk *schunk, blosc2_storage *storage)#

Create a copy of a super-chunk.

Parameters
  • schunk – The super-chunk to be copied.

  • storage – The storage properties.

Returns

The new super-chunk.

blosc2_schunk *blosc2_schunk_from_buffer(uint8_t *cframe, int64_t len, bool copy)#

Create a super-chunk out of a contiguous frame buffer.

Remark

If copy is false, the super-chunk takes ownership of the cframe buffer and frees it automatically when blosc2_schunk_free() is called. Do not free the buffer yourself after this call, since that would result in a double free. If you need to keep ownership of the buffer, set copy to true.

Parameters
  • cframe – The buffer of the in-memory frame.

  • len – The length of the buffer (in bytes).

  • copy – Whether the super-chunk should make a copy of the cframe data or not. The copy will be made to an internal sparse frame.

Returns

The new super-chunk.

int64_t blosc2_schunk_to_buffer(blosc2_schunk *schunk, uint8_t **cframe, bool *needs_free)#

int64_t blosc2_schunk_to_file(blosc2_schunk *schunk, const char *urlpath)#

int64_t blosc2_schunk_append_file(blosc2_schunk *schunk, const char *urlpath)#

int blosc2_schunk_get_cparams(blosc2_schunk *schunk, blosc2_cparams **cparams)#

Return the cparams struct associated with a super-chunk.

Parameters
  • schunk – The super-chunk from where to extract the compression parameters.

  • cparams – The pointer where the compression params will be returned.

Returns

0 on success. Otherwise, a negative code is returned.

Warning

A new struct is allocated, and the user should free it after use.

int blosc2_schunk_get_dparams(blosc2_schunk *schunk, blosc2_dparams **dparams)#

Return the dparams struct associated with a super-chunk.

Parameters
  • schunk – The super-chunk from where to extract the decompression parameters.

  • dparams – The pointer where the decompression params will be returned.

Returns

0 on success. Otherwise, a negative code is returned.

Warning

A new struct is allocated, and the user should free it after use.

int blosc2_schunk_reorder_offsets(blosc2_schunk *schunk, int64_t *offsets_order)#

Reorder the chunk offsets of an existing super-chunk.

Parameters
  • schunk – The super-chunk whose chunk offsets are to be reordered.

  • offsets_order – The new order of the chunk offsets.

Returns

0 on success. Otherwise, a negative code is returned.
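The text does not spell out what makes offsets_order valid. A reasonable assumption is that it must hold schunk->nchunks entries forming a permutation of 0..nchunks-1, where entry i names the current chunk that should end up at position i. Under that assumption, a caller could pre-check the array with a hypothetical helper such as `is_valid_offsets_order` (not part of the library):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pre-check (not a Blosc2 function): returns true when
 * order[0..nchunks) is a permutation of 0..nchunks-1, i.e. every index is
 * in range and appears exactly once. */
bool is_valid_offsets_order(const int64_t *order, int64_t nchunks) {
  if (nchunks < 0 || (nchunks > 0 && order == NULL)) {
    return false;
  }
  if (nchunks == 0) {
    return true;  /* nothing to reorder */
  }
  bool *seen = calloc((size_t)nchunks, sizeof(bool));
  if (seen == NULL) {
    return false;  /* treat allocation failure as "cannot validate" */
  }
  bool ok = true;
  for (int64_t i = 0; i < nchunks && ok; i++) {
    int64_t idx = order[i];
    if (idx < 0 || idx >= nchunks || seen[idx]) {
      ok = false;  /* out of range or repeated index */
    } else {
      seen[idx] = true;
    }
  }
  free(seen);
  return ok;
}
```

Running such a check before calling blosc2_schunk_reorder_offsets() lets the caller report a malformed order with its own diagnostics instead of relying on the negative return code alone.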

int64_t blosc2_schunk_frame_len(blosc2_schunk *schunk)#

Get the length (in bytes) of the internal frame of the super-chunk.

Parameters
  • schunk – The super-chunk.

Returns

The length (in bytes) of the internal frame. If there is no internal frame, an estimate of the length is provided.

int64_t blosc2_schunk_fill_special(blosc2_schunk *schunk, int64_t nitems, int special_value, int32_t chunksize)#

Quickly fill an empty frame with special values (zeros, NaNs, uninit).

Parameters
  • schunk – The super-chunk to be filled. This must be empty initially.

  • nitems – The number of items to fill.

  • special_value – The special value to use for filling. The only values supported for now are BLOSC2_SPECIAL_ZERO, BLOSC2_SPECIAL_NAN and BLOSC2_SPECIAL_UNINIT.

  • chunksize – The chunksize for the chunks that are to be added to the super-chunk.

Returns

The total number of chunks that have been added to the super-chunk. If there is an error, a negative value is returned.
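To sanity-check the return value, the expected chunk count can be computed up front. The sketch below assumes that chunksize is expressed in bytes, that each chunk holds chunksize / typesize whole items, and that leftover items go into one final, smaller chunk. These assumptions are not stated in the text above, and the helper name `expected_special_nchunks` is illustrative.

```c
#include <stdint.h>

/* Hypothetical helper (not a Blosc2 function): number of chunks needed to
 * hold nitems items of typesize bytes when each chunk spans chunksize bytes.
 * Returns -1 for inputs that cannot describe a valid layout. */
int64_t expected_special_nchunks(int64_t nitems, int32_t typesize, int32_t chunksize) {
  if (nitems < 0 || typesize <= 0 || chunksize < typesize) {
    return -1;
  }
  int64_t chunkitems = chunksize / typesize;  /* whole items per full chunk */
  int64_t nchunks = nitems / chunkitems;      /* number of full chunks */
  if (nitems % chunkitems != 0) {
    nchunks += 1;                             /* one trailing partial chunk */
  }
  return nchunks;
}
```

For example, 1001 items of 4 bytes with a 400-byte chunksize give 100 items per chunk, so 10 full chunks plus one partial chunk.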

int64_t blosc2_schunk_append_buffer(blosc2_schunk *schunk, const void *src, int32_t nbytes)#

Append a src data buffer to a super-chunk.

Parameters
  • schunk – The super-chunk where data will be appended.

  • src – The buffer of data to compress.

  • nbytes – The size of the src buffer.

Returns

The number of chunks in the super-chunk. If a problem is detected, a negative value is returned instead.

int blosc2_schunk_get_slice_buffer(blosc2_schunk *schunk, int64_t start, int64_t stop, void *buffer)#

Fill buffer with a schunk slice.

Parameters
  • schunk – The super-chunk from where to extract a slice.

  • start – Index (0-based) where the slice begins.

  • stop – The first index (0-based) that is not in the selected slice.

  • buffer – The buffer where the data will be stored.

Returns

An error code.

Warning

You must make sure that you have enough space in buffer to store the uncompressed data.
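Sizing the buffer is simple arithmetic once the units are fixed. The sketch below assumes that start and stop count items (not bytes), that every chunk has the same fixed chunksize in bytes, and that the slice is half-open, as the parameter descriptions suggest. The `slice_plan` struct and the `plan_slice` helper are hypothetical and only illustrate the computation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a slice, for illustration only. */
typedef struct {
  int64_t nbytes;       /* bytes the caller's buffer must be able to hold */
  int64_t first_chunk;  /* index of the first chunk the slice touches */
  int64_t last_chunk;   /* index of the last chunk the slice touches (inclusive) */
} slice_plan;

/* Computes the buffer size and chunk range for the half-open item range
 * [start, stop). Returns 0 on success and -1 on invalid input. */
int plan_slice(int64_t start, int64_t stop, int32_t typesize, int32_t chunksize,
               slice_plan *out) {
  if (out == NULL || start < 0 || stop <= start || typesize <= 0 || chunksize < typesize) {
    return -1;
  }
  int64_t byte_start = start * typesize;  /* offset of the first byte in the slice */
  int64_t byte_stop = stop * typesize;    /* offset one past the last byte */
  out->nbytes = byte_stop - byte_start;
  out->first_chunk = byte_start / chunksize;
  out->last_chunk = (byte_stop - 1) / chunksize;
  return 0;
}
```

With 4-byte items and 400-byte chunks, the slice [90, 210) needs a 480-byte buffer and touches chunks 0 through 2.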

int blosc2_schunk_set_slice_buffer(blosc2_schunk *schunk, int64_t start, int64_t stop, void *buffer)#

Update a schunk slice from buffer.

Parameters
  • schunk – The super-chunk where to set the slice.

  • start – Index (0-based) where the slice begins.

  • stop – The first index (0-based) that is not in the selected slice.

  • buffer – The buffer containing the data to set.

Returns

An error code.

void blosc2_schunk_avoid_cframe_free(blosc2_schunk *schunk, bool avoid_cframe_free)#

Set the private avoid_cframe_free field in a frame.

Parameters
  • schunk – The super-chunk referencing the frame.

  • avoid_cframe_free – The value to set in the blosc2_frame_s structure.

Warning

If you set it to true, you become responsible for freeing the cframe yourself.

Dealing with chunks#

int blosc2_schunk_get_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t **chunk, bool *needs_free)#

Return a compressed chunk that is part of a super-chunk in the chunk parameter.

Parameters
  • schunk – The super-chunk from where to extract a chunk.

  • nchunk – The chunk to be extracted (0 indexed).

  • chunk – The pointer to the chunk of compressed data.

  • needs_free – The pointer to a boolean indicating if it is the user’s responsibility to free the chunk returned or not.

Returns

The size of the (compressed) chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead.

Warning

If the super-chunk is backed by a frame that is disk-based, a buffer is allocated for the (compressed) chunk, and hence a free is needed. You can check whether the chunk requires a free with the needs_free parameter. If the chunk does not need a free, it means that a pointer to the location in the super-chunk (or the backing in-memory frame) is returned in the chunk parameter.

int blosc2_schunk_get_lazychunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t **chunk, bool *needs_free)#

Return a (lazy) compressed chunk that is part of a super-chunk in the chunk parameter.

Parameters
  • schunk – The super-chunk from where to extract a chunk.

  • nchunk – The chunk to be extracted (0 indexed).

  • chunk – The pointer to the (lazy) chunk of compressed data.

  • needs_free – The pointer to a boolean indicating if it is the user’s responsibility to free the chunk returned or not.

Returns

The size of the (compressed) chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead. Note that a lazy chunk is somewhat larger than a regular chunk because of the trailer section (for details see README_CHUNK_FORMAT.rst).

Note

For disk-based frames, a lazy chunk is always returned.

Warning

Currently, a lazy chunk can only be used by blosc2_decompress_ctx and blosc2_getitem_ctx.

Warning

If the super-chunk is backed by a frame that is disk-based, a buffer is allocated for the (compressed) chunk, and hence a free is needed. You can check whether the chunk requires a free with the needs_free parameter. If the chunk does not need a free, it means that a pointer to the location in the super-chunk (or the backing in-memory frame) is returned in the chunk parameter. In this case the returned chunk is not lazy.

int blosc2_schunk_decompress_chunk(blosc2_schunk *schunk, int64_t nchunk, void *dest, int32_t nbytes)#

Decompress and return the nchunk chunk of a super-chunk.

If the chunk is decompressed successfully, the data is written to the buffer pointed to by dest.

Parameters
  • schunk – The super-chunk from where the chunk will be decompressed.

  • nchunk – The chunk to be decompressed (0 indexed).

  • dest – The buffer where the decompressed data will be put.

  • nbytes – The size (in bytes) of the area pointed to by dest.

Returns

The size of the decompressed chunk or 0 if it is non-initialized. If some problem is detected, a negative code is returned instead.

Warning

You must make sure that you have enough space to store the uncompressed data.

int64_t blosc2_schunk_append_chunk(blosc2_schunk *schunk, uint8_t *chunk, bool copy)#

Append an existing chunk to a super-chunk.

Parameters
  • schunk – The super-chunk where the chunk will be appended.

  • chunk – The chunk to append. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns

The number of chunks in the super-chunk. If a problem is detected, a negative value is returned instead.

int64_t blosc2_schunk_insert_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t *chunk, bool copy)#

Insert a chunk at a specific position in a super-chunk.

Parameters
  • schunk – The super-chunk where the chunk will be inserted.

  • nchunk – The position where the chunk will be inserted.

  • chunk – The chunk to insert. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns

The number of chunks in the super-chunk. If a problem is detected, a negative value is returned instead.

int64_t blosc2_schunk_update_chunk(blosc2_schunk *schunk, int64_t nchunk, uint8_t *chunk, bool copy)#

Update a chunk at a specific position in a super-chunk.

Parameters
  • schunk – The super-chunk where the chunk will be updated.

  • nchunk – The position where the chunk will be updated.

  • chunk – The new chunk. If an internal copy is made, the chunk can be reused or freed if desired.

  • copy – Whether the chunk should be copied internally or can be used as-is.

Returns

The number of chunks in the super-chunk. If a problem is detected, a negative value is returned instead.

int64_t blosc2_schunk_delete_chunk(blosc2_schunk *schunk, int64_t nchunk)#

Delete a chunk at a specific position in a super-chunk.

Parameters
  • schunk – The super-chunk where the chunk will be deleted.

  • nchunk – The position where the chunk will be deleted.

Returns

The number of chunks in the super-chunk. If a problem is detected, a negative value is returned instead.

Creating chunks#

int blosc2_chunk_zeros(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of zeros.

Parameters
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be at least BLOSC_EXTENDED_HEADER_LENGTH.

Returns

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.

int blosc2_chunk_nans(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of NaNs.

Parameters
  • cparams – The compression parameters; only a typesize of 4 bytes (float) or 8 bytes (double) is supported.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be at least BLOSC_EXTENDED_HEADER_LENGTH.

Returns

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.

Note

Whether the NaNs are floats or doubles will be given by the typesize.

int blosc2_chunk_repeatval(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize, const void *repeatval)#

Create a chunk made of repeated values.

Parameters
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer.

  • repeatval – A pointer to the repeated value (little endian). The size of the value is given by cparams.typesize param.

Returns

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH + typesize). If negative, there has been an error and dest is unusable.

int blosc2_chunk_uninit(blosc2_cparams cparams, int32_t nbytes, void *dest, int32_t destsize)#

Create a chunk made of uninitialized values.

Parameters
  • cparams – The compression parameters.

  • nbytes – The size (in bytes) of the chunk.

  • dest – The buffer where the data chunk will be put.

  • destsize – The size (in bytes) of the dest buffer; must be at least BLOSC_EXTENDED_HEADER_LENGTH.

Returns

The number of bytes compressed (BLOSC_EXTENDED_HEADER_LENGTH). If negative, there has been an error and dest is unusable.
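The return values of the four constructors above imply how large dest has to be: a bare extended header for zeros, NaNs and uninitialized values, plus one value of typesize bytes for repeated values. The sketch below turns that into a lookup. It assumes that BLOSC_EXTENDED_HEADER_LENGTH is 32 bytes, which should be verified against your own blosc2.h, and both the `special_kind` enum and the helper name are hypothetical.

```c
#include <stdint.h>

/* Assumed value of BLOSC_EXTENDED_HEADER_LENGTH; check blosc2.h before relying on it. */
#define EXT_HEADER_LEN 32

/* Hypothetical tags for the four special-chunk constructors. */
typedef enum { SPECIAL_ZEROS, SPECIAL_NANS, SPECIAL_UNINIT, SPECIAL_REPEATVAL } special_kind;

/* Returns the minimum destsize (in bytes) for the given kind of special
 * chunk, or -1 when typesize is invalid for that kind. */
int32_t special_chunk_min_destsize(special_kind kind, int32_t typesize) {
  if (typesize <= 0) {
    return -1;
  }
  if (kind == SPECIAL_NANS && typesize != 4 && typesize != 8) {
    return -1;  /* NaN chunks only make sense for float (4) or double (8) */
  }
  if (kind == SPECIAL_REPEATVAL) {
    return EXT_HEADER_LEN + typesize;  /* header followed by the repeated value */
  }
  return EXT_HEADER_LEN;  /* header only */
}
```

Note that these sizes do not depend on nbytes: a special chunk describes its contents in the header instead of storing them.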

Frame specific functions#

int64_t *blosc2_frame_get_offsets(blosc2_schunk *schunk)#

Get the offsets of a frame in a super-chunk.

Parameters
  • schunk – The super-chunk containing the frame.

Returns

If successful, a pointer to a buffer holding the decompressed offsets. The number of offsets is equal to schunk->nchunks, and the user is responsible for freeing this buffer. Otherwise, NULL is returned.