Library Reference#

First-level variables#

__version__#: The version of the blosc package.

blosclib_version#: The version of the Blosc C library.

clib_versions#: A map of the versions of the compression libraries included in the C library.

cnames#: The list of compressors included in the C library.

cname2clib#: A map between compressor names and their libraries (or formats).

ncores#: The number of cores detected.

Public functions#

blosc.compress(bytesobj[, typesize=8, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'])#

Compress bytesobj, with a given type size.

Parameters:

bytesobj (bytes-like object supporting the buffer interface): The data to be compressed.
typesize (int, optional): The data type size, in bytes. The default is 8.
clevel (int, optional): The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle (int, optional): The shuffle filter to be activated. Allowed values are blosc.NOSHUFFLE, blosc.SHUFFLE and blosc.BITSHUFFLE. The default is blosc.SHUFFLE.
cname (string, optional): The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd' and possibly others). The default is 'blosclz'.

Returns:

out (str / bytes): The compressed data in the form of a Python str / bytes object.

Raises:

TypeError: If bytesobj doesn’t support the buffer interface.
ValueError: If bytesobj is too long. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not a valid codec.

Examples

>>> import array
>>> import blosc
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc.compress(a_bytesobj, typesize=4)
>>> len(c_bytesobj) < len(a_bytesobj)
True

blosc.compress_ptr(address, items[, typesize=8, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'])#

Compress the data at address with given items and typesize.

Parameters:

address (int or long): The pointer to the data to be compressed.
items (int): The number of items (of typesize) to be compressed.
typesize (int, optional): The data type size, in bytes. The default is 8.
clevel (int, optional): The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle (int, optional): The shuffle filter to be activated. Allowed values are blosc.NOSHUFFLE, blosc.SHUFFLE and blosc.BITSHUFFLE. The default is blosc.SHUFFLE.
cname (string, optional): The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd' and possibly others). The default is 'blosclz'.

Returns:

out (str / bytes): The compressed data in the form of a Python str / bytes object.

Raises:

TypeError: If address is not of type int or long.
ValueError: If items * typesize is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.

Notes

This function can be used anywhere that a memory address is available in Python, for example via the NumPy __array_interface__['data'][0] construct or the ctypes module.

Importantly, the user is responsible for making sure that the memory address is valid and that the memory pointed to is contiguous. Passing an invalid address is highly likely to crash the interpreter with a segfault.

Examples

>>> import numpy
>>> items = 7
>>> np_array = numpy.arange(items)
>>> c = blosc.compress_ptr(np_array.__array_interface__['data'][0],
...                        items, np_array.dtype.itemsize)
>>> d = blosc.decompress(c)
>>> np_ans = numpy.frombuffer(d, dtype=np_array.dtype)
>>> bool((np_array == np_ans).all())
True

>>> import ctypes
>>> typesize = 8
>>> data = [float(i) for i in range(items)]
>>> Array = ctypes.c_double * items
>>> a = Array(*data)
>>> c = blosc.compress_ptr(ctypes.addressof(a), items, typesize)
>>> d = blosc.decompress(c)
>>> import struct
>>> ans = [struct.unpack('d', d[i:i+typesize])[0]
...        for i in range(0, items*typesize, typesize)]
>>> data == ans
True

blosc.decompress(bytes_like[, as_bytearray=False])#

Decompresses a bytes-like compressed object.

Parameters:

bytes_like (bytes-like object): The data to be decompressed. Must be a bytes-like object that supports the Python Buffer Protocol, like bytes, bytearray, memoryview, or numpy.ndarray.
as_bytearray (bool, optional): If True, the return type is a bytearray object instead of a bytes object. The default is False.

Returns:

out (str / bytes or bytearray): The decompressed data in the form of a Python str / bytes object. If as_bytearray is True, this is a bytearray object instead.

Raises:

TypeError: If bytes_like does not support the Buffer Protocol.

Examples

>>> import array
>>> import blosc
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc.compress(a_bytesobj, typesize=4)
>>> a_bytesobj2 = blosc.decompress(c_bytesobj)
>>> a_bytesobj == a_bytesobj2
True
>>> b"" == blosc.decompress(blosc.compress(b"", 1))
True
>>> b"1"*7 == blosc.decompress(blosc.compress(b"1"*7, 8))
True
>>> type(blosc.decompress(blosc.compress(b"1"*7, 8),
...                                      as_bytearray=True)) is bytearray
True

blosc.decompress_ptr(bytes_like, address)#

Decompresses a bytes_like compressed object into the memory at address.

Parameters:

bytes_like (bytes-like object): The data to be decompressed. Must be a bytes-like object that supports the Python Buffer Protocol, like bytes, bytearray, memoryview, or numpy.ndarray.
address (int or long): The address at which to write the decompressed data.

Returns:

nbytes (int): The number of bytes written to the buffer.

Raises:

TypeError: If bytes_like does not support the Buffer Protocol. If address is not of type int or long.

Notes

This function can be used anywhere that a memory address is available in Python, for example via the NumPy __array_interface__['data'][0] construct or the ctypes module.

Importantly, the user is responsible for making sure that the memory address is valid and that the memory pointed to is contiguous and can be written to. Passing an invalid address is highly likely to crash the interpreter with a segfault.

Examples

>>> import numpy
>>> items = 7
>>> np_array = numpy.arange(items)
>>> c = blosc.compress_ptr(np_array.__array_interface__['data'][0],
...                        items, np_array.dtype.itemsize)
>>> np_ans = numpy.empty(items, dtype=np_array.dtype)
>>> nbytes = blosc.decompress_ptr(c, np_ans.__array_interface__['data'][0])
>>> bool((np_array == np_ans).all())
True
>>> nbytes == items * np_array.dtype.itemsize
True

>>> import ctypes
>>> typesize = 8
>>> data = [float(i) for i in range(items)]
>>> Array = ctypes.c_double * items
>>> in_array = Array(*data)
>>> c = blosc.compress_ptr(ctypes.addressof(in_array), items, typesize)
>>> out_array = ctypes.create_string_buffer(items*typesize)
>>> nbytes = blosc.decompress_ptr(c, ctypes.addressof(out_array))
>>> import struct
>>> ans = [struct.unpack('d', out_array[i:i+typesize])[0]
...        for i in range(0, items*typesize, typesize)]
>>> data == ans
True
>>> nbytes == items * typesize
True

blosc.pack_array(array[, clevel=9, shuffle=blosc.SHUFFLE, cname='blosclz'])#

Pack (compress) a NumPy array.

Parameters:

array (ndarray): The NumPy array to be packed.
clevel (int, optional): The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.
shuffle (int, optional): The shuffle filter to be activated. Allowed values are blosc.NOSHUFFLE, blosc.SHUFFLE and blosc.BITSHUFFLE. The default is blosc.SHUFFLE.
cname (string, optional): The name of the compressor used internally in Blosc. It can be any compressor supported by Blosc ('blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd' and possibly others). The default is 'blosclz'.

Returns:

out (str / bytes): The packed array in the form of a Python str / bytes object.

Raises:

TypeError: If array does not quack like a numpy ndarray.
ValueError: If array.itemsize * array.size is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.

Examples

>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True

blosc.unpack_array(packed_array, **kwargs)#

Unpack (decompress) a packed NumPy array.

Parameters:

packed_array (str / bytes): The packed array to be decompressed.
**kwargs (fix_imports / encoding / errors): Optional parameters that are passed on to the pickle.loads API; see https://docs.python.org/3/library/pickle.html#pickle.loads

Returns:

out (ndarray): The decompressed data in the form of a NumPy array.

Raises:

TypeError: If packed_array is not of type bytes or string.

Examples

>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True
>>> a2 = blosc.unpack_array(parray)
>>> bool(numpy.all(a == a2))
True
>>> a = numpy.array(['å', 'ç', 'ø'])
>>> parray = blosc.pack_array(a)
>>> a2 = blosc.unpack_array(parray)
>>> bool(numpy.all(a == a2))
True

Utilities#

blosc.clib_info(cname)#

Return the library name and version associated with a compressor in the C library.

Parameters:

cname (str): The compressor name.

Returns:

out (tuple): The associated library name and version.

blosc.compressor_list()#

Returns a list of compressors available in the C library.

Parameters:

None

Returns:

out (list): The list of names.

blosc.detect_number_of_cores()#

Detect the number of cores in this system.

Returns:

out (int): The number of cores in this system.

blosc.free_resources()#

Free possible memory temporaries and thread resources.

Returns:

out (None)

Notes

Blosc maintains a pool of threads waiting for work as well as some temporary space. You can use this function to release these resources when you are not going to use Blosc for a long while.

Examples

>>> blosc.free_resources()

blosc.get_clib(bytesobj)#

Return the name of the compression library used for the Blosc buffer bytesobj.

Parameters:

bytesobj (str / bytes): The compressed buffer.

Returns:

out (str): The name of the compression library.

blosc.print_versions()#: Print all the versions of software that python-blosc relies on.

blosc.set_blocksize(blocksize)#

Force the use of a specific blocksize. If 0, an automatic blocksize will be used (the default).

Notes

This is a low-level function and is recommended for expert users only.

Examples

>>> blosc.set_blocksize(512)
>>> blosc.set_blocksize(0)

blosc.set_nthreads(nthreads)#

Set the number of threads to be used during Blosc operation.

Parameters:

nthreads (int): The number of threads to be used during Blosc operation.

Returns:

out (int): The previous number of used threads.

Raises:

ValueError: If nthreads is larger than the maximum number of threads blosc can use.

Notes

The number of threads for Blosc is the maximum number of cores detected on your machine (via detect_number_of_cores). In some cases Blosc gets better results if you set the number of threads to a value slightly below your number of cores.

Examples

Set the number of threads to 2 and then to 1:

>>> oldn = blosc.set_nthreads(2)
>>> blosc.set_nthreads(1)
2

blosc.set_releasegil(gilstate)#

Sets a boolean that controls whether the Python global interpreter lock (GIL) is released during c-blosc compress and decompress operations. The default is False.

Notes

This option is designed to be used with larger chunk sizes and a ThreadPool. Releasing the GIL carries a small performance penalty that weighs more heavily on small block sizes.

Examples

>>> oldReleaseState = blosc.set_releasegil(True)