Byte Shuffle Filter

The shuffle filter rearranges the bytes of an array of multi-byte elements so that a subsequent compression stage works better. It is the same algorithm used by HDF5 and NetCDF4.

Why Shuffle Helps

For float32 data, each value occupies 4 bytes. The bytes within a float are not independent: nearby values tend to share their most-significant bytes (sign, exponent, and high mantissa bits), while the least-significant bytes look close to random.

Without shuffle, the bytes are interleaved:

[B0 B1 B2 B3][B0 B1 B2 B3][B0 B1 B2 B3]...

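A compressor sees B0 B1 B2 B3 B0 B1 B2 B3 B0 B1 B2 B3 ..., which is not very compressible because the predictable bytes (B0, B1) are mixed with the near-random ones (B3). Here B0 denotes the most-significant byte, i.e. big-endian order; for little-endian data the roles are mirrored, but the argument is the same.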

After shuffle, all byte-0s come first, then all byte-1s, etc.:

[B0 B0 B0 ...][B1 B1 B1 ...][B2 B2 B2 ...][B3 B3 B3 ...]

Now the B0 and B1 runs are highly compressible (long runs of identical or similar values). The B3 run is still noisy, but it is isolated. For smoothly varying data, the overall compression ratio typically improves noticeably.
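The rearrangement described above can be sketched in a few lines. This is an illustrative version only, not the library's implementation: the names shuffle_sketch and example_bytes are hypothetical, and the sketch panics on a misaligned buffer instead of returning an error.

```rust
/// Illustrative byte shuffle: gathers byte `k` of every element into one
/// contiguous run. A sketch only, not the library's implementation.
fn shuffle_sketch(data: &[u8], element_size: usize) -> Vec<u8> {
    // The sketch simply panics on bad input; a real API would return an error.
    assert!(element_size > 0 && data.len() % element_size == 0);
    let count = data.len() / element_size;
    let mut out = vec![0u8; data.len()];
    for (i, element) in data.chunks_exact(element_size).enumerate() {
        for (k, &byte) in element.iter().enumerate() {
            // Byte k of element i lands in run k, at position i.
            out[k * count + i] = byte;
        }
    }
    out
}

/// Three neighbouring float32 values, serialised big-endian so that B0 is
/// the most-significant byte, as in the diagrams above.
fn example_bytes() -> Vec<u8> {
    [1.0f32, 1.25, 1.5]
        .iter()
        .flat_map(|f| f.to_be_bytes())
        .collect()
}
```

Calling shuffle_sketch(&example_bytes(), 4) turns the interleaved bytes 3F 80 00 00 | 3F A0 00 00 | 3F C0 00 00 into 3F 3F 3F | 80 A0 C0 | 00 00 00 | 00 00 00: the shared high bytes end up next to each other.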

API

shuffle

pub fn shuffle(data: &[u8], element_size: usize) -> Result<Vec<u8>, ShuffleError>

Rearranges bytes. element_size is the byte width of each element (e.g. 4 for float32, 8 for float64).

let floats: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0];
// Serialise in native byte order: 4 bytes per f32
let raw: Vec<u8> = floats.iter().flat_map(|f| f.to_ne_bytes()).collect();
// `?` requires an enclosing function that returns a compatible Result
let shuffled = shuffle(&raw, 4)?;
// shuffled is ready for compression

unshuffle

pub fn unshuffle(data: &[u8], element_size: usize) -> Result<Vec<u8>, ShuffleError>

Reverses the shuffle. Applied automatically by the decode pipeline.
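As a rough illustration of what reversing the shuffle involves, the sketch below rebuilds the interleaved layout from the per-byte runs. It assumes unshuffle is the exact inverse of the layout described under "Why Shuffle Helps"; the name unshuffle_sketch is hypothetical and this is not the library's code.

```rust
/// Illustrative inverse of the byte shuffle: reads byte `k` of element `i`
/// from run `k` and writes it back into its interleaved position.
/// A sketch only; it panics on misaligned input instead of returning an error.
fn unshuffle_sketch(data: &[u8], element_size: usize) -> Vec<u8> {
    assert!(element_size > 0 && data.len() % element_size == 0);
    let count = data.len() / element_size;
    let mut out = vec![0u8; data.len()];
    for i in 0..count {
        for k in 0..element_size {
            // Run k starts at offset k * count in the shuffled buffer.
            out[i * element_size + k] = data[k * count + i];
        }
    }
    out
}
```

For instance, unshuffling the bytes [1, 3, 5, 2, 4, 6] with an element size of 2 gives [1, 2, 3, 4, 5, 6].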

Using Shuffle in a Message

Set filter: "shuffle" in the DataObjectDescriptor and provide shuffle_element_size:

use std::collections::BTreeMap;

use ciborium::Value;

let mut params = BTreeMap::new();
params.insert(
    "shuffle_element_size".to_string(),
    Value::Integer(4.into()), // 4 bytes per float32
);

let desc = DataObjectDescriptor {
    obj_type: "ntensor".to_string(),
    ndim: 1,
    shape: vec![100],
    strides: vec![1],
    dtype: Dtype::Float32,
    byte_order: ByteOrder::Big,
    encoding: "none".to_string(),
    filter: "shuffle".to_string(),
    compression: "none".to_string(),
    masks: None,
    params,
};

Edge Cases

Element Size Must Divide the Buffer

The shuffle operation requires data.len() % element_size == 0. If this is not true, the function returns Err(ShuffleError::Misaligned). Ensure your data buffer is a whole number of elements.
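The alignment rule can be checked before calling shuffle. The sketch below is a hedged illustration: SketchError and element_count are hypothetical names, and SketchError is only a stand-in for the library's ShuffleError, of which only the Misaligned variant is named in the text.

```rust
/// Hypothetical stand-in for the library's error type; only the
/// `Misaligned` variant is named in the text above.
#[derive(Debug, PartialEq)]
enum SketchError {
    Misaligned,
}

/// Returns the number of whole elements in a buffer of `len` bytes, or an
/// error if `len` is not a multiple of `element_size`.
fn element_count(len: usize, element_size: usize) -> Result<usize, SketchError> {
    // A zero element size is rejected here too, to avoid dividing by zero.
    // How the real library treats that case is not stated above.
    if element_size == 0 || len % element_size != 0 {
        return Err(SketchError::Misaligned);
    }
    Ok(len / element_size)
}
```

A 16-byte buffer holds four float32 elements, so element_count(16, 4) succeeds, while a 10-byte buffer with the same element size is rejected.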

Shuffle Alone Does Not Compress

Shuffle rearranges bytes but does not reduce the total byte count. It only helps when followed by a compression stage (e.g. szip, zstd, lz4, blosc2). Set compression in the descriptor to apply compression after the shuffle step.
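To make this concrete, the sketch below compares the same twelve bytes in interleaved and shuffled order. It uses the number of runs of identical bytes as a crude proxy for compressibility. The helper count_runs and both constants are hypothetical illustrations, and a real compressor will behave differently in detail.

```rust
/// Counts maximal runs of identical bytes: a crude proxy for how much
/// redundancy a compressor could exploit. Fewer runs means more redundancy.
fn count_runs(data: &[u8]) -> usize {
    if data.is_empty() {
        return 0;
    }
    // One run to start with, plus one for every position where the byte changes.
    1 + data.windows(2).filter(|w| w[0] != w[1]).count()
}

// The float32 values 1.0, 1.25 and 1.5 as big-endian bytes, element by element.
const INTERLEAVED: [u8; 12] = [
    0x3F, 0x80, 0x00, 0x00,
    0x3F, 0xA0, 0x00, 0x00,
    0x3F, 0xC0, 0x00, 0x00,
];

// The same twelve bytes after a 4-byte shuffle (all B0, then all B1, ...).
const SHUFFLED: [u8; 12] = [
    0x3F, 0x3F, 0x3F,
    0x80, 0xA0, 0xC0,
    0x00, 0x00, 0x00,
    0x00, 0x00, 0x00,
];
```

Both buffers are 12 bytes long, so the shuffle saves nothing by itself. The interleaved layout has 9 runs and the shuffled one has 5, which is the extra redundancy a later compression stage can exploit.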

Combining with simple_packing

When using both encoding: "simple_packing" and filter: "shuffle", the pipeline applies them in order: encode first, then shuffle. The simple_packing output is organised as 1-byte chunks (bits packed MSB-first), so shuffle_element_size should be 1 in this case. A shuffle with an element size of 1 leaves the byte order unchanged, so there is no benefit from shuffling already-packed data. In practice the combination is unusual: use either simple_packing alone (when quantising float values) or shuffle alone (before a lossless compressor).
