Pker.xyz

Delta Coding Techniques

Introduction

Delta Coding is the idea of encoding the difference (diff/delta) between two instances of something instead of encoding the "whole" value. It can be seen as a form of lossless compression, where we have two values: a base value (sometimes called a previous value) and a new incoming value. The objective is to generate an encoding using the base value and the new value, and to be able to restore the new value using the encoding and the base value. We therefore always require the base value both to encode and to decode; the "new" value is what gets encoded.
In this article we will go over some examples, ideas and some actual implementations.

Some programs might be running some form of computation every n seconds. Instead of storing the result of every computation, we could store the difference between the most recent computation (new value) and a base computation (base value), resulting in data that is potentially smaller (this is explained in more detail below). This would be the encoding part. We could then restore the most recent value by doing the opposite, which would be the decoding part.

Another example might be some form of streaming software where frames are computed and sent over the wire at a certain rate. Instead of sending the full frame every time, we could send the delta between the previous frame and the current frame. This could potentially reduce network traffic by a lot. The receiving end could then restore the value from the delta and the previous frame, and so on. This also has some obvious downsides, one being that if a packet containing one frame is dropped (or if packets arrive out of order, depending on the underlying protocol) then the following delta won't be restored properly on the receiving end. Another being that the sending side needs to keep a valid copy of the previous frame at all times. There's also the time it takes to generate the delta. As always, there are tradeoffs.

Delta coding therefore has two phases: an encoding part (serialization), where we get the delta between two instances of the same type (or any data with some form of schema), and a decoding (deserialization) part, where we restore the value that was delta'ed using the base value. We therefore always need a non-encoded base value to encode/decode against; this is the interesting particularity, and also a potential downside for some use cases of delta coding.

The goal is typically to reduce network traffic, memory bandwidth or storage size.

Fundamentals

Delta coding typically requires the implementer to write encoders (serializers) and decoders (deserializers) for whatever needs to be delta encoded/decoded. To do so it's useful to have a mental model where we think purely in terms of information theory. Depending on how deep we want/need to go, it's good to have an understanding of the binary representation of the data we want to delta encode.

The general idea is to obtain the smallest possible binary encoding for our data, ideally while keeping things as simple as possible. We might also need to consider endianness (it's quite easy to keep things endian agnostic) and to keep the cost of encoding/decoding our data to a minimum.

To do so we need to come up with a schema for our encodings. Typically we want our encodings to be byte-even (or byte rounded). What I mean by this is that if the delta encoding of two values requires 6 bits, we might have to round it up to 8 bits (1 byte) for storage or usage purposes.

The usual strategy is to generate a delta value that has as many leading or trailing 0 bits as possible, so that we can simply shave them off. For example, a 64 bit value representing the number 1 only requires 1 actual bit of information; if that bit is on the right most side (reading from left to right), we would say it has 63 leading zeroes that we can remove (wasted space). The CPU often offers intrinsics for counting leading and trailing zeroes that we can leverage.
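
In Java, for instance, those intrinsics are exposed through Long.numberOfLeadingZeros / Long.numberOfTrailingZeros (and their Integer counterparts). Here is a small sketch, with made up names, of using them to measure how many byte rounded bytes a value actually needs:

Java sketch - counting leading zeroes / byte rounding
public class BitWidth {
    // Bytes needed to hold the significant bits of a non-negative value,
    // i.e. the value with its leading zeroes shaved off, rounded up to whole bytes.
    static int byteRoundedWidth(long value) {
        int significantBits = 64 - Long.numberOfLeadingZeros(value);
        return Math.max(1, (significantBits + 7) / 8);
    }

    public static void main(String[] args) {
        System.out.println(Long.numberOfLeadingZeros(1L)); // 63: only 1 bit of real information
        System.out.println(byteRoundedWidth(1L));          // 1 byte instead of 8
        System.out.println(byteRoundedWidth(100_000L));    // 100000 fits in 17 bits -> 3 bytes
    }
}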

The encoders accept two instances of a type and output an array of bytes. The decoders take in the previous instance and the encoded byte array, and output the restored value.

pseudo-code - generic encoder/decoder signatures
Encoder(newInstance: T, previousInstance: T) -> byte[]

Decoder(encoding: byte[], previousInstance: T) -> T
            

In the above snippet the T that the decoder returns has the value of "newInstance" that the Encoder() fn takes in, effectively restoring it.

For over the wire schemas we might also want to wrap our encodings or to prefix them with useful metadata about the encoding, so that the decoders have information about how to decode the delta and restore the value. This is particularly important for algebraic data types. We will see concrete examples that will shed light on all of this later.

32 bit Integer Example

To get us warmed up let's say we have a system in which we are constantly generating a signed 32 bit value. We first need to define a schema that will represent the binary layout of our encoding. This schema is the contract between the encoder/decoder, it's effectively the protocol.

For the best delta encoding results, it's useful to have information about our values. For instance if we know that in our program the integer never has values that go over 100000, then we can design our schema with this in mind. This is very much related to my previous post about performance aware programming. For the sake of this example let's say we can't make any assumptions about the values of our data: we will need to make a generic schema.

generic-ish encoding schema for 32 bit signed integers
ENCODING: [value: nbits][delta or not:1bit][sign bit: 1bit]
            


The above encoding is quite simple: we either encode the bits that represent the delta (aka the subtraction) between two integer values, or we directly encode the bits that represent the value when the delta is not worth it. To inform the decoder of that distinction we have a bit that is set to 1 when we encode the delta. We then have the right most bit, which represents the sign.

The delta encoding itself will be a subtraction between the two int values. Signed integers typically have a sign bit at the MSB (i.e. on the left most end). Our goal is to pack everything so that our encoding can remove as many bits as possible from the value; we therefore store the magnitude and encode the sign bit at the LSB, right after the value. For example:

Sign bit shifting example
The value -6 as sign + magnitude in a byte: 1000_0110 // notice the leading 1 (sign bit)
The value -6, sign bit moved to the right most position: 0000_1101
            


With this simple tactic we can now completely get rid of all the leading zero bits! We therefore went from requiring 8 bits (since we needed the sign bit, which was at the left most position) to requiring 4 bits to hold the same information.
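
Here is a small Java sketch of this sign shuffle (the helper names are mine), storing the magnitude with the sign at the LSB and restoring the original value:

Java sketch - moving the sign bit to the LSB
public class SignShuffle {
    // Move the sign to the least significant bit: [magnitude][sign].
    static long signToLsb(long value) {
        return (Math.abs(value) << 1) | (value < 0 ? 1 : 0);
    }

    // Inverse operation: read the sign bit back and re-apply it to the magnitude.
    static long signFromLsb(long encoded) {
        long magnitude = encoded >>> 1;
        return (encoded & 1) == 1 ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(signToLsb(-6))); // 1101
        System.out.println(signFromLsb(signToLsb(-6)));         // -6
    }
}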

The other thing to notice in our encoding from above is that there is a bit reserved for "delta or not". This is because for some values doing the delta is not worth it. For example, if we have a previous value of 124124124 and a new value of 3, it's much better to directly encode 3 than to encode the delta (the subtraction) between 124124124 and 3 (which is 124124121), since the number 3 requires way fewer bits to represent.

Here's an implementation of the encoder and decoder we just described:

pseudo-code - 32 bit signed integer encoder/decoder
 
// encoding reminder: value of the delta or of the new value
// directly if not worth doing delta, 1bit delta or not, 1bit for the sign bit.

// Takes two signed 32 bit integers and returns the binary encoding as a
// byte array, respecting the schema.
fn EncodeInt32(toEncode: i32, previous: i32) -> byte[] {
    // A single byte with all bits set to 0 means toEncode and previous are equal.
    if (toEncode == previous) { return {0b_0000_0000}; }

    delta = previous - toEncode;

    // Decide whether to store the delta or the value itself: the delta is only
    // worth it when its magnitude is smaller than the value's.
    // (When toEncode is 0 we always use the delta, otherwise the encoding would
    // be all zeroes and collide with the "no change" marker above.)
    isDelta = (toEncode == 0) || (Math.abs(delta) < Math.abs(toEncode));
    value = isDelta ? delta : toEncode;

    // Build the encoding following the schema:
    // [magnitude: n bits][delta or not: 1 bit][sign bit: 1 bit]
    encoded = Math.abs(value) << 2;
    if (isDelta)   { encoded |= 0b10; } // second right-most bit: it's a delta
    if (value < 0) { encoded |= 0b01; } // right-most bit: the value is negative

    // Byte-round the encoding, least significant byte first
    // (this is the order the decoder rebuilds its bitmap in).
    byte[] out;
    do {
        out.push(encoded & 0xFF);
        encoded = encoded >> 8;
    } while (encoded != 0);
    return out;
}

// Takes the encoding and the previous value and returns the restored value.
fn DecodeInt32(toDecode: byte[], previous: i32) -> i32 {

    // a single zero byte indicates previous/new are the same, as seen in the encoder.
    if (toDecode.length == 1 && toDecode[0] == 0) return previous;

    const byteSize = 8;

    // shift the encoded byte array into an integer variable, so that it
    // can be decoded.
    u64 bitmap = 0;
    for (int i = toDecode.length - 1; i >= 0; i--) {
        bitmap = (bitmap << byteSize | (toDecode[i] & 0xFF));
    }

    // control bits are the "delta or not" bit and the sign bit
    const nbControlBits = 2;

    // introduce a new var that represents the encoding without the ctrl bits.
    encodedValue = (bitmap >> nbControlBits);

    // The "delta or not" bit (which is the second right-most bit) is set,
    // meaning we are dealing with a delta.
    if ((bitmap & 0b_0000_0010) != 0) {
        // Sign bit (right-most bit) is 1, meaning delta is negative.
        if ((bitmap & 0x01) == 1) {
            return previous + encodedValue;
        }
        // delta is positive.
        return previous - encodedValue;
    }

    // we are not dealing with a delta. Simply check if negative or not

    // Sign bit is 1, meaning the value is negative (convert it).
    if ((bitmap & 0x01) == 1) {
        return 0 - encodedValue;
    }
    // Sign bit is 0, return the value directly.
    return encodedValue;
}
            

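If you prefer to see this in a real language, here is one possible Java translation of the same schema and logic (the class and method names are mine, not part of any library; treat it as a sketch rather than a reference implementation):

Java sketch - 32 bit signed integer encoder/decoder
import java.io.ByteArrayOutputStream;

public final class Int32DeltaCodec {

    // Schema: [magnitude][delta or not: 1 bit][sign: 1 bit], packed into the
    // smallest number of bytes, least significant byte first.
    public static byte[] encode(int toEncode, int previous) {
        if (toEncode == previous) {
            return new byte[] { 0 }; // a single zero byte means "no change"
        }

        // Work in 64 bits to avoid overflow around Integer.MIN_VALUE.
        long delta = (long) previous - (long) toEncode;
        boolean isDelta = toEncode == 0 || Math.abs(delta) < Math.abs((long) toEncode);
        long value = isDelta ? delta : toEncode;

        long encoded = (Math.abs(value) << 2)
                | (isDelta ? 0b10 : 0)
                | (value < 0 ? 0b01 : 0);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        do {
            out.write((int) (encoded & 0xFF));
            encoded >>>= 8;
        } while (encoded != 0);
        return out.toByteArray();
    }

    public static int decode(byte[] toDecode, int previous) {
        if (toDecode.length == 1 && toDecode[0] == 0) {
            return previous; // "no change"
        }

        long bitmap = 0;
        for (int i = toDecode.length - 1; i >= 0; i--) {
            bitmap = (bitmap << 8) | (toDecode[i] & 0xFF);
        }

        long magnitude = bitmap >>> 2;
        boolean isDelta = (bitmap & 0b10) != 0;
        boolean negative = (bitmap & 0b01) != 0;

        if (isDelta) {
            // the encoder stored previous - toEncode, so undo it here.
            return (int) (negative ? previous + magnitude : previous - magnitude);
        }
        return (int) (negative ? -magnitude : magnitude);
    }

    public static void main(String[] args) {
        int previous = 124124124;
        byte[] enc = encode(3, previous);          // encodes the value 3 directly (1 byte)
        System.out.println(decode(enc, previous)); // prints 3
    }
}
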

Surely this all seems a bit complicated to simply save a few bits. It's probably not worth doing all of this for 32 bit integers in most cases. With this technique, in the best case (when the values are the same), the encoding requires a single byte. The interesting part happens when we are dealing with more complex types or with collections. But first let's look at the two top level approaches.

The Simple Approach

The simplest way to do delta coding is a form of "fake" delta encoding, where we never actually compute the delta between two values. We simply look at the "new" value and, if it's the same as the previous one, we say it's "0" (or have an invalid value that represents this state). If it's different, then we simply have the new value. That way we only save space when the values are the same.

trivial approach - no changes
previous value = 5; new value = 5; encoding = 0;

The decoder sees 0, and therefore knows the value was 5. 0 requires fewer bits to represent than 5, therefore a win. (Bear with me and ignore the caveats (like not being able to represent the actual value 0) for now.)

trivial approach - change
previous value = 5; new value = 15; encoding = 15;

The decoder sees that it's not 0, and therefore knows the "updated" value is 15. With this approach we only save space when the previous value and the new value are exactly the same. If this happens often-ish this is a very good solution, since it has extremely low complexity.

This is also often the best solution for many types. For example, strings are notoriously bad for these sorts of manipulations; it's often simply better to have an encoding where, when two strings are the same, we output 0, and when they are different we output the most recent string. Maps (k/v) are also notoriously bad since they are not contiguous in memory by design: we therefore often need to keep the key intact in the encoding, which takes space. The simple method is to design an encoding schema where the key is always present, followed or prefixed by a bit (or byte) that says whether the value is the same as it previously was for that key, or whether it's delta encoded (or simply encoded as is, without any modifications/delta). Obviously this depends on the types of the key/value. A k/v pair where the key is a primitive or a string and the value is a more complex type can get a lot of benefit from delta encoding between the successive values of a given key.
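
To make this concrete, here is a minimal Java sketch of the simple approach applied to strings (the single marker byte and the names are my own choice for the example, not a standard format):

Java sketch - the simple approach for strings
import java.nio.charset.StandardCharsets;

public class SimpleStringDelta {
    // A single 0 byte means "unchanged"; otherwise a 1 marker byte followed by
    // the UTF-8 bytes of the new string.
    static byte[] encode(String toEncode, String previous) {
        if (toEncode.equals(previous)) {
            return new byte[] { 0 };
        }
        byte[] utf8 = toEncode.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[utf8.length + 1];
        out[0] = 1;
        System.arraycopy(utf8, 0, out, 1, utf8.length);
        return out;
    }

    static String decode(byte[] toDecode, String previous) {
        if (toDecode.length == 1 && toDecode[0] == 0) {
            return previous;
        }
        return new String(toDecode, 1, toDecode.length - 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode(encode("hello", "hello"), "hello"));  // hello (1 byte on the wire)
        System.out.println(decode(encode("hello!", "hello"), "hello")); // hello! (full string sent)
    }
}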

The Interesting Approach

This is the "legit"/actual approach, where we define schemas and build encoders/decoders that are able to generate a binary delta between two instances of our data. Our signed 32 bit integer example from above used this approach. Let's see more examples below.

Example: Array of 32 bit integers

This is where we can start to see enormous compression gains. We want to encode arbitrarily sized arrays of 32 bit signed integers. The decoder will need to know how many elements it has to decode, and we need a way to decide what gets delta'ed against what, since there are many values. The simplest way is probably to encode by index: Array1[5] will be delta'ed with Array2[5]. This makes things easier for the encoder/decoder without having to make assumptions about the data. If we could make assumptions we absolutely should, but let's keep things generic here.

As shown in the 32 bit integer example above, a good way to tell the decoder that a given value is the same as it previously was is to encode a 0. In that example we needed things to be byte rounded, since we were encoding a single int directly and wanted to manipulate it (send it over the wire, decode it, etc.). With an array of integers (or of any type for that matter), every single int within the array can be encoded in an arbitrary bit length, as long as the whole array is encoded as a valid byte array. This means that a delta between two numbers that results in the number 6 would not need to be encoded using 1 byte; it could be encoded using 3 bits only, as long as the top level decoder knows this and is able to restore the number properly. This gives us way more opportunities to optimize if we so desire.

We will use the following strategy to encode two arrays of integers:

  • if the two arrays are the same, we will return a single 0 byte.
  • We will encode the number of elements that the "new" array (the one we want to encode/decode) has in a 32 bit (4 byte) header field.
  • We will also encode the total number of bytes of the encoding; this is so that our encoder can be used for algebraic data types. This is good practice so that we can have types with multiple fields that are delta encoded/decoded: the decoder can know how many bytes of an encoded byte array belong to which field, allowing proper decoding. This information can only be known after the encoding is done: we can't know how many bytes the encoding requires before encoding.
  • Encode via indexes (array1[1] will be encoded against array2[1]).
  • Use a concept called "change bytes", where each change byte holds information about the next values (up to 8). A bit set to 0 in the change byte means the value at the current index is the same in both arrays. A bit set to 1 means they are different and therefore delta encoded.
  • We will also need to know the size of the encoding of each individual delta, so that the decoder knows how many bytes to read. In the implementation below this size is stored in the 2 bits that follow the corresponding change bit, inside the change byte.
  • Let's first define two helper functions to go from int to bytes and back; these will be useful. Let's also define an encoder/decoder for i32's that returns a 64 bit integer instead of a byte[]. This is because we don't need a byte rounded encoding per i32, since we are encoding arrays as explained earlier.

    pseudo-code - Helper fn's
     
    fn IntToBytes(toEncode: i32) -> byte[] {
        byte[4] bytes;
        bytes[0] = toEncode & 0xFF;
        bytes[1] = (toEncode >> 8) & 0xFF;
        bytes[2] = (toEncode >> 16) & 0xFF;
        bytes[3] = (toEncode >> 24) & 0xFF;
        return bytes;
    }

    fn BytesToInt(toDecode: byte[]) -> i32 {
        return (toDecode[0] & 0xFF)
            | ((toDecode[1] & 0xFF) << 8)
            | ((toDecode[2] & 0xFF) << 16)
            | ((toDecode[3] & 0xFF) << 24);
    }
    
    fn EncodeInt32Raw(toEncode: i32, previous: i32) -> i64 {
        // as always, when they are the same return 0.
        if (toEncode == previous) { return 0; }
    
        i64 delta = previous - toEncode;
        // if delta is negative shift sign bit to right most position.
        if (delta < 0) {
            return (Math.abs(delta) << 1 | 1);
        }
    
        // if delta is positive, sign bit is 0.
        return delta << 1;
    }
    
    // other way around. Gets us the i32 back from the encoding and the base.
    fn DecodeInt32Raw(toDecode: i64, previous: i32) -> i32 {
        // an encoding of 0 means the two values were equal.
        if (toDecode == 0) { return previous; }
    
        // remove the sign bit.
        delta = toDecode >> 1;
    
        // Sign bit is 1, meaning delta is negative.
        if ((toDecode & 0x01) == 1) {
            return previous + delta;
        }
        return previous - delta;
    }
                

    Here's the schema we will use for our array/list of int's encoding:

    Schema - array of 32 bits integer encoding
    [HEADER][SEQUENCE]
    HEADER -> [32bits: number of encoded items][32bits: total nb of bytes for the encoding].
    SEQUENCE -> [[change byte][x: encodings], [next change byte..], ...,  n]
                


    The first 8 bytes are the header, which has two 4 byte sections: the number of encoded items and the total nb of bytes for the encoding.
    This is followed by the "sequence" (the encoding of the elements), which follows the pattern: change byte, encodings, change byte, encodings... Every bit of every change byte tells us whether the value at the corresponding index changed between the base array and the new array. When a value did change, the 2 bits that follow its change bit tell us how many bytes its encoding requires.

    Encoder:

    pseudo-code - Encoder for arrays of i32's
     
    // NOTE: this is not production code at all and is meant as an example.
    // Porting it to your favorite language will result in a fully functional encoder though.
    fn EncodeIntArray(toEncode: i32[], previous: i32[]) -> byte[] {
    
        // no changes, decoder needs to check for single byte to decode.
        if (toEncode == previous) { return {0b_0000_0000}; }
    
        // use helper fn to get the nb of items in the array to encode as an int.
        byte[] header = IntToBytes(toEncode.length);
    
        // dynamic array that will contain the encoding. (in a real scenario have a
        // heuristic that initiates it with a capacity to avoid too much realloc).
        byte[] encodedList;

        // The first 4 bytes of the encoding are the first part of the header:
        // the number of items to encode.
        encodedList.push(header[0]);
        encodedList.push(header[1]);
        encodedList.push(header[2]);
        encodedList.push(header[3]);

        // the next (and last) part of the header (the total nb of bytes) will only
        // be known at the end. Reserve 4 bytes for it now and fill them in before returning.
        encodedList.push(0);
        encodedList.push(0);
        encodedList.push(0);
        encodedList.push(0);

        // The "sequence" part of the schema starts here, at index 8 (0-7 is the header
        // defined above).
        changeByteIndex = 8; // the first change byte is at index 8, after the header.
        encodedList.push(0); // set first change byte to 0
    
        // Keeps track of the index of the current bit in the current changeByte.
        changeByteBitIndex = 7; // last bit in byte from pos 0, we go from left to right
    
        // Iterate over all the elements of the "new" array we need to encode.
        for (int i = 0; i < toEncode.length; i++) {
    
            // When true this means the next byte is the next changebyte, we
            // therefore reset.
            if (changeByteBitIndex < 2 || changeByteBitIndex > 7) {
                changeByteBitIndex = 7;
                encodedList.push(0); // start next change byte with a value of 0.
                changeByteIndex = encodedList.length - 1;
            }
    
            // this will contain the encoding for the value we are currently encoding.
            encodedInt = 0;
    
            // Only encode with previous when possible (to protect from when the base array (previous
            // array) has less elements than the new one we are trying to encode).
            if (i < previous.length) {
                // Use our "raw" int32 encoder which returns an int.
                encodedInt = EncodeInt32Raw(toEncode[i], previous[i]);

                // previous[i] == toEncode[i]: leave the current bit of the changeByte
                // at 0 and move on to the next iteration. This is the NO CHANGE scenario.
                if (encodedInt == 0) {
                    changeByteBitIndex--;
                    continue;
                }
            }
            // the new int array (toEncode) is longer than the base (previous) array
            // and we reached an index that exists in toEncode but not in previous.
            // This means we can't delta by index, we need to encode the int as is.
            else {
                // When we can't delta we still want the sign bit at the lsb to save on bits.
                // Leverage our int32 raw encoder for this: calling it with (0, toEncode[i])
                // yields abs(toEncode[i]) << 1 | sign, i.e. the value itself, sign at the lsb.
                encodedInt = EncodeInt32Raw(0, toEncode[i]);
            }
    
            // there is a change: set corresponding bit in changeByteIndex to 1
            updatedByteIdx = encodedList[changeByteIndex] | ((0x01 << changeByteBitIndex) & 0xFF);
            encodedList[changeByteIndex] = updatedByteIdx;
            changeByteBitIndex = changeByteBitIndex - 1; // move down by 1.
    
            // Get the byte rounded nb of bytes required to hold the encoded int.
            nbSignificantBits = 64 - CpuIntrinsics.numberOfLeadingZeros(encodedInt);
            nbBytesRequired = Math.max(1, (nbSignificantBits + 7) / 8);

            // this is to encode the nb of bytes required (1-4) in 2 bits. 00 == 1, 01 == 2...
            // (deltas wider than 4 bytes, possible at the extremes of the i32 range,
            // are not handled by this example schema.)
            nbBytesRequiredEncoded = nbBytesRequired - 1;
    
            // encode the nb of bytes required for the encoded int in the change byte
            // (shift by changeByteBitIndex - 1 so the 2 size bits land right after the change bit).
            updatedByteIdx = (encodedList[changeByteIndex] |
                    (nbBytesRequiredEncoded << (changeByteBitIndex - 1)) & 0xFF);
    
            // If you don't fully grasp the above, just know that what we are doing
            // is setting the bits in the changeByte that represent the size of the
            // encoding for the current int we are encoding.
            encodedList[changeByteIndex] = updatedByteIdx;
    
            // -2 because above we used 2 bits to describe the nb of bytes required.
            changeByteBitIndex = changeByteBitIndex - 2;
    
            // finally, encoding the value.
            nextByteValue = 0;
            idx = nbBytesRequired - 1;
            for (int j = 0; j < nbBytesRequired; j++) {
                nextByteValue = (encodedInt >> (8 * idx)) & 0xFF;
                encodedList.push(nextByteValue);
                idx = idx - 1;
            }
        }
        // We have iterated over all the ints of the "new" array that we want to
        // encode. We can now finish writing the header and return.
    
        subHeader = IntToBytes(encodedList.length);
        encodedList[4] = subHeader[0];
        encodedList[5] = subHeader[1];
        encodedList[6] = subHeader[2];
        encodedList[7] = subHeader[3];
        return encodedList;
    }
                

    Phew! We got it. Hopefully that makes sense; I understand it can be a bit of a headache. Now for the decoder...

    pseudo-code - Decoder for arrays of i32's
     
    // Takes in the encoding (byte[]) and the base i32[]. Returns our initially encoded i32[].
    fn DecodeIntArray(toDecode: byte[], previous: i32[]) -> i32[] {
    
        // no changes! new == base. So return base (previous).
        if (toDecode.length == 1 && toDecode[0] == 0) return previous;
    
        // read the first section of the header, to know how many elements we need to decode.
        nbItems = BytesToInt(toDecode[0..3]);
        i32[] decodedList; // dynamic array (in a real scenario, reserve capacity nbItems up front).
    
        // the first change byte starts at index 8 (right after the header) as we know.
        changeByteIndex = 8;
        changeByte = toDecode[changeByteIndex];

        // The first encoded value (if any) starts right after the first change byte.
        index = changeByteIndex + 1;
        changeByteBitIndex = 7; // last bit in byte from pos 0
    
        // iterate nbItems times: the exact nb of items we will decode.
        for (int i = 0; i < nbItems; i++) {
    
        // hop to the next change byte if needed.
        if (changeByteBitIndex < 2 || changeByteBitIndex > 7) {
            changeByteBitIndex = 7;

            // The encoder pushes the next change byte right after the last value
            // byte it wrote, so the next change byte is always the next unread
            // byte (this also holds when the previous change byte described only
            // unchanged values and no value bytes were consumed).
            changeByteIndex = index;
            changeByte = toDecode[changeByteIndex];
            index = index + 1;
        }
    
            currentChangeBit = (changeByte >> changeByteBitIndex & 0x01) & 0xFF;
            nbBytesRequired = 0;
    
            // implies there was a change, read the pseudo header at current
            // position in the change byte which tells us the length of the encoding.
            if (currentChangeBit != 0) {
    
                // When a change bit is set to 1, get the next 2 bits in the
                // change byte which inform about the width of the
                // encoding. Do +1 since 00 == 1, 01 == 2..
            nbBytesRequired = (changeByte >> (changeByteBitIndex - 2)) & 0b0000_0011;
                nbBytesRequired = nbBytesRequired + 1;
    
                changeByteBitIndex = changeByteBitIndex - 3;
            }
    
            // will contain the encoding for the current value we try to decode.
            encoded = 0;
    
        // situation where there could be a delta (this index exists in
        // both the new array and the base/previous array).
        if (i < previous.length) {
            // no change, so simply put what was in the base/previous array at
            // index i into the new array.
                if (currentChangeBit == 0) {
                    decodedList.push(previous[i]);
                    changeByteBitIndex--;
                    continue;
                }
                for (int j = 0; j < nbBytesRequired; j++) {
                    encoded = encoded << 8 | (toDecode[index]) & 0xFF;
                    index = index + 1;
                }
            decodedList.push(DecodeInt32Raw(encoded, previous[i]));
                continue;
            }
    
            for (int j = 0; j < nbBytesRequired; j++) {
                encoded = encoded << 8 | (toDecode[index]) & 0xFF;
                index = index + 1;
            }
    
            // remove encoded sign bit
            delta = encoded >> 1;
            if ((encoded & 0x01) == 1) {
                // nb is negative
                delta = delta * -1;
            }
            decodedList.push(delta);
        }
    
        return decodedList;
    }
                

    Alright we got it. An encoder and decoder for i32[]!
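
    To sanity check the schema, here is a small hand-worked trace of what the encoder above should produce for a concrete pair of arrays:

    Worked example - previous = [5, 5, 7], toEncode = [5, 9, 7]
    header        -> [3, 0, 0, 0]      // 3 items to decode
    sub header    -> [10, 0, 0, 0]     // the whole encoding is 10 bytes
    change byte   -> 0b0100_0000       // index 0 unchanged, index 1 changed
                                       // (size bits 00 = 1 byte), index 2 unchanged
    value bytes   -> [0x09]            // delta 5 - 9 = -4 -> (4 << 1) | 1 = 9
    full encoding -> [3, 0, 0, 0, 10, 0, 0, 0, 0x40, 0x09]

    Decoding walks the same steps in reverse: indexes 0 and 2 are copied straight from the previous array, and index 1 is restored as previous[1] + 4 = 9.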



    Strategy for algebraic data types (structs/classes)

    // todo!();

    This article is getting quite long. A part 2 will come at some point.


    Uploaded 09-11-2024