LZ is even better. Neither arithmetic nor Huffman will compress when the probability of all symbols is the same, but LZ will find repetitions easily. LZ also decompresses extremely quickly --- faster than memcpy is often mentioned.
> LZ is even better. Neither arithmetic nor Huffman will compress when the probability of all symbols is the same
Comparing LZ to arithmetic encoding is a category error. LZ and Huffman are combined modeling+encoding methods, while arithmetic is just an encoding method, and it can be combined with any modeling technique. Arithmetic plus a suitable modeling technique will achieve compression as good as LZ, Huffman, or any other scheme. The PAQ8 compressors, and I believe its successors in the Hutter Prize ranking, use arithmetic plus a very advanced modeling scheme.
Indeed, but worth noting that LZ is a modelling scheme, whilst Huffman is a coding technique.
That is, LZ determines, dynamically as it goes, what are all the elements we want to encode and their probabilities. Then you need a coder, like Huffman, to actually encode it.
In the post I used a semi-static zero-order byte-based model.
Which means I counted the byte occurrences first and just used that count for the probabilities throughout all of the encoding. Then I used Huffman codes to translate those probabilities into bits.
But I'm considering writing a follow-up changing this static model for an LZ77 one as I think that would be fun.