Prediction by partial matching
Created on 2023-05-20T23:09:50-05:00
Uses the last few symbols as a context and counts how often each symbol has followed that context in the input seen so far.
For example, PPM(4) uses the four preceding bytes as context and tracks the frequency of each byte that has followed those four bytes.
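A minimal sketch of the counting step, assuming the window is read as "the bytes just before the symbol being predicted". The name count_contexts and the sample input are made up for illustration, and a real PPM coder updates these counts incrementally while coding rather than in one pass.

```python
from collections import Counter, defaultdict


def count_contexts(data: bytes, order: int) -> dict:
    """Map each `order`-byte context to a Counter of the bytes that followed it."""
    counts = defaultdict(Counter)
    for i in range(order, len(data)):
        context = data[i - order:i]   # the `order` bytes before position i
        counts[context][data[i]] += 1  # data[i] is an int byte value
    return counts


# Order 2 keeps the example small; PPM(4) would use order=4.
counts = count_contexts(b"abracadabra", 2)
ab_followers = counts[b"ab"]  # "ab" is followed by "r" both times it appears
```

Here ab_followers[ord("r")] is 2, so after seeing "ab" this table would predict "r" with high confidence.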
The resulting frequencies are then fed to an arithmetic coder, which spends fewer bits on more common sequences and more bits on less common ones.
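To make the bit cost concrete: an ideal arithmetic coder spends close to -log2(p) bits on a symbol it was told has probability p. The sketch below only computes that ideal cost from counts. It is not an arithmetic coder, and the name ideal_bits is hypothetical.

```python
import math


def ideal_bits(count: int, total: int) -> float:
    """Ideal code length in bits for a symbol seen `count` times out of `total`."""
    return -math.log2(count / total)


common = ideal_bits(8, 16)  # seen half the time
rare = ideal_bits(1, 16)    # seen once in sixteen
```

With these counts the common symbol costs 1 bit and the rare one costs 4 bits.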
Unknown symbols
Some variants treat encountering an unknown symbol as a symbol in its own right, often called an escape symbol.
Laplacian estimation gives "never seen" a fixed count of 1 at all times, as if it had been encountered exactly once.
PPMd increments a "never seen" counter for each novel symbol, so it learns the probability of encountering a symbol it has not seen before.
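The two estimates above can be compared with a small sketch. The class names are hypothetical, the starting value of the learned counter is an assumption, and the second class only mirrors the behaviour described in this note, not the actual PPMd implementation.

```python
from collections import Counter


class FixedEscape:
    """Laplace-style: "never seen" always counts as exactly one occurrence."""

    def __init__(self):
        self.counts = Counter()

    def update(self, symbol):
        self.counts[symbol] += 1

    def escape_probability(self):
        return 1 / (sum(self.counts.values()) + 1)


class LearnedEscape:
    """The "never seen" counter grows each time a novel symbol shows up."""

    def __init__(self):
        self.counts = Counter()
        self.escape_count = 1  # assumption: start at 1 so the estimate is never zero

    def update(self, symbol):
        if symbol not in self.counts:
            self.escape_count += 1  # a new symbol is evidence that novelty is likely
        self.counts[symbol] += 1

    def escape_probability(self):
        return self.escape_count / (sum(self.counts.values()) + self.escape_count)


fixed, learned = FixedEscape(), LearnedEscape()
for symbol in "abcabc":
    fixed.update(symbol)
    learned.update(symbol)

p_fixed = fixed.escape_probability()      # 1 / (6 + 1)
p_learned = learned.escape_probability()  # 4 / (6 + 4)
```

On this input the fixed estimate stays small regardless of how many distinct symbols appeared, while the learned estimate rises because three of the six symbols were new when first seen.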