The patience diff algorithm
Created on 2021-10-21T11:27:16-05:00
Basically a way of pre-processing a text document for difference analysis; by anchoring the sections around fixed points the patches will be more human readable and cleaner in the face of more awkward inputs.
Consume all lines from the front of a block that match on left and right
Consume all lines from back of a block that match on left and right
Collect lines which appear only once on both left and right
Run Least Common Subsequence on the list of unique lines to "line them up"
Iterate over each block between the unique lines and repeat step 1 and 2 to shrink them.
Any longest increasing subsequence algorithm will do; but the algorithm is named from the Patience Sort.