The patience diff algorithm

Created on 2021-10-21T11:27:16-05:00

Basically a way of pre-processing a text document for difference analysis; by anchoring the sections around fixed points the patches will be more human readable and cleaner in the face of more awkward inputs.

Consume all lines from the front of a block that match on left and right

Consume all lines from back of a block that match on left and right

Collect lines which appear only once on both left and right

Run Least Common Subsequence on the list of unique lines to "line them up"

Iterate over each block between the unique lines and repeat step 1 and 2 to shrink them.

Any longest increasing subsequence algorithm will do; but the algorithm is named from the Patience Sort.

Patience Sort
Bram Cohen's original blog post on patience diff