Musing on VCS storage formats

Created on 2023-05-11T14:00:19-05:00

One thing i worry about with pijul is that its data structures are highly advanced, and the logic for working with them is similarly mathematical.

i worry slightly about having incredibly advanced formats for things like this :blobcatterrified:

pijul has a whole bespoke key/value store, plus complex commutation and conflict-resolution logic, all of which is significantly more complicated than what a changeset amounts to.

Git

Git's pack files are not actually so special. Each folder is collapsed into a flat listing that tells you which other folders and files exist within it, and their IDs. You match those IDs against index and pack files to figure out which fragments of bits you need to reconstitute the directory. People can and have imitated this in other tools and it works fine.
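
A toy sketch of that lookup, to show how little is going on. This is not Git's real on-disk encoding: the `store` dict, the made-up IDs like `t1` and `b2`, and the `read_path` helper are all assumptions for illustration. Each folder is just a listing of names pointing at IDs, and you follow the IDs until you hit file contents.

```python
# Hypothetical in-memory object store: ID -> folder listing or file bytes.
# Real Git uses content hashes as IDs; these short IDs are invented.
store = {
    "t1": {"README": ("blob", "b1"), "src": ("tree", "t2")},
    "t2": {"main.c": ("blob", "b2")},
    "b1": b"hello\n",
    "b2": b"int main(void) { return 0; }\n",
}


def read_path(store, root_id, path):
    """Follow a slash-separated path from a root folder ID down to an object."""
    obj_id = root_id
    for name in path.split("/"):
        # Each listing entry says what kind of thing it is and what its ID is.
        _kind, obj_id = store[obj_id][name]
    return store[obj_id]


contents = read_path(store, "t1", "src/main.c")
```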

Pack files are probably easier to pilfer from than SCCS weaves. You just need to go reach in at certain positions to reconstruct files. There is no complex connection of lines that are active and inactive--the "activation set" is a vector of "go here, take this many bytes" commands.
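
Roughly what that vector of commands looks like in practice, as a minimal sketch. Git's deltas do boil down to "copy from the base" and "insert these literal bytes" instructions, but the tuple representation and the `reconstruct` name here are assumptions for illustration, not the packed binary encoding.

```python
def reconstruct(base: bytes, commands) -> bytes:
    """Rebuild a file from a base object and a list of copy/insert commands."""
    out = bytearray()
    for cmd in commands:
        if cmd[0] == "copy":
            # "go here, take this many bytes"
            _, offset, length = cmd
            out += base[offset:offset + length]
        elif cmd[0] == "insert":
            # literal bytes that do not exist in the base
            out += cmd[1]
        else:
            raise ValueError(f"unknown command: {cmd[0]!r}")
    return bytes(out)


base = b"hello world\n"
commands = [("copy", 0, 6), ("insert", b"there"), ("copy", 11, 1)]
result = reconstruct(base, commands)
```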

SCCS, BitKeeper

The SCCS Weave is freaky, but apparently it's not actually that complicated. You have serial numbers which identify blocks of content. You pick a set of these serial numbers to be "active." Then to "get" a file you walk through the weave line by line, and if a block is active you apply it to the file in progress. If a block "adds" to a file and is active then you start accumulating those lines to your output. If a block "removes" from a file then you omit those lines *if* the removal block is active. A surrounding "delete" block is only permitted to wipe out data with a lower serial number though--a guard that makes creating the weaves easier.
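
The walk described above fits in a few lines. This is a simplified sketch and not the real SCCS file format: the weave is assumed to be a Python list where `("I", n)`, `("D", n)`, and `("E", n)` tuples open an insert block, open a delete block, and close a block for serial `n`, and plain strings are text lines. The rule used (emit a line when every enclosing insert is active and no enclosing delete is active) is my reading of the description, and `get` is a made-up helper name.

```python
def get(weave, active):
    """Walk the weave and return the lines visible for a set of active serials."""
    open_blocks = []  # (kind, serial) pairs for blocks we are currently inside
    out = []
    for item in weave:
        if isinstance(item, tuple):
            kind, serial = item
            if kind == "E":
                # Close the most recently opened block with this serial.
                for i in range(len(open_blocks) - 1, -1, -1):
                    if open_blocks[i][1] == serial:
                        del open_blocks[i]
                        break
            else:
                open_blocks.append((kind, serial))
        else:
            inserted = all(s in active for k, s in open_blocks if k == "I")
            deleted = any(s in active for k, s in open_blocks if k == "D")
            if inserted and not deleted:
                out.append(item)
    return out


# Serial 1 adds three lines; serial 2 replaces "beta" with "BETA".
weave = [
    ("I", 1),
    "alpha",
    ("D", 2),
    "beta",
    ("E", 2),
    ("I", 2),
    "BETA",
    ("E", 2),
    "gamma",
    ("E", 1),
]

old_version = get(weave, {1})
new_version = get(weave, {1, 2})
```

Both versions come out of the same weave; only the active set changes.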

This is a weirdly fascinating format. Apparently, it holds up very well under the load of *text editing* tasks. Putting binaries in there is right out--though some people did horrible things like uuencode a file and shove it in a weave.

Doesn't really matter (Mercenaries)

i guess in a way it doesn't matter. these are finicky permaculture concerns, and in a mercenary context ain't nobody got time to make sure the history of a project is in some beautiful format with an indefinite lifespan :blobcatbean: