An Introduction to Memento (Simple C Debugging Library)
Created on 2025-02-28T10:33:30-06:00
- Put barriers around allocated memory and check their integrity periodically
- Mark call site for alloc/deallocs
- Count alloc/deallocs with a serial number (event ID)
- Run tests every so many events
- When something goes wrong, repeat the same job with smaller and smaller test windows
Memento assumes a workflow is deterministic. You run the job again from scratch with more frequent test windows to pin down where a corruption event occurs.
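The narrowing loop can be sketched as a small simulation. This is not Memento code: run_job, pin_down and struct window are hypothetical names, and the "job" is just a loop in which corruption appears at a fixed event. Each rerun starts checking after the last check known to be clean and uses a smaller interval, until the window is a single event.

```c
#include <assert.h>

/* Result of one simulated run: the last clean check and the first bad one. */
struct window {
    unsigned long last_good;
    unsigned long first_bad;   /* 0 means no check ever failed */
};

/* Simulated deterministic job of `total` events in which memory becomes
 * corrupt at event `corrupt_at`. Checks run every `interval` events after
 * event `from`, plus one final check at the last event. */
static struct window run_job(unsigned long corrupt_at, unsigned long total,
                             unsigned long from, unsigned long interval)
{
    struct window w = { from, 0 };
    unsigned long ev;

    assert(interval > 0);
    for (ev = 1; ev <= total; ev++) {
        int due = (ev > from && (ev - from) % interval == 0) || ev == total;
        if (!due)
            continue;
        if (ev >= corrupt_at) {
            w.first_bad = ev;
            break;
        }
        w.last_good = ev;
    }
    return w;
}

/* Rerun the job from scratch with ever smaller check intervals until the
 * corrupting event is isolated. Returns that event, or 0 if none found. */
static unsigned long pin_down(unsigned long corrupt_at, unsigned long total,
                              unsigned long interval)
{
    unsigned long from = 0;

    for (;;) {
        struct window w = run_job(corrupt_at, total, from, interval);
        if (w.first_bad == 0)
            return 0;
        if (w.first_bad - w.last_good == 1)
            return w.first_bad;
        from = w.last_good;                         /* known clean up to here */
        interval = interval > 10 ? interval / 10 : 1;
    }
}
```

With corruption at event 347 and an initial interval of 100, the first run reports the window 300..400, the second 340..350, and the third isolates event 347. This only works because every rerun produces the same event numbering.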
Quotes
Memento consists of a single .c file, and a single .h file. It will work on any platform that supports standard C.
As it runs, Memento will keep track of each block that is allocated and freed, and will periodically check those blocks for corruption.
The techniques used here are lo-tech:
malloc/free/realloc/calloc etc are intercepted, not using any clever compiler-specific techniques, but by using macros.
Every time one of those calls is made, Memento counts that as an ‘event’, and keeps track of the number of events that have passed so far.
When a block is allocated (or freed), the event number is stored with the history for that block.
Blocks are ‘over-allocated’, with extra guard bytes at the start and end. That space is filled with known values, so that under/overruns can be detected.
Freed blocks are kept around for a while, and are similarly filled with known values, so Memento can watch for ‘write after free’.
At the end of execution, Memento outputs some statistics about memory use (total allocations/frees/reallocs, peak usage etc), and will list any outstanding blocks.
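A minimal sketch of the macro interception and event counting described above. The names dbg_malloc, dbg_free and dbg_event_count are made up for illustration and are not Memento's identifiers. The real library also stores the event number in each block's history, which this sketch only logs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical event counter: every intercepted call bumps it by one. */
static unsigned long dbg_event_count = 0;

/* Wrapper that counts the event and logs the call site before allocating. */
static void *dbg_malloc(size_t size, const char *file, int line)
{
    void *p;

    assert(file != NULL);
    p = malloc(size);
    dbg_event_count++;
    fprintf(stderr, "event %lu: malloc(%zu) at %s:%d -> %p\n",
            dbg_event_count, size, file, line, p);
    return p;
}

/* Wrapper that counts the event and logs the call site before freeing. */
static void dbg_free(void *p, const char *file, int line)
{
    assert(file != NULL);
    dbg_event_count++;
    fprintf(stderr, "event %lu: free(%p) at %s:%d\n",
            dbg_event_count, p, file, line);
    free(p);
}

/* No compiler tricks: from this point on, plain malloc/free calls in the
 * source are rerouted to the wrappers by the preprocessor. */
#define malloc(size) dbg_malloc((size), __FILE__, __LINE__)
#define free(ptr)    dbg_free((ptr), __FILE__, __LINE__)
```

The macros have to be defined after the wrappers, otherwise the wrappers would call themselves. Code compiled after this point needs no changes to be tracked.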
So, at its very simplest, just running a Memento build will give you confidence that your code isn’t leaking, and isn’t obviously overrunning/underrunning or otherwise corrupting memory that it shouldn’t be.
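A sketch of how guard bytes and freed-block filling can detect the three kinds of corruption mentioned: underrun, overrun and write after free. The guard size, the fill values and every name here (struct tracked, tracked_alloc and so on) are assumptions made for illustration; Memento's own layout and patterns may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16     /* assumed; a multiple of 16 keeps user data aligned */
#define GUARD_BYTE 0xA5   /* assumed fill value for the guard regions */
#define FREED_BYTE 0xDD   /* assumed fill value for released blocks */

enum { BLOCK_OK, BLOCK_UNDERRUN, BLOCK_OVERRUN, BLOCK_WRITE_AFTER_FREE };

/* One tracked block. A real tool would keep this in a header or a list. */
struct tracked {
    unsigned char *raw;   /* start of the over-allocated region */
    size_t size;          /* bytes the caller asked for */
    int freed;            /* nonzero once the caller has released it */
};

/* Over-allocate and fill both guard regions. Returns 0 on success. */
static int tracked_alloc(struct tracked *t, size_t size)
{
    t->raw = malloc(size + 2 * GUARD_SIZE);
    if (t->raw == NULL)
        return -1;
    t->size = size;
    t->freed = 0;
    memset(t->raw, GUARD_BYTE, GUARD_SIZE);
    memset(t->raw + GUARD_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return 0;
}

/* Pointer the caller actually uses, just past the leading guard. */
static void *tracked_user(const struct tracked *t)
{
    return t->raw + GUARD_SIZE;
}

/* "Free" from the caller's point of view: keep the memory, fill it. */
static void tracked_release(struct tracked *t)
{
    memset(t->raw + GUARD_SIZE, FREED_BYTE, t->size);
    t->freed = 1;
}

static int all_bytes(const unsigned char *p, size_t n, unsigned char v)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (p[i] != v)
            return 0;
    return 1;
}

/* Verify the guards, and the fill pattern if the block was released. */
static int tracked_check(const struct tracked *t)
{
    assert(t->raw != NULL);
    if (!all_bytes(t->raw, GUARD_SIZE, GUARD_BYTE))
        return BLOCK_UNDERRUN;
    if (!all_bytes(t->raw + GUARD_SIZE + t->size, GUARD_SIZE, GUARD_BYTE))
        return BLOCK_OVERRUN;
    if (t->freed && !all_bytes(t->raw + GUARD_SIZE, t->size, FREED_BYTE))
        return BLOCK_WRITE_AFTER_FREE;
    return BLOCK_OK;
}

/* Really hand the memory back to the system allocator. */
static void tracked_destroy(struct tracked *t)
{
    free(t->raw);
    t->raw = NULL;
}
```

Because a released block stays allocated until tracked_destroy, a stray write through a stale pointer lands in memory the checker still owns, so it shows up as a changed fill byte rather than as undefined behaviour.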
Well, reversible debuggers (like rr and UndoDB) are a pretty new thing.
So if I run my program once, and block number 345 leaks, I know that if I run it again, block 345 will leak again.
Accordingly, I can run the program under a debugger and ask Memento to stop when it gets to the allocation I am interested in.
As the program runs Memento keeps a count of ‘events’ that have occurred (mallocs, reallocs, frees etc). Every so often, Memento will trigger a sanity check of all the memory it knows about.
This probably means that block 87 has had a buffer overrun. We know it was OK when the system last checked on event 240, but now (at event 347) it’s corrupt.
So, at some point between event 240 and event 347, someone has written somewhere they shouldn’t.
Well, sadly, running through all the allocated memory (including the memory we keep on the free list) checking for corruption, turns out to be a bit too slow for us to do every time.
In order to allow us to tune this, we have a control, called the ‘paranoia’ level.
If we’re paranoid (paranoia = 1), then we run a check every single event. If we’re less paranoid (say, paranoia = 100), then we run a check every 100 events.
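The check cadence can be sketched in a few lines. The variable and function names are hypothetical, and check_all_blocks is a stub that only counts how often a full check would have run.

```c
#include <assert.h>

static unsigned long event_count = 0;      /* events seen so far */
static unsigned long paranoia = 100;       /* run a check every `paranoia` events */
static unsigned long checks_run = 0;       /* how many full checks have run */
static unsigned long last_check_event = 0; /* event number of the latest check */

/* Stub for the expensive walk over every live and freed block. */
static void check_all_blocks(void)
{
    checks_run++;
    last_check_event = event_count;
}

/* Called once per malloc/realloc/free and so on. */
static void record_event(void)
{
    assert(paranoia > 0);
    event_count++;
    if (event_count % paranoia == 0)
        check_all_blocks();
}
```

With paranoia = 100, the first 347 events trigger three checks, at events 100, 200 and 300. Setting paranoia = 1 makes every later event run a check, which is the slow but precise end of the trade-off.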
Accordingly, Memento can be told to impose a maximum limit in which it should work. This can either be done by setting the MEMENTO_MAXMEMORY environment variable to the amount in bytes, or by the same amount being passed to Memento using the Memento_setMax() call.
Memento can certainly do this, by setting the MEMENTO_FAILAT environment variable to the number of allocations at which we should start to fail before running the program each time.
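A sketch of the idea behind a "fail at" counter, under two assumptions: that "start to fail" means the chosen allocation and every later one return NULL, and that the threshold comes from an environment variable. SKETCH_FAILAT and the function names are hypothetical stand-ins, not Memento's implementation.

```c
#include <assert.h>
#include <stdlib.h>

static unsigned long alloc_count = 0;  /* allocations attempted so far */
static unsigned long fail_at = 0;      /* 0 means never fail on purpose */

/* Read the threshold from a (hypothetical) environment variable. */
static void failat_from_env(void)
{
    const char *s = getenv("SKETCH_FAILAT");
    fail_at = (s != NULL) ? strtoul(s, NULL, 10) : 0;
}

/* Allocation wrapper that simulates running out of memory from
 * allocation number `fail_at` onwards. */
static void *failing_malloc(size_t size)
{
    alloc_count++;
    assert(alloc_count != 0);          /* the counter must not wrap */
    if (fail_at != 0 && alloc_count >= fail_at)
        return NULL;
    return malloc(size);
}
```

Running the program repeatedly with the threshold set to 1, 2, 3 and so on exercises each out-of-memory path in turn, at the cost of one full run per value.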
When called with the MEMENTO_SQUEEZEAT environment variable set to n, the program will run through until it is about to perform the nth allocation. At this point, the program will fork(), producing 2 identical processes, which we’ll call the parent and the child.
In this way, the test of each successive operation only takes an incremental amount more, rather than having to repeat the entire run each time.
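A POSIX-only sketch of the fork-based approach. The quoted text says only that the process forks into a parent and a child. The division of labour here is an assumption: the child sees this and every later allocation fail, while the parent waits for it and then carries on normally. All names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static unsigned long squeeze_allocs = 0;   /* allocations attempted so far */
static unsigned long squeeze_at = 0;       /* 0 disables squeezing */
static int squeeze_in_child = 0;           /* set in the forked child only */
static int squeeze_children_failed = 0;    /* children that did not exit cleanly */

static void *squeeze_malloc(size_t size)
{
    squeeze_allocs++;
    if (squeeze_in_child)
        return NULL;                       /* child: memory stays exhausted */

    if (squeeze_at != 0 && squeeze_allocs >= squeeze_at) {
        pid_t pid;

        fflush(NULL);                      /* avoid duplicated buffered output */
        pid = fork();
        if (pid == 0) {
            /* Child: test how the program copes with this failure. */
            squeeze_in_child = 1;
            return NULL;
        }
        if (pid > 0) {
            /* Parent: wait for the verdict, then continue as normal. */
            int status = 0;
            pid_t waited = waitpid(pid, &status, 0);
            assert(waited == pid);
            if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                squeeze_children_failed++;
        }
        /* If fork() itself failed, just carry on without testing. */
    }
    return malloc(size);
}
```

Each child inherits the parent's state at the moment of the fork, so the work done before the nth allocation is never repeated. The parent simply moves on to the next allocation and forks again.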
https://github.com/ArtifexSoftware/memento