Two for the Price of One: A Model for Parallel and Incremental Computation
Created on 2023-11-02T01:05:26-05:00
Basically you break code into tasklets. A fork/join splits part of a job off into a new tasklet, named after its parent: task "t" creates revisions "t.1" and "t.2", and those may fork their own tasklets, "t.1.1" and so on. Each tasklet records the protected values it reads and writes. The result of a tasklet is a Future that yields a summary tree. Joining a future checks that its contracts are still valid (the dependent values are what the tasklet expected) and, if so, just copies out the computed values. Otherwise the code has to be run again (serially).
Even if a tasklet is re-executed, its forks might be skippable. As long as the revisions are the same, the summary tree can be checked and the join replayed without recomputation.
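A minimal Python sketch of the fork/record/join cycle described above. This is my own illustration, not the paper's implementation: the names `Tasklet`, `Summary`, `record`, `fork` and `join` are hypothetical, a plain dict stands in for the protected resources, and a thread pool stands in for whatever scheduler the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Summary:
    """Hypothetical summary: what a tasklet read and what it wrote."""
    name: str
    reads: dict = field(default_factory=dict)   # key -> value seen at first read
    writes: dict = field(default_factory=dict)  # key -> last value written


class Tasklet:
    def __init__(self, name, snapshot):
        self.name = name
        self._snapshot = snapshot
        self._forks = 0
        self.summary = Summary(name)

    def child_name(self):
        # Revision names derive from the parent: "t" -> "t.1", "t.2", ...
        self._forks += 1
        return f"{self.name}.{self._forks}"

    def read(self, key):
        # A value this tasklet wrote itself is not a dependency on the outside.
        if key in self.summary.writes:
            return self.summary.writes[key]
        return self.summary.reads.setdefault(key, self._snapshot[key])

    def write(self, key, value):
        self.summary.writes[key] = value


def record(name, snapshot, body):
    """Run body while recording its reads and writes; return the summary."""
    tasklet = Tasklet(name, snapshot)
    body(tasklet)
    return tasklet.summary


def fork(pool, parent, store, body):
    """Start body on its own timeline; the Future yields its summary."""
    return pool.submit(record, parent.child_name(), dict(store), body)


def join(future, store, body):
    """Copy out the results if the recorded reads still hold, else rerun serially."""
    summary = future.result()
    if all(store.get(k) == v for k, v in summary.reads.items()):
        store.update(summary.writes)
        return "replayed"
    store.update(record(summary.name, store, body).writes)
    return "re-executed"
```

The join compares every recorded read against the current store, so a tasklet whose inputs changed between fork and join is rerun on the joining thread.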
---
Fork: split into parallel timelines at this point
Join: force two timelines to merge
Summary tree: holds the list of dependencies and outcomes across protected resources within a given recording. While recording, any read from a protected resource logs the value (at read time) and any write logs the outcome.
Dependency: a value assertion which must be true for a summary to be correct. A dependency may be internal or external.
External: a value that must still be what it was before recording began
Internal: a value that must still be what was set earlier in the recording
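One way to picture the internal/external split, as a hedged Python sketch. The `Recording` class and its fields are hypothetical names of mine. The assumption is that a read counts as internal when the same recording wrote the key earlier, and as external otherwise.

```python
class Recording:
    """Hypothetical recorder that classifies each read as it happens."""

    def __init__(self, store):
        self.store = store    # state as it was before recording began
        self.written = {}     # values set earlier in this recording
        self.deps = []        # (kind, key, value) assertions
        self.outcomes = {}    # key -> value written

    def read(self, key):
        if key in self.written:
            # Depends on something set earlier in the recording.
            value = self.written[key]
            self.deps.append(("internal", key, value))
        else:
            # Depends on state that existed before recording began.
            value = self.store[key]
            self.deps.append(("external", key, value))
        return value

    def write(self, key, value):
        self.written[key] = value
        self.outcomes[key] = value
```

Reading "a" from the store and then reading back a "b" that the recording wrote itself would log one external and one internal dependency, in that order.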
External validation checks that all external contracts for a summary tree still hold. If they do not, the summary and all of its parents are invalidated.
When re-executing, forks within that path are still cached. Look up the summary tree for that point and if all the contracts hold--congrats, just assign values to the recorded outputs and you're done.
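The cached-fork lookup on re-execution could look roughly like this in Python. It is a sketch under my own assumptions: the `fork` helper, the `cache` dict keyed by revision name, and the `read`/`write` callbacks handed to the body are all hypothetical, and everything runs serially to keep the example small.

```python
def fork(cache, store, name, body):
    """Run body under revision `name`, reusing its cached summary when valid."""
    cached = cache.get(name)
    if cached is not None:
        reads, writes = cached
        # All contracts hold: assign the recorded outputs and skip the work.
        if all(store.get(k) == v for k, v in reads.items()):
            store.update(writes)
            return "cached"

    reads, writes = {}, {}

    def read(key):
        if key in writes:
            return writes[key]
        return reads.setdefault(key, store[key])

    def write(key, value):
        writes[key] = value

    body(read, write)
    store.update(writes)
    cache[name] = (reads, writes)  # remember the summary for this revision
    return "executed"
```

If a parent that forks "t.1" and "t.2" is re-executed after only the input of "t.2" changed, "t.1" is served from the cache and only "t.2" runs again.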
Internal validation checks values that were written prior to a given summary tree. Instead of invalidating parent trees, you must invalidate any sibling trees (to avoid violating causality).
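The two invalidation directions can be sketched in Python as below. The `Node` class and both functions are hypothetical names of mine. I am also assuming that a failed internal check invalidates the node itself along with its siblings; the notes only say "sibling trees", so treat that detail as a guess.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical summary-tree node with its two kinds of contracts."""
    name: str
    external: dict = field(default_factory=dict)  # key -> value expected from before recording
    internal: dict = field(default_factory=dict)  # key -> value expected from earlier writes
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)
    valid: bool = True

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def holds(deps, values):
    return all(values.get(k) == v for k, v in deps.items())


def validate_external(node, store):
    """On failure, invalidate this summary and every parent above it."""
    if holds(node.external, store):
        return True
    current = node
    while current is not None:
        current.valid = False
        current = current.parent
    return False


def validate_internal(node, earlier_writes):
    """On failure, invalidate this node and its siblings, but not the parent."""
    if holds(node.internal, earlier_writes):
        return True
    siblings = node.parent.children if node.parent else [node]
    for sibling in siblings:
        sibling.valid = False
    return False
```

A failed external check walks upward and leaves siblings alone, while a failed internal check spreads sideways and leaves the parent alone.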