Comments

This booklet is a work in progress. It is partly inspired by a fae mood and the article Win32 Is The Only Stable ABI on Linux. Ideas are pulled from the defunct Our Machinery game engine's tech talks about plugins, experience with LV2 plugin development, Khronos APIs, Windows, and personal bruh moments.

This booklet should help you understand how to design your libraries so they will "never break" and, stars willing, continue to run code developed twenty years ago that never dreamed of what machines do now (whatever that happens to be.)

There is an iceworks-devel mailing list on sr.ht where you can e-mail comments about this book.

Changelog

This version of the book is based upon v0.0.3.

All notable changes to this project will be documented in this file.

[0.0.3] - 2023-07-26

Features

  • Mention deprecate, soft and hard removal in a chapter separate to semver
  • Case study of how CLAP uses factories

Miscellaneous Tasks

  • One sentence per line in factories.md

Refactor

  • Rewrite description of factories

[0.0.2] - 2022-08-31

Bug Fixes

  • Include release version in the book's changelog

Features

  • Add changelog to the book's front matter
  • Versioning of the symbol names themselves

Miscellaneous Tasks

  • Task runner to generate changelog prior to building the book

[0.0.1] - 2022-08-31

Features

  • Wrote the entire first draft mostly from memory

Miscellaneous Tasks

  • Add git-cliff config for future changelog shenanigans

Semantic Versioning

Semver is a structured practice for versioning software. In short:

  • Major versions denote breaking ABI changes.
  • Minor versions denote compatible ABI changes.
  • Patches denote changes such as bugfixes that can be slipped in place.

Software is generally expected to keep working against any release within the same major version, with the exception of major version zero. "Version zero" is reserved for software that carries no stability guarantees.
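A minimal sketch of that rule as a compatibility check. The abi_version_t type and the abi_compatible function are made up for illustration; the exact-match treatment of version zero is one conservative reading of "no stability guarantees", not something semver dictates.

```c
#include <stdbool.h>

/* Hypothetical version triple; the names are illustrative only. */
typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
} abi_version_t;

/* Returns true when a library at version `have` can serve a client
 * that was built against version `want`. */
static bool abi_compatible(abi_version_t have, abi_version_t want) {
    /* Major version zero carries no guarantees, so this sketch only
     * trusts an exact match (an assumption, not a semver rule). */
    if (have.major == 0 || want.major == 0) {
        return have.major == want.major
            && have.minor == want.minor
            && have.patch == want.patch;
    }
    /* A different major version signals a breaking change. */
    if (have.major != want.major) return false;
    /* Minor versions only add things, so a newer library is fine,
     * but an older one may lack what the client expects. */
    return have.minor >= want.minor;
}
```

Patch numbers are ignored outside of version zero because patches are supposed to slip in place.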

Behaviors have to be supported in perpetuity until a major version change signals that a hard break is allowed. Until then, behaviors can be "soft removed" so that new applications cannot use them while old applications still get the compatibility path.

A major version may perform a hard break. A hard break means all of the old code is removed and the codebase can "break free" from cruft. In this case a fork of the old version must be maintained in order to run old software.

A major version may instead choose to mask all of the old symbols. The SDK only provides the new major version; the old symbols are still implemented for compatibility but are completely ripped out of the SDK so new applications cannot use them. Versioned symbols are a useful concept for this.

Stable vs Unstable Versions

Sometimes a major or minor version is used to indicate stability. For example, odd minor versions may mean an unstable version that is still in flux. Such a dependency is best left unused, used with a fallback to the stable version, or vendored. TBD: we should probably officially recommend this here since it's a very common convention (ex. historically used by the Linux kernel.)

Removal Process

Sometimes you get public APIs wrong. Sometimes technology radically changes from underneath you and you need to change things. You need a removal process to handle behaviors gracefully exiting from the codebase.

  • Software which has already been built may continue to use symbols which have been soft-removed.
  • New software is at first warned to stop using the behavior.
  • A later version removes it from the SDK.

Deprecation: An entity is marked as "deprecated." Documentation is adjusted to mention what should be done instead. Warnings are put in place to discourage continuing to use the deprecated entity. Deprecations remain for one or more minor releases.

Soft-removal: An entity is rendered inaccessible in the SDK. Continued use requires new identifiers that make it painfully obvious you are doing it wrong, or the identifiers may be removed from the SDK entirely. Old code continues to work but new code is unable to use the deprecated objects. A soft removal comes with a minor bump of the library's version.

Hard-removal: An entity is removed entirely. It no longer exists in the SDK or the underlying library. A hard removal comes with a major bump of the library's version. A library performing a hard removal will need to keep a fork of the older version around for old software to remain compatible.
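The first two stages can be sketched with a header and a little preprocessor work. Everything here is hypothetical: thing_sum, thing_sum2, and the THING_SDK_SOFT_REMOVED and THING_I_KNOW_THIS_IS_REMOVED macros are invented for the example, and the deprecated attribute is the GNU-style one that GCC and Clang understand.

```c
#include <stddef.h>

/* Deprecation marker; expands to nothing where the GNU attribute is
 * unavailable. */
#if defined(__GNUC__) || defined(__clang__)
#  define THING_DEPRECATED(msg) __attribute__((deprecated(msg)))
#else
#  define THING_DEPRECATED(msg)
#endif

/* ---- what the SDK header might declare ---- */

/* The replacement the documentation now points to. */
int thing_sum2(const int* items, size_t len);

#if !defined(THING_SDK_SOFT_REMOVED)
/* Deprecation: still declared, but the compiler warns on every use. */
THING_DEPRECATED("use thing_sum2 instead")
int thing_sum(const int* zero_terminated);
#elif defined(THING_I_KNOW_THIS_IS_REMOVED)
/* Soft removal: only visible to clients that explicitly opt in. */
int thing_sum(const int* zero_terminated);
#endif

/* ---- what the library keeps exporting until a hard removal ---- */

int thing_sum2(const int* items, size_t len) {
    int total = 0;
    for (size_t i = 0; i < len; i++) total += items[i];
    return total;
}

/* Old entry point, kept so already-built software keeps working. */
int thing_sum(const int* zero_terminated) {
    size_t len = 0;
    while (zero_terminated[len] != 0) len++;
    return thing_sum2(zero_terminated, len);
}
```

A later SDK release would define THING_SDK_SOFT_REMOVED in its header. A hard removal deletes both the declaration and the definition of thing_sum.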

Deprecation Time

Generally something being removed should be marked deprecated for one or more releases. Developers need time to hear of the notice, learn about the replacement, and make suitable adjustments (if their software is still alive and will be updated.) If dead software needs to be kept around then its users will need to get workarounds in place.

Deprecated code is eventually killed. It may survive in compatibility layers inaccessible to new compiles or it may require a fork of the library to be kept around to run that particular old program.

However, quite some time should pass between marking for deprecation and removal of the features.

Constants

The identifiers of constants can be reused but the values of old constants can never be reused.

Reusing identifiers should likely be avoided outside of a soft-break. You could end up with confusion as new compiles ask for new behaviors they are not aware of or cannot handle.

Deprecation

Mark the identifier as deprecated in documentation. If plausible also emit warnings in runtime or compile logs that a symbol is being used that no longer should be.

(Soft Removal) Rename the constant or enum member, such as by putting "_DEPRECATED" on the name, if your language cares about enum holes. Alternatively you can remove the member, but you will need to make sure the other members retain their values.

This will hard break new compiles but not affect software in the wild.

/* soft deletion */
typedef enum _kind {
    ctMD5_Broken = 1, /* _Broken discourages use */
    ctGoodHash = 2,
    ctPostQuantumHash = 3
} kind_e;

/* hard deletion */
typedef enum _kind {
    /* there is a hole here; but some languages hate that */
    ctGoodHash = 2,
    ctPostQuantumHash = 3
} kind_e;

Rationale

Constants are baked in at compile-time. So old software will always give the old values. New software can use the new values at the same identifiers.
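On the library side this means a retired value has to keep its meaning for as long as old binaries might pass it in. A small sketch, reusing the enum from above; kind_label and its return strings are invented for the example.

```c
#include <stddef.h>

typedef enum kind {
    ctMD5_Broken = 1,       /* soft-removed name; the value 1 stays taken */
    ctGoodHash = 2,
    ctPostQuantumHash = 3
} kind_e;

/* Returns a label for a hash kind, or NULL for values that were never
 * assigned. Takes an int because old binaries pass raw numbers. */
const char* kind_label(int kind) {
    switch (kind) {
    case 1: /* retired, but software in the wild still sends it */
        return "md5 (compatibility only)";
    case ctGoodHash:
        return "good hash";
    case ctPostQuantumHash:
        return "post-quantum hash";
    default:
        return NULL;
    }
}
```

Even after the name ctMD5_Broken is deleted from the SDK, the literal 1 in the switch keeps answering old callers, and no new constant may ever be assigned that value.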

Reserved Fields

A field should never be marked as reserved for future use unless the interface throws a hard error if it has a non-zero value.

#include <stdlib.h> /* abort, NULL */

typedef struct _thing {
    int value;
    void* reserved; /* probably a bad idea */
} thing_t;

void correct(thing_t* self) {
    /* ensure unused values are unused somehow */
    if (self->reserved != NULL) abort();

    switch (self->value) {
        /* goes on about normal business */
    }
}

void incorrect(thing_t* self) {
    /* never looks at self->reserved, so garbage goes unnoticed */
    switch (self->value) {
        /* goes on about normal business */
    }
}

Rationale

Some languages initialize all memory to zero. Others (C/C++) notoriously do not, meaning any field not of interest to the developer making the call is likely to be filled with garbage. Therefore the program must force the developer to either clear those fields or use a practice of blanket clearing objects with something like bzero or memset.

If memory is reserved for internal use there is no problem. Just overwrite it--they were warned.

Note that a client can only clear bytes it knows about. If a struct has two lights and a new update brings it to four lights, then an old client will only properly initialize two of those to zero. So you can never add new reserved fields and also enforce that those fields do not contain junk memory. An exception to this is if the record is versioned, so we can tell which fields the client could have known about and initialized correctly.
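A rough sketch of the versioned exception, using the lights example. The lights_t record, its size field, and lights_import are all hypothetical; the idea is that the library only trusts the bytes the client claims to know about and zeroes the rest itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: version 1 shipped with two lights, a later
 * release appended two more. */
typedef struct lights {
    size_t size;      /* client sets this to its own sizeof(lights_t) */
    int light[4];     /* only light[0] and light[1] existed in version 1 */
} lights_t;

/* Size of the record as version 1 clients understand it. */
#define LIGHTS_V1_SIZE (offsetof(lights_t, light) + 2 * sizeof(int))

/* Copies a client record into a fully initialized current record.
 * Returns 0 on success, -1 if the size is not one we recognise. */
int lights_import(lights_t* out, const void* client) {
    size_t claimed;
    memcpy(&claimed, client, sizeof claimed);
    if (claimed != LIGHTS_V1_SIZE && claimed != sizeof(lights_t)) {
        return -1;
    }
    memset(out, 0, sizeof *out);   /* fields the client never saw are zero */
    memcpy(out, client, claimed);  /* never read past what the client owns */
    out->size = sizeof *out;
    return 0;
}
```

An old client's garbage can never leak into light[2] or light[3] because those bytes are never read from its memory.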

Versioned Records

Implied Version

  • Just take sizeof(thing_t) as the version
  • Microsoft has done this for 20+ years. The x.version = sizeof(thing_t) mantra is easy.

typedef struct _thing {
    int version;
} thing_t;

thing_t something;
something.version = sizeof(thing_t);
do_it_live(&something);

Explicit Version

  • Have a version field and always set it to the latest version (ex. a constant THING_CURRENT.)
  • Full freedom to remove and replace old fields.
  • Have to keep copies of old structures around forever though.

typedef struct _thing {
    int version;
    /* current version */
} current_thing_t;

typedef struct _thing_v2 {
    int version;
    /* fields when we changed some stuff */
} thing_v2_t;

typedef struct _thing_v1 {
    int version;
    /* fields from version one */
} thing_v1_t;

Entry points have to switch on the version, then either dispatch to the code path that understands that particular record (such as thing_v1_t) or perform an upgrade procedure that brings the old version up to current.
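The upgrade route might look something like the following. The port and backlog fields, the THING_V1 and THING_V2 constants, and the default of 16 are all invented for the sketch; only the shape (switch on version, copy forward, fill defaults) is the point.

```c
#define THING_V1 1
#define THING_V2 2
#define THING_CURRENT THING_V2

typedef struct thing_v1 {
    int version;
    int port;
} thing_v1_t;

typedef struct thing_v2 {
    int version;
    int port;
    int backlog;   /* added in version two */
} thing_v2_t;

typedef thing_v2_t current_thing_t;

/* Upgrades any known version into `out`. Returns 0 on success and -1
 * for versions this build has never heard of. */
int thing_upgrade(current_thing_t* out, const void* in) {
    /* Every version starts with the same `int version` field. */
    int version = *(const int*)in;
    switch (version) {
    case THING_V1: {
        const thing_v1_t* old = in;
        out->version = THING_CURRENT;
        out->port = old->port;
        out->backlog = 16;  /* default picked on behalf of old callers */
        return 0;
    }
    case THING_V2:
        *out = *(const thing_v2_t*)in;
        return 0;
    default:
        return -1;
    }
}
```

The rest of the library then only ever deals with current_thing_t, which keeps the old structs confined to this one function.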

Getter/Setters

Get/set routines operate on a pointer or handle to an object. Typically they have a set type they operate on. There is also an enum which names each property which can be accessed through this generic interface.

void set_str(thing_t* self, int property, const char* neu);
const char* get_str(thing_t* self, int property);

Accessors are able to perform validation against the current version of an object. If a field has changed, the accessor can upgrade the value where possible and otherwise fail, silently or loudly.

A faceplate mechanism can be used to make these less obnoxious. For example:

static inline void thingbuilder_set_port(thingbuilder_t* self, int port) {
    thingbuilder_set_int(self, tbPort, port);
}
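For completeness, here is one way the generic accessor behind that faceplate could be written. The struct layout, the tbBacklog property, the validation ranges, and the int return codes are assumptions made for this sketch.

```c
/* Hypothetical property names for the builder. */
enum { tbPort = 1, tbBacklog = 2 };

typedef struct thingbuilder {
    int port;
    int backlog;
} thingbuilder_t;

/* Generic integer setter. Returns 0 on success and -1 if the property
 * is unknown or the value fails validation. */
int thingbuilder_set_int(thingbuilder_t* self, int property, int value) {
    switch (property) {
    case tbPort:
        if (value < 1 || value > 65535) return -1;
        self->port = value;
        return 0;
    case tbBacklog:
        if (value < 0) return -1;
        self->backlog = value;
        return 0;
    default:
        return -1;  /* a property this build does not know */
    }
}

/* Generic integer getter; writes the value through `out`. */
int thingbuilder_get_int(const thingbuilder_t* self, int property, int* out) {
    switch (property) {
    case tbPort:    *out = self->port;    return 0;
    case tbBacklog: *out = self->backlog; return 0;
    default:        return -1;
    }
}
```

Because callers name properties by number, new properties can be added later without touching the signature of either function.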

Factories

Factories are objects which create your objects. They are useful in a number of circumstances:

  • When your language is a poorly designed trash fire (chiefly C++), and compilers cannot agree on how creating an object is supposed to work.
  • When creating an object may require or allow additional context now or in the future.
  • When object creation needs to support versioning.

In some cases this is known as the Builder Pattern. Here is an example of a factory using the builder pattern:

thing_builder_t tb;
tb_start(&tb);
tb_set_int(&tb, tbPort, 8080); /* using a generic setter */

thing_t* obj = thing_create(&tb);
/* fun stuff */
thing_destroy(obj);

Changing the parameters to a function can be a tricky proposition. However, you can apply the rest of the immortal ABI doctrine to the factory, since it is just an object with its own functions that can be added or removed without breaking anything.

We also have a tb_start function defined here. That could do any number of things:

  • Embed the size of thing_builder_t as we understand it at compile time (the Microsoft method)
  • Embed a serial number as we understand it at compile time
  • Embed a string identifier that we knew at compile time
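The first option can be sketched with a macro faceplate. The tb_start_sized function and the size field are invented names; the trick is only that sizeof gets evaluated in the client's compile, not the library's.

```c
#include <stddef.h>
#include <string.h>

typedef struct thing_builder {
    size_t size;   /* sizeof(thing_builder_t) as the client compiled it */
    int port;
} thing_builder_t;

/* The entry point the library actually exports (hypothetical name). */
void tb_start_sized(thing_builder_t* tb, size_t client_size) {
    /* Clear exactly the bytes the client owns, then record how many
     * there were so later calls know which fields are safe to touch. */
    memset(tb, 0, client_size);
    tb->size = client_size;
}

/* Faceplate: expands in the client's translation unit, so sizeof is
 * computed from the client's view of the struct. */
#define tb_start(tb) tb_start_sized((tb), sizeof(thing_builder_t))
```

Used as tb_start(&tb). A serial number or string identifier would be embedded the same way, as an extra argument hidden behind the macro.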

Case Study: CLAP

The CLAP audio standard makes heavy use of the factory pattern and string identifiers. Factories are also used to produce other factories, with the exception of the "main" entry point that kicks off access to the API.

It works a bit like this:

  1. You call an entry point function like clap_init
  2. You provide a string representing the version of the API your program speaks
  3. It returns nil or a reference to the plugin factory
  4. Further features repeat steps 2 and 3, but are requests made against the plugin factory instead.

Here is a trivial example in C:

const notclap_t* api = notclap_entry("com.example.not-clap/1");
if (api == NULL) goto failed;

const notclap_cat_petter_t* petter =
    api->get_plugin(api, "com.example.not-clap.cat-petter/1");
if (petter == NULL) goto failed;
/* ... */

return 0;
failed:
fprintf(stderr, "requested interface is not available\n");
return 1;

This looks a bit verbose because we are using C code. Better languages (Nim) allow you to pretty this up.
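The other side of that conversation, the code that answers the string requests, could be sketched like this. It is the made-up "notclap" API from the example, not CLAP itself; the struct layouts and the pet function are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* A feature interface handed out by the factory. */
typedef struct notclap_cat_petter {
    int (*pet)(int times);   /* returns how many pets were delivered */
} notclap_cat_petter_t;

/* The root factory returned by the entry point. */
typedef struct notclap {
    const void* (*get_plugin)(const struct notclap* self, const char* id);
} notclap_t;

static int pet_impl(int times) { return times; }

static const notclap_cat_petter_t cat_petter_v1 = { pet_impl };

static const void* get_plugin_impl(const notclap_t* self, const char* id) {
    (void)self;
    if (strcmp(id, "com.example.not-clap.cat-petter/1") == 0) {
        return &cat_petter_v1;
    }
    return NULL;   /* unknown interface, or a version we do not speak */
}

static const notclap_t notclap_v1 = { get_plugin_impl };

/* Entry point: the only symbol a client has to link against. */
const notclap_t* notclap_entry(const char* api_id) {
    if (strcmp(api_id, "com.example.not-clap/1") == 0) {
        return &notclap_v1;
    }
    return NULL;
}
```

Supporting a "/2" of either identifier later is just another strcmp branch that returns a different record, while "/1" callers keep getting the old one.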

Handles

Handles abstract the pointer over memory or network boundaries.

thingbuilder_t tb;
tb.version = sizeof(thingbuilder_t);
tb_set_port(&tb, 8080);
int t = thing_create(&tb);
/* do things */
thing_destroy(t);

This completely separates the client from details of the object it is working with. Sometimes this has been used by C++ libraries that completely hide all object semantics (which are compiler-specific) and expose a handle interface via a C ABI (which is stable.)

This abstraction is used extensively by OpenGL and other Khronos APIs.
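Behind the scenes a handle is often just an index into a table the library owns. A deliberately tiny sketch follows; the fixed table size, the 1-based handles, and thing_create taking a bare port instead of a builder are simplifications made for the example.

```c
#include <stddef.h>

#define MAX_THINGS 8

/* The real object; clients never see this layout. */
typedef struct thing {
    int port;
    int in_use;
} thing_t;

static thing_t things[MAX_THINGS];

/* Returns a handle, or 0 when the table is full. Handles are 1-based
 * so that 0 can mean "no object". */
int thing_create(int port) {
    for (int i = 0; i < MAX_THINGS; i++) {
        if (!things[i].in_use) {
            things[i].in_use = 1;
            things[i].port = port;
            return i + 1;
        }
    }
    return 0;
}

/* Every entry point validates the handle before touching memory. */
static thing_t* thing_lookup(int handle) {
    if (handle < 1 || handle > MAX_THINGS) return NULL;
    thing_t* t = &things[handle - 1];
    return t->in_use ? t : NULL;
}

/* Returns the port, or -1 for a stale or bogus handle. */
int thing_port(int handle) {
    thing_t* t = thing_lookup(handle);
    return t ? t->port : -1;
}

void thing_destroy(int handle) {
    thing_t* t = thing_lookup(handle);
    if (t) t->in_use = 0;
}
```

Since the client only holds an int, thing_t can grow, shrink, or move to another process without any client being recompiled.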

Multi-threading

The exact value of a handle is undefined. However, when multiple threads are taking and returning handles at the same time, managing the handles can become a point of contention.

Our Machinery suggested a method where the total handle pool is split some number of ways, such as one split per CPU. Each processor then pulls from its specific pool. That way the majority of handle activity stays local to that processor.

Similar tricks have been used by many memory allocation systems; pages set aside for particular threads so contention only happens where specific handoffs have to occur.
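A single-threaded sketch of the split-pool idea, to show how a handle can remember which pool it came from. This is not Our Machinery's code; the shard count, the bit layout, and the function names are assumptions, and a real version would pick shard_id from the calling thread or CPU and add locking or atomics for handles returned by a different thread.

```c
#include <stdint.h>

#define SHARD_COUNT 4
#define SHARD_CAPACITY 1024

typedef struct shard {
    uint8_t used[SHARD_CAPACITY]; /* 1 when the slot is taken */
    uint32_t next;                /* where to start searching */
} shard_t;

static shard_t shards[SHARD_COUNT];

/* Takes a handle from the pool owned by `shard_id`.
 * Returns 0 when that shard is exhausted. */
uint32_t handle_take(uint32_t shard_id) {
    uint32_t id = shard_id % SHARD_COUNT;
    shard_t* s = &shards[id];
    for (uint32_t n = 0; n < SHARD_CAPACITY; n++) {
        uint32_t slot = (s->next + n) % SHARD_CAPACITY;
        if (!s->used[slot]) {
            s->used[slot] = 1;
            s->next = slot + 1;
            /* shard in the high bits, slot + 1 in the low 16 bits */
            return (id << 16) | (slot + 1);
        }
    }
    return 0;
}

/* Returns a handle to the shard it was taken from. */
void handle_give(uint32_t handle) {
    uint32_t id = handle >> 16;
    uint32_t slot = (handle & 0xFFFFu) - 1; /* wraps for handle 0 */
    if (id < SHARD_COUNT && slot < SHARD_CAPACITY) {
        shards[id].used[slot] = 0;
    }
}
```

Clients still treat the value as opaque; the shard bits are a private detail the library is free to change.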

Interface Records

An interface record holds function pointers as its fields. Basically the entire interface is specified through function pointers in a struct and called through them, rather than being declared in a header file and called directly.

Since specific functions have to be placed inside the fields there are opportunities here:

  • Functions a client is not going to be allowed to call can be left nil.
  • An old client may get different functions that include workarounds or upgrade paths.
  • Functions can also be "wrapped." This is similar to Emacs Lisp's defadvice or Python decorators, where a normal function call is wrapped within some custom behavior before and after the underlying call.

typedef struct _thingi {
    int version;
    void (*boop)(thing_t*, const char*);
} thing_i;

c.boop(t, "blobcat");

The interface is itself a versioned record which can rely on version tags or simple append-only design.

Our Machinery relied on an append-only design for plugins. Old plugins can only call older endpoints in the interface while newer ones have access to the longer interface. Since this record only holds function pointers there is no issue with the older versions being smaller.
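Here is a sketch of how a caller might cope with an append-only record whose length it cannot assume. It uses a size field as the implied version; the snoot function, the THING_I_HAS macro, and greet are all invented for the example and are not how Our Machinery spelled it.

```c
#include <stddef.h>

typedef struct thing thing_t;   /* opaque to the caller */

typedef struct thing_i {
    size_t size;  /* bytes of this record that the provider filled in */
    void (*boop)(thing_t*, const char*);
    void (*snoot)(thing_t*);    /* appended in a later release */
} thing_i;

/* True when the record is long enough to contain `member` and the
 * provider actually supplied it. The size test runs first, so the
 * member is never read from a record that is too short. */
#define THING_I_HAS(iface, member) \
    ((iface)->size >= offsetof(thing_i, member) + sizeof((iface)->member) \
     && (iface)->member != NULL)

/* Stand-in implementations so the sketch can be exercised. */
static int boops = 0;
static int snoots = 0;
static void boop_impl(thing_t* t, const char* what) { (void)t; (void)what; boops++; }
static void snoot_impl(thing_t* t) { (void)t; snoots++; }

/* Prefers the newer call, falls back to the older one.
 * Returns 1 if snoot was used and 0 if boop was. */
int greet(const thing_i* iface, thing_t* t) {
    if (THING_I_HAS(iface, snoot)) {
        iface->snoot(t);
        return 1;
    }
    iface->boop(t, "blobcat");
    return 0;
}
```

New functions only ever go at the end of thing_i, so the offsets of boop and everything before it never move.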

LV2 relies on a manifest file to define what capabilities a plugin has. Each capability is then an interface record supplied once the host knows it should provide one. Plugins then use the interface records to coordinate how the host should communicate with them again to synthesize sound.

Manifests

Some external unit of data specifies what is required to run a module.

LV2 uses an RDF Graph to relate properties and features of a plugin to a host. The manifest specifies what kinds of behaviors (capabilities) the module expects and supports. The manifest also specifies how those features are configured. A host reads the manifest to figure out what interfaces to use to interact with the plugin.

Android OS has similar functionality. XML documents specify an API version and what permissions are required of a program. The manifest is used to determine what compatibility modules should be loaded. For example, security systems will have to behave in ways that shim old un-updated programs into working with newer concepts.
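To make the idea concrete without dragging in RDF or XML, here is a sketch of a host deciding what to do with a manifest that has already been parsed into a struct. None of this mirrors LV2 or Android; the fields, capability bits, and host_plan function are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits a module can ask for. */
enum { CAP_AUDIO = 1 << 0, CAP_MIDI = 1 << 1, CAP_STATE = 1 << 2 };

/* What the module declares about itself, outside of its code. */
typedef struct manifest {
    int api_level;      /* API level the module was written against */
    uint32_t required;  /* capabilities it cannot run without */
    uint32_t optional;  /* capabilities it will use if offered */
} manifest_t;

/* What the host decides after reading the manifest. */
typedef struct load_plan {
    bool loadable;
    bool needs_compat_shim;
    uint32_t granted;
} load_plan_t;

#define HOST_API_LEVEL 3
#define HOST_CAPS ((uint32_t)(CAP_AUDIO | CAP_STATE))

load_plan_t host_plan(const manifest_t* m) {
    load_plan_t plan = { false, false, 0 };
    /* A module newer than the host cannot be served. */
    if (m->api_level > HOST_API_LEVEL) return plan;
    /* Every required capability must be available. */
    if ((m->required & HOST_CAPS) != m->required) return plan;
    plan.loadable = true;
    /* Older modules get routed through compatibility code. */
    plan.needs_compat_shim = m->api_level < HOST_API_LEVEL;
    plan.granted = m->required | (m->optional & HOST_CAPS);
    return plan;
}
```

The host makes all of these decisions before running a single line of the module's code, which is the main appeal of keeping the manifest external.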

Versioned Symbols

Windows has a trick where ASCII and Unicode versions of functions have particular suffixes (A and W). Macros then replace do_something with the underlying do_something_ascii or do_something_unicode.

Unlike records, a function is exposed to the outside world and its name may be in use for all time. So we may end up stuck, as Microsoft was, having to add silly Ex or "2" versions of functions.

One option is to always include a version number in every function name and use a faceplate to remove it. So do_something is what we see in a program, but it has actually been redefined or symbol exported as do_something_v1.
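A sketch of that faceplate with plain macros. THING_API_LEVEL is a hypothetical knob a client could define before including the header to pin itself to an older level, and the function bodies are placeholders.

```c
/* Exported symbols always carry their version. */
int do_something_v1(int x) { return x + 1; }
int do_something_v2(int x, int step) { return x + step; }

/* Faceplate: the SDK header decides what the friendly name means. */
#ifndef THING_API_LEVEL
#  define THING_API_LEVEL 2
#endif

#if THING_API_LEVEL >= 2
#  define do_something(x, step) do_something_v2((x), (step))
#else
#  define do_something(x) do_something_v1((x))
#endif
```

Old binaries keep resolving do_something_v1 in the symbol table, while new compiles that write do_something silently get do_something_v2.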

Another option is to mangle all symbols to some generated identifier, such as with NanoID, so our symbols have names like FnLZfXLFzPPR4NNrgjlWDxn while we program with the friendly names.

When we perform a major API break we can reclaim old identifiers and constants. We just point all of the friendly names to new IDs while the old IDs are left in maintenance mode or become compatibility functions. Symbol tables will look like ass but those are not user-facing anyway.