Tree Compiler-Compiler

Created on 2023-10-04T17:31:18-05:00

Return to the Index

This card pertains to a resource available on the internet.

This card can also be read via Gemini.

Node: a container that has a type and some content fields.

Enums: a container has a type but no content fields.

A toolkit that seeks to solve the problem of making sure complex tree processors missing cases when the definition of a tree is changed. A special syntax allows writing implementation of a function with light pattern matching, checking that all cases are covered by the patterns, then combining the rules together in a unified interface.

If you create an enum with values Red, Green, and Blue, then define a "to_hex" behavior which accepts a color, you must then give a rule that covers each option and returns a hex code. If Magenta is later added to the enum the compiler will fail if there is neither a rule for Magenta nor some generic equivalence.

If a node/enum "inherits" from another node/enum, then a specific rule is allowed to be missing. The node/enum will be degraded to its parent equivalence until a rule to handle it can be found.

Operations and Triggers

An "operation" is defined with %operation. This includes the signature of the function and its parameters. Parameters might be abstract such as the base type of all nodes.

Triggers match the operation name to specifics about the parameters. This can be exact values of enums, or the type codes of nodes.

The compiler checks that all supplied trigger conditions cover all required elements. When a new element is added, the compiler will point out every location where a required consideration has not been made.

Example

A. Full expression example code

The full treecc input file for the expression example is as follows:

%enum type_code =
{
    int_type,
    float_type
}

%node expression %abstract %typedef =
{
    %nocreate type_code type = {int_type};
}

%node binary expression %abstract =
{
    expression *expr1;
    expression *expr2;
}

%node unary expression %abstract =
{
    expression *expr;
}

%node intnum expression =
{
    int num;
}

%node floatnum expression =
{
    float num;
}

%node plus binary
%node minus binary
%node multiply binary
%node divide binary
%node power binary
%node negate unary

%operation void infer_type(expression *e)

infer_type(binary)
{
    infer_type(e->expr1);
    infer_type(e->expr2);

    if(e->expr1->type == float_type || e->expr2->type == float_type)
    {
        e->type = float_type;
    }
    else
    {
        e->type = int_type;
    }
}

infer_type(unary)
{
    infer_type(e->expr);
    e->type = e->expr->type;
}

infer_type(intnum)
{
    e->type = int_type;
}

infer_type(floatnum)
{
    e->type = float_type;
}

infer_type(power)
{
    infer_type(e->expr1);
    infer_type(e->expr2);

    if(e->expr2->type != int_type)
    {
        error("second argument to `^' is not an integer");
    }

    e->type = e->expr1->type;
}

The full yacc grammar is as follows:

%union {
    expression *node;
    int         inum;
    float       fnum;
}

%token INT FLOAT

%type  expr
%type  INT
%type  FLOAT

%%

expr: INT               { $$ = intnum_create($1); }
    | FLOAT             { $$ = floatnum_create($1); }
    | '(' expr ')'      { $$ = $2; }
    | expr '+' expr     { $$ = plus_create($1, $3); }
    | expr '-' expr     { $$ = minus_create($1, $3); }
    | expr '*' expr     { $$ = multiply_create($1, $3); }
    | expr '/' expr     { $$ = divide_create($1, $3); }
    | expr '^' expr     { $$ = power_create($1, $3); }
    | '-' expr          { $$ = negate_create($2); }
    ;