Skip to content

Compilation: Transformation

Matt Basta edited this page Jan 18, 2015 · 1 revision

One of the most important parts of the BType compilation process is the transformation step. In order to be able to perform high-level optimizations on the code that's passed to the compiler, the code must be transformed into a very specific shape: there must be no nested functions whatsoever.

In order to achieve this, multiple complex considerations need to be made:

  • Functions that access lexical scope need to accept an object as a parameter that contains the variables that they access lexically. This object is known as the function context, which is of the funcctx type,
  • Functions that have their variables accessed lexically should move those variables into a custom object type. That object should be instantiated at the start of the function and passed to each sub-function that accesses lexical scope (even if the function is nested multiple levels deep). Of course, all references to the original variables must be changed into member expressions.
  • Lexically accessed variables must be differentiated from global variables. Global variables are allowed and should not be modified.

The exception to this case exists for functions that are accessed in a first-class way:

func int:foo(bool:x, int:y) {
    var first = func<int>() {return y + 2;};
    var second = func<int>() {return y * 3;};
    func<bool>:which = null;
    if (x) {
        which = first;
    } else {
        which = second;
    }
    return which();
}

The following function, when passed true or false for x, will return a modified version of y. Which modification is used (y + 2 or y * 3) is decided dynamically.

The reason this is hard from a compilation process is that y is accessed lexically by first and second, but there is no direct call to first or second. Without using static analysis, there's no way of knowing what variable contains a reference to either of those functions. In a more complicated example, a reference to either of those functions that escapes foo could have no function context object associated with it at all.

In order to make this work, we need to make sure a function context object is assigned to each reference as soon as possible. That's achieved by using a FunctionReference AST node. Its base member can be contain any Function node. It also contains a ctx member, which contains a reference to the function context object of the parent function.

When the nested functions are uplifted to the global scope (to satisfy the "no nested functions" requirement), the base member is replaced with a symbol referencing the function.

The thinking is that when the compiler encounters a FunctionContext node, it looks up the function associated with it. At that point, the compiler can either create a direct reference (if the compile target supports first-class functions) or create a more complicated system (like function tables in asm.js). In either case, if the ctx member is not null, it will be bound to the last parameter of the referenced function.

In the asm.js compile target, this is achieved by creating a function reference object. This object is composed of two 32-bit integers: one to store the index of the referenced function in the associated function table and another to store a pointer to the function context object. The function reference object is passed to a special trigger function that calls the target function in the function table.

Clone this wiki locally