Skip to content

Implementing ML like language

Alex edited this page Mar 12, 2018 · 12 revisions

Simon Peyton Jones's book The Implementation of Functional Programming Languages is worth reading, it discusses implementing an ML-like programming language, compiling it to lambda calculus, then discusses optimizations. CAML Light realizes aspects of the book, compiling to the Categorical Abstract Machine (CAM). It might be worth while to look at this in greater detail.

Algebraic Data Types

How it seems to work under the hood is:

data Tree = Empty
          | Leaf Int
          | Node Tree Tree

depth :: Tree -> Int
depth Empty = 0
depth (Leaf n) = 1
depth (Node l r) = 1 + max (depth l) (depth r)

Loosely speaking, this is "the same" as the following code:

typedef struct tree tree;

enum {Tree_Empty, Tree_Leaf, Tree_Node} Tree_constructor;

typedef struct {} Empty;
typedef struct { int leaf; } Leaf;
typedef struct { tree *t1; tree *t2; } Node;

struct Tree {
    enum Tree_constructor constructor;
    union {
        Empty *empty;
        Leaf *leaf;
        Node *node;
    }
};

int depth(struct Tree *tree) {
    switch(tree->constructor) {
        case Tree_Empty: return 0;
        case Tree_Leaf:  return 1;
        case Tree_Node:  return 1 + max(depth(tree->t1), depth(tree->t2));
    }
}

Or something like that. Also see:

Abstract Machines

Apparently, the approach many languages take is to implement an "abstract machine" like an extension of the SECD, and compile down to that. GHC's abstract machine is the STG — spineless Tagless G-Machine.

Haskell's Virtual Machine: The STG

In its bare essentials, the STG machine consists of three parts:

  1. The STG registers:
    • There are rather a lot of registers here: more than can be practicably stored in actual available processor registers on most architectures.
    • To deal with the lack of processor registers, most of the STG registers are actually kept on the stack in a block of memory pointed to by a special STG register called the "base register" (or BaseReg). To get or set values of registers which are not kept in processor registers, the STG machine generates an instruction to load or store from an address relative to the BaseReg.
    • The most important four registers are the BaseReg, the stack pointer (Sp), the heap pointer (Hp), and the general purpose register R1 which is used for intermediate values, as well as for returning evaluated values when unwinding the stack. These are the four registers which are assigned actual processor registers when implementing the STG machine on x86.
  2. The STG stack:
    • Stores function arguments and continuations (i.e. the stack frames which are executed when a function returns)
    • Grows downwards in memory
    • The top of the stack is pointed to by the STG register Sp, and the maximum available stack pointer is stored in SpLim. There is no frame pointer.
  3. The heap:
    • Used to store many different sorts of heap object: notably functions, thunks and data constructors
    • Grows upwards in memory, towards the stack
    • All allocation occurs using a bump-allocator: the heap pointer is simply incremented by the number of bytes desired (subject to to a check that this does not exhaust available memory). The garbage collector is responsible for moving objects out of the area of the heap managed by the bump allocator and into the care of its generational collector.
    • The last address in the bump-allocated part of the heap that has been used is pointed to by the STG register Hp, with HpLim holding the maximum address available for bump-allocation.

This actually poses a challenge when using the LLVM as a backend for the GHC (from blogpost):

GHC defines an abstract machine that implements the execution model for Haskell code. This is called the 'STG Machine' (or Spineless Tagless G-Machine) and its job is to evaluate the final, functional IR used by GHC, STG. The STG machine consists of three main parts, registers, a stack and a heap. For this stack, GHC doesn't use the standard C stack but implements its own. What we are concerned with though is just how the registers are implemented. The easiest method is to just store them all in memory as a structure and indeed GHC supports this method (its refereed to as 'unregistered mode' and is used for easier porting of GHC). However because of how often they are accessed a far more efficient way to implement them is to map them onto real hardware registers, which roughly halves the runtime of a typical Haskell program. So this is what GHC does, although as there are far too many STG machine registers to map onto real registers, it has to still store some of them in memory.

This is a problem for the LLVM backend though as it has no control over the register allocation. We can still create a working backend by only supporting 'unregistered mode' but this isn't very useful due to the poor performance. Also we aren't just focused on performance, compatibility with the other backends is a major concern. We need to support the same register mapping as they use so that Haskell code compiled by LLVM will be able to link with code compiled by the other backends. Lets look quickly at how the other backends achieve this register mapping.

With the native code generator its very straight forward since it has full control over the register allocation. How about the C backend though? Typically this would be a problem for C as well since it offers no control over register allocation. Thankfully GCC offers an extension, 'Global Register Variables', which allows you to assign a global variable to always reside in a specific hardware register. GCC implements this feature basically by removing the register specified from the list of known registers that its register allocator uses.

So the solution for LLVM is a new calling convention but how does this work? Well the calling convention passes arguments in the hardware registers that GHC expects to find the STG machine registers in. So on entry to any function they're in the correct place. Unlike with the NCG or C backend this doesn't exclusively reserve the registers, so in the middle of a function we can't guarantee that the STG machine registers will still be in the hardware registers, they may have been spilled to the stack. This is fine, in fact its an improvement. It allows LLVM to generate the most efficient register allocation, having more registers and flexibility than the other backends, while still maintaining compatibility with them since on any function entry the STG machine registers are guaranteed to be in the correct hardware registers.

Reading List

Some random links on stuff related to Haskell GHC-implementation-details.

Apparently there is a core component to GHC, which is some flavor of typed lambda calculus (System FC, a proper superset of system F). See:

A random interesting read to think about is the LLVM Language Reference.

Clone this wiki locally