Skip to content

Memory Model

Felix Guo edited this page Jul 11, 2017 · 1 revision

The memory model's data structures can all be found in memory.h.

In WendyScript, memory operates on blocks of data called tokens (coincidentally the same type of tokens used during the scanning process). These tokens are defined in token.h and encompass both lexicographic tokens as well as memory tokens (should probably change that and distinguish between the two, but I'm too far in at this point...).

Tokens include a token type, as well as a data component. The data component is stores either a 1024 max character string, or a number of the double type. Future support for bignum will be coming.

The memory module consists of three primary parts:

  1. Generic Memory (List of Tokens)
  2. Call Stack (List of Stack Entries)
  3. Closure Stack (List of List of Stack Entries)

Generic Memory

Generic memory is split up into two locations. The last 128 tokens in the generic memory space is assigned to be the operational stack of the VM. push, pop, etc, instructions of the VM operate within this memory space.

The rest of the tokens are used in a heap manner. The VM requests memory blocks and are granted based on availability. VM memory addresses directly correspond to the index in this memory space. When the memory is full and the module cannot find empty spaces, a simple mark and sweep garbage collection will occur, clearing up any unused blocks. If even after GC, no available blocks are found, an out of memory error will be thrown.

Call Stack

Each stack entry consists of a identifier bound to an address. The memory module also keeps track of the stack_pointer, which points to the next available call_stack location, and a frame_pointer which points to the header of each frame. The header of each frame is also a stack_entry, with the address being the last frame_pointer as pointed to. Let's imagine we have this code:

let fn => (x) x * 2
fn(10)

The call stack initially looks like this:

INDEX   IDENTIFIER      ADDRESS     STACKPTR    FRAMEPTR
========================================================
0       main            0                       *
1                                   *

After the first line:

INDEX   IDENTIFIER      ADDRESS     STACKPTR    FRAMEPTR
========================================================
0       main            0                       *
1       fn              5           
2                                   *

where 5 refers to the address of the FUNCTION element in generic memory (might not be actually in address 5). After calling fn(10) before returning 10 * 2, the call stack looks like this:

INDEX   IDENTIFIER      ADDRESS     STACKPTR    FRAMEPTR
========================================================
0       main            0                       
1       fn              5           
2       >fn             0                       *           
3       #R/A            10          
4                                   *

where 10 refers to the next instruction after returning from the function, and 0 in >fn refers to the previous frame pointer, which was at 0.

When the function returns, the value of 10 * 2 = 20 will be pushed onto the stack, the instruction pointer will be set to 10, the stack pointer set to the frame pointer, and the frame pointer set to 0.

INDEX   IDENTIFIER      ADDRESS     STACKPTR    FRAMEPTR
========================================================
0       main            0                       *
1       fn              5           
2       >fn             0           *                      
3       #R/A            10          
4                                   

Memory is not cleared upon the return of a function, since future changes to the stack will overwrite existing data.

Closure Stack

The closure stack stores references to lists of stack entries created during the creation of a function. When a function is created, it is stored both with the address to the first instruction, as well as a pointer to one of these lists.

These lists represent the state of all variables excluding globals when the function was created. When the function is called, the variables in the closure are pushed onto the call stack in an "invisible" frame one above the function's operating frame. Therefore, a function can access local variables declared in it's parent scope even after the parent scope has gone out of scope. This operation works very similarly to Javascript Closures. Consider this code:

let makeadd => (adder) #:(addend) adder + addend
let add10 = makeadd(10)
add10(20)

When the function makeadd(10) is called, the local variable adder is assigned the value 10. When the local function #:(addend)... is created, a closure is created containing the mapping adder -> 10 and is stored with the function. Since the function is lazy-evaluated, when the makeadd function returns, the reference to adder is lost. When add10(20) is finally called, upon entering the execution scope of the function, the closure mapping is unwrapped, thereby creating a reference adder -> 10 in the stack frame of the function. This means that returning adder + addend will not throw an error (since without the closure, adder cannot be found).

Since closures are unique to seperate calls to the same parent function, suppose we added this:

let add20 = makeadd(20)
add20(20)

A different closure will be created when the local function #:(addend)... is created for the second time, this time mapping adder -> 20 in the closure. This in turn means that invoking add20(20) will return 40.

Clone this wiki locally