Skip to content
Philip Herron edited this page Feb 22, 2021 · 6 revisions

Name resolution

Consider the following simple rust program:

fn test(a :i32) -> i32 {
   a + 1
}

fn main() { 
    let a;
    a = 1;
    
    let b;
    b = a;
    
    let c = test(b);
    
    let a;
    a = 1f32;
}

When this gets compiled there are 2 passes within the resolved code, top level and late. Toplevel is for scanning all of the top-level items into the relevant ROOT ribs. This allows for the lookup of functions declared after their use in this case if the test was declared after main. Late is all about delving into structs and functions and resolving all the types and names used in the blocks recursively.

In rust we use NodeIds thought the AST allows for side table lookups for every node within the IR. This is important as it keeps the AST immutable and forces API's to be designed to make this simpler. Usually in compilers name resolution is done via a stack of contexts with maps of strings. This is not suitable for the type of system that rust requires.

When we assign ids and have lookup tables everything becomes reusable at all stages of the compiler and moves towards a more query based compilation system (GCCRS is not fully there yet).

Resolving names and IDs

If you follow the above pice of code it is lexically very simple to follow so lets break this down into what the compiler see's this as:

`

           <b id=11>
           <b ref=11 id=12>
           <a ref=9 id=13>
       
           <c id=14>
           <test ref=1 id=15>
       
           <a id=16> // shadowing just becomes a new id
           <a ref=16 id=17>

`

Here you can see each block is broken down into an appropriate level of nesting for each block, each declaration be it a function, parameter or var decl has an ID in the real code everything has an ID but ignore that for now. Note the name resolution dump is not implemented yet and is an open GSOC project.

When these mappings are built up we end up with a bunch of side table maps and lookups such that each rib can be looked up at any time.

`root-rib id=ROOT_RIB_ID { decls: [, ] references: [ <id=1>: [id=15], <id=6>: [] ]

test-rib id=1 {
    test-params-rib {
        decls: [<param a id=3>],
        references: [
            <id=3>: [id=5]
        ]
        
        test-block-rib {
            decls: [],
            references: []
        }
    }
}

main-rib id=6 {
    main-param-rib {
        decls: [],
        references: []
        
        main-block-rib {
            decls: [<var a id=9>, <var b id=11>, <var c id=14>, <var a id=16>]
            references:[
                <id=9>: [id=10, id=13],
                <id=11>: [id=12],
                <id=14>: [],
                <id=16>: [id=17]
            ]
        }
    }
}

} `

The important piece here is that for every BlockExpr on a function for example compilers need to know all of the locals associated with this block in order for it to be able to build the correct stack frame information. https://en.wikipedia.org/wiki/Call_stack