[NativeAOT-LLVM] "Semi-precise" GC with GCInfo sentinels #3140

@SingleAccretion

There are two main reasons we use conservative GC:

  1. We don't have a cheap mechanism to encode GCInfo (since there is no virtual unwinding).
  2. It makes for smaller code at the expense of GC performance (since we don't need to reload references in LSSA).

Reason 2 seems like a sensible tradeoff: code size is a big priority for WASM. However, for it to work, we don't need a truly conservative GC; we just need to report all slots as pinned. So if we found a solution to reason 1, we could stop relying on the largely untested conservative mode. That would be a correctness win only: the performance characteristics would stay the same.

One idea I had for reason 1, and which this issue is about, is to use the fact that the address spaces of GC objects and "other things" are disjoint. In this scheme, we would differentiate between "simple" frames, where the shadow stack content is just an array of references (and maybe byrefs, see below), and "complex" frames, where we have some non-GC slots (e.g. for structs with a mix of reference/non-reference fields). Iterating over the shadow stack would then look like this:

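void** p = m_pShadowStackTop;
while (p >= m_pShadowStackBottom) {
    // Complex frames allocate the GCInfo at their logical bottom (like the current precise virtual unwinding).
    // The sentinel is the value stored in the slot, so test *p rather than the slot address.
    if (IsGCInfo(*p)) {
        // Reports the frame's GC slots and returns the next slot to scan.
        p = ProcessComplexFrame(p);
        continue;
    }

    // Simple frames don't have any GCInfo.
    ReportGCRef(p);
    p--;
}

// Returns true if the value points into the static GCInfo region.
bool IsGCInfo(void* p) {
    return g_gcInfoLow <= p && p < g_gcInfoHigh;
}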
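
To make the loop above concrete, here is a minimal self-contained simulation of it. Everything about the GCInfo encoding is an assumption made for illustration: the `GCInfo` struct (a slot count plus a reference bitmask), the choice to store the sentinel in the slot of a complex frame that the scan reaches first, and the helper names `ScanShadowStack` and `DemoScan` are hypothetical and not part of the proposal. The shadow stack is modeled as a vector indexed from the bottom, and the scan returns the indices of the slots it would report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical GCInfo encoding: the number of payload slots in the frame and a
// bitmask where bit s set means payload slot s holds a GC reference.
struct GCInfo {
    std::uint32_t slotCount;
    std::uint32_t refMask;
};

// Stand-in for the static GCInfo region; real bounds would come from the linker.
static const GCInfo g_gcInfoTable[] = {
    {3, 0x5u},  // 3 payload slots; slots 0 and 2 are references, slot 1 is raw data
};
static const std::size_t g_gcInfoCount = sizeof(g_gcInfoTable) / sizeof(g_gcInfoTable[0]);

// True if a slot *value* points into the GCInfo region.
bool IsGCInfo(const void* value) {
    const auto v = reinterpret_cast<std::uintptr_t>(value);
    const auto low = reinterpret_cast<std::uintptr_t>(&g_gcInfoTable[0]);
    const auto high = reinterpret_cast<std::uintptr_t>(&g_gcInfoTable[0] + g_gcInfoCount);
    return low <= v && v < high;
}

// Scans from the top of the stack (last element) down to the bottom (index 0)
// and returns the indices of the slots reported as references.
std::vector<std::size_t> ScanShadowStack(const std::vector<const void*>& stack) {
    std::vector<std::size_t> reported;
    std::ptrdiff_t i = static_cast<std::ptrdiff_t>(stack.size()) - 1;
    while (i >= 0) {
        const void* value = stack[static_cast<std::size_t>(i)];
        if (IsGCInfo(value)) {
            // Complex frame: the payload slots sit directly below the sentinel (assumed layout).
            const GCInfo* info = static_cast<const GCInfo*>(value);
            const std::ptrdiff_t base = i - static_cast<std::ptrdiff_t>(info->slotCount);
            assert(base >= 0 && "complex frame must fit on the stack");
            for (std::uint32_t s = 0; s < info->slotCount; s++) {
                if (info->refMask & (1u << s)) {
                    reported.push_back(static_cast<std::size_t>(base) + s);
                }
            }
            i = base - 1;  // continue below the complex frame
            continue;
        }
        // Simple frame slot: report it unconditionally.
        reported.push_back(static_cast<std::size_t>(i));
        i--;
    }
    return reported;
}

// Builds a small stack: two simple slots, one complex frame, one simple slot on top.
std::vector<std::size_t> DemoScan() {
    static int objA, objB, objC, objD, objE;  // stand-ins for GC heap objects
    const void* raw = reinterpret_cast<const void*>(std::uintptr_t{0x1234});
    const std::vector<const void*> stack = {
        &objA, &objB,                              // indices 0-1: simple frame
        &objC, raw, &objD, &g_gcInfoTable[0],      // indices 2-5: complex frame + sentinel
        &objE,                                     // index 6: simple frame
    };
    return ScanShadowStack(stack);
}
```

In this sketch the raw value at index 3 is never reported, which is the difference from a truly conservative scan: non-GC slots in complex frames are skipped by construction rather than filtered by heap lookup.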

There are some open questions with this scheme:

  1. How can we make it work for dynamic linking, where the GCInfo areas could be discontiguous (and inside the larger "GC allocation" range)?
  2. Can we include byrefs in "simple" frames? It would make byrefs pointing to the GCInfo illegal. Is that something we can consider acceptable (with appropriate testing that the runtime itself never forms such byrefs, of course)? It would also make the reporting slower since all refs would be considered byrefs, but that's an expense we already incur.
  3. Are the wins worth it compared to just inserting GCInfo into each frame (the current precise virtual unwinding scheme)? It would depend on how many frames can be "simple".
  4. Is the benefit of not using the conservative mode large enough to pay for it in code size (and complexity), considering performance issues will remain?
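
For question 1, one possible direction (purely a sketch, not something the runtime does today) would be to replace the single `[g_gcInfoLow, g_gcInfoHigh)` check with a sorted table of ranges, one per loaded module. The `GCInfoRangeTable` class and its `Register`/`Contains` methods below are hypothetical names introduced for illustration. It assumes the ranges never overlap and that no GC object is ever allocated inside a registered range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One contiguous GCInfo area, e.g. belonging to one dynamically linked module.
struct GCInfoRange {
    std::uintptr_t low;   // inclusive
    std::uintptr_t high;  // exclusive
};

class GCInfoRangeTable {
public:
    // Hypothetical hook that would be called when a module is loaded.
    // Keeps the table sorted by the low bound; ranges are assumed not to overlap.
    void Register(std::uintptr_t low, std::uintptr_t high) {
        assert(low < high);
        const GCInfoRange r{low, high};
        auto pos = std::upper_bound(
            ranges_.begin(), ranges_.end(), r,
            [](const GCInfoRange& a, const GCInfoRange& b) { return a.low < b.low; });
        ranges_.insert(pos, r);
    }

    // True if the slot value falls inside any registered GCInfo area.
    bool Contains(std::uintptr_t value) const {
        // Find the first range that starts above the value, then step back one.
        auto it = std::upper_bound(
            ranges_.begin(), ranges_.end(), value,
            [](std::uintptr_t v, const GCInfoRange& r) { return v < r.low; });
        if (it == ranges_.begin()) {
            return false;  // value lies below every registered range
        }
        --it;
        return value < it->high;
    }

private:
    std::vector<GCInfoRange> ranges_;  // sorted by low, non-overlapping
};
```

The cost is that the per-slot check goes from two comparisons to a binary search over the number of registered areas, so whether this is acceptable would depend on how many modules are loaded and how hot the scan is.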

Labels: area-NativeAOT-LLVM (LLVM generation for Native AOT compilation, including Web Assembly)
