Replies: 1 comment 1 reply
-
yeah, interesting problem that I was expecting we would run into as we tried to do complex analyses! My short term approach would've been wrapping control nodes with an Unstructured CFG might be useful but it represents a trade-off: you might reconstruct loops and conditional instead of having direct access to those structures. Some optimizations, like loop unrolling, become non-trivial with an unstructured representation. I'd be interested in hearing more about the dense array pattern. It represents significant work so I would hesitate to do it before we have multiple use cases. |
Beta Was this translation helpful? Give feedback.
-
At the risk of opening a can of worms, I was thinking about @calebmkim's progress on calculating dominators in Calyx control programs in #1039 and in particular about the representational problems we're facing. I mentioned in #1039 (reply in thread) that one long-term thing to think about is what changes to our IR would make this kind of analysis easier to implement. (This should not be a blocker for the current work, but it got me thinking.)
That thread pondered switching to an unstructured CFG, which may also be a good idea in the long term, but this post is about a different and orthogonal (and maybe easier) change we could make. Namely, something about the current IR design that is complicating the work is the pointer-based structure of
Control
.Control
structs refer to each other using aBox<Control>
, andControl
objects are allocated anywhere they please, which is of course the standard way of representing something tree-shaped.The problem with this (very natural! very normal!) representation for
Control
is that, in Rust, it is surprisingly hard for client code to refer to specificControl
objects. In the current situation, for instance, we want to build up a dominator map—but what should the types in theHashMap<_, _>
be? Certainly notControl
, which would mean taking ownership of a copy of the control statement, and wouldn't even admit comparisons for identity! The two solutions we have MacGyvered are adding an integer attribute to every relevant control statement and using raw pointers. Neither make it possible to get back to theControl
object from the reference if you need it (without scanning over the entire program to find a matchingControl
). We could consider switching to usingRc
everywhere, but that seems like an unfortunate surrender.The idea is to borrow a page from other compiler IRs that pack IR objects into dense arrays and—instead of using pointers directly to the objects—refer to them by integer offsets. Compilers often do this for efficiency (better locality, fewer allocations, smaller reference values), but it has the side benefit of enabling easier references in Rust. Your "reference" is really just an integer, and to "dereference" it you just need access to the entire chunk of values. We would not need to stamp the IR with unique IDs nor use
*const Control
pointers; the canonical way to refer to specific control statements would be through those indices.I even stumbled across a pretty nice implementation of this pattern in Cranelift, in the form of the
cranelift_entity
crate. This crate provides a facility for making newtypes to wrap integers (improving safety/readability over just using integer types as references directly) and offers workalikes forHashMap
and stuff that rely on this pattern.Anyway, not something we should do right now, but could be an interesting refactoring to consider!
Beta Was this translation helpful? Give feedback.
All reactions