[Support] Add SparseOpSCC utility#10305
Conversation
c67ccd7 to
b5ac17b
Compare
|
This looks great! I’ve encountered several situations where I had to write ad hoc SCC implementations, but they were always error-prone, so having a dedicated utility is excellent. I’m on board with not using scc_iterator. FIRRTL's CheckCombLoop originally used scc_iterator but switched to an ad hoc implementation because the former wasn't a good fit for constructing SCCs for use-def chains. One thing that would be nice to clarify in the documentation is the tolerance for operation mutation. I think it's reasonable to ban mutation while traversing the SCC and require users to store the SCC results elsewhere if they need to modify them. I also appreciate the terminology consistency with MLIR's topological sort utils. I believe the filter is similar to isOperationReady, as those operations can always be scheduled. |
|
Thans a lot, @uenoku. I'm glad to hear I'm not the only one who has been struggling with this. 😄
I've added it to the documentation. The
Thanks for pointing out the TopologicalSortUtils, I had not seen them yet. I've changed the filter so it now can be applied to individual edges, not just nodes/operations. However, I'm pedantically hesitant to call it |
Interesting, so lowLink etc is not accessed after SCC was constructed? |
| // The SparseOpSCC class internally stores the result of the SCC analysis | ||
| // and is only updated when visit(...) is called. It is not recommended | ||
| // to mutate the IR between visit calls. Calling visit invalidates all | ||
| // iterators. It is safe to mutate the IR while iterating. To reflect the |
There was a problem hiding this comment.
Maybe it's nice to clarify what kind of mutation is safe here. Since the SCC result stores raw Operation *s, mutating operands/users after analysis seems fine, but erasing operations that may still appear in later SCC entries would leave dangling pointers?
Yes. We only really need the lowLink and index values within an individual |
| } | ||
| } | ||
|
|
||
| OpSCCFilter shouldTraverseFn; |
There was a problem hiding this comment.
Should this be std::function instead of functional_ref? I feel unit tests are avoiding lifetime issue by creating a temporary variable explicitly.
There was a problem hiding this comment.
Indeed. Again, good catch. The filter was just a function argument and not owned by the class in an earlier version.
|
|
||
| namespace detail { | ||
| /// Backing storage for a cyclic SCC (implementation detail). | ||
| using CyclicOpSCCStorage = llvm::SmallVector<mlir::Operation *, 4>; |
There was a problem hiding this comment.
From an API perspective a set might make more sense than a vector here. The operations are unique by construction and have no specified order. But it would push the burden of maintaining a deterministic iteration order onto the user. So... SmallSetVector, or let the users construct the set themselves?
There was a problem hiding this comment.
For me a vector seems better since it has a deterministic order? I think the majority of the use case is to iterate on operations on SCC. I think we can also provide "does this operation belong to the same SCC" API with O(1) by looking at lowLink?
b248c3b to
edf42a8
Compare
|
Thanks for your feedback, @uenoku. I made some changes to the layout of the I did a "self"-review pass over the AI generated unit tests. They look plausible to me and should provide a reasonable amount of coverage. |
uenoku
left a comment
There was a problem hiding this comment.
Looks like I forgot to push review button, sorry for the delay! i
I wonder if we can keep OpSSC as internal data representation and change to expose some wrapper struct, but otherwise looks looks good to me.
| OpSCC>(it), | ||
| cyclicSccs(cyclicSccs) {} | ||
|
|
||
| const llvm::SmallVectorImpl<CyclicOpSCCStorage> &cyclicSccs; |
| /// Note: void * must be placed first in the union so that the all-zero | ||
| /// (default-constructed) state identifies unambiguously as invalid, not as a | ||
| /// null Operation*. | ||
| using OpSCC = llvm::PointerUnion<void *, mlir::Operation *, CyclicOpSCC>; |
There was a problem hiding this comment.
I'm slightly worried about this PointerUnion is exposed to an user. I wonder if it's clear to have a wrapper struct for user facing API.
This PR adds a helper class that collects strongly connected components (SCCs) on a subset of the MLIR operation graph, and allows iterating over them in (reverse) topological order.
I frequently find myself needing to collect a (filtered) graph of operations feeding into one or several given operations to clone or erase them. My ad-hoc implementations are often tedious, error-prone, may have quadratic scaling and/or fail to handle cycles in the graph. So I'm hoping to get this right with a reusable implementation.
The
SparseOpSCCclass contains a simple implementation of Tarjan's SCC algorithm that can iteratively construct a list of SCCs which are reaching or are reachable by a set of given operations. Its constructor takes an optional filter function as argument, that allows us to exclude certainoperations(Update:) edges from the graph. E.g., this can be used to ignore register operations, which in many practical cases will make the remaining graph acyclic. If, and only if, the graph is acyclic, every found SCC will simply be represented by anOperation *. Cyclic SCCs are stored separately as a vector ofOperation *wrapped by theCyclicOpSCChelper class.I initially tried building this around the existing
llvm::scc_iterator, but couldn't figure out a "nice" way to get the desired API without having to allocate a wrapper around eachOperation *. Let me know if you can think of a way of doing this better, or if I'm missing existing infra that already solves this problem.AI Disclosure: The code was written with AI assistance. The unit tests are entirely AI generated.
Assisted-by: Claude Code:Sonnet 4.6