[Support] Add SparseOpSCC utility#10305

Open
fzi-hielscher wants to merge 17 commits into llvm:main from fzi-hielscher:sparcescc-github
Conversation

@fzi-hielscher
Contributor

@fzi-hielscher fzi-hielscher commented Apr 23, 2026

This PR adds a helper class that collects strongly connected components (SCCs) on a subset of the MLIR operation graph, and allows iterating over them in (reverse) topological order.

I frequently find myself needing to collect a (filtered) graph of operations feeding into one or several given operations to clone or erase them. My ad-hoc implementations are often tedious, error-prone, may have quadratic scaling and/or fail to handle cycles in the graph. So I'm hoping to get this right with a reusable implementation.

The SparseOpSCC class contains a simple implementation of Tarjan's SCC algorithm that can iteratively construct a list of SCCs that reach, or are reachable from, a set of given operations. Its constructor takes an optional filter function as an argument, which allows us to exclude certain edges (update: previously whole operations) from the graph. E.g., this can be used to ignore register operations, which in many practical cases will make the remaining graph acyclic. If, and only if, the graph is acyclic, every found SCC will simply be represented by an Operation *. Cyclic SCCs are stored separately as a vector of Operation * wrapped by the CyclicOpSCC helper class.
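For readers unfamiliar with the approach, here is a minimal, self-contained sketch of Tarjan's algorithm with an edge filter. It is not the PR's implementation: nodes are plain integers rather than Operation *, the DFS is recursive for brevity, and the names TarjanSketch, EdgeFilter, and SCCResult are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins: nodes are ints, not mlir::Operation *.
using Node = int;
using Graph = std::vector<std::vector<Node>>;        // adjacency list
using EdgeFilter = std::function<bool(Node, Node)>;  // true = traverse edge

struct SCCResult {
  std::vector<std::vector<Node>> sccs; // appended in reverse topological order
  std::vector<bool> cyclic;            // cyclic[i]: sccs[i] contains a cycle
};

class TarjanSketch {
public:
  TarjanSketch(Graph g, EdgeFilter f)
      : graph(std::move(g)), shouldTraverse(std::move(f)),
        index(graph.size(), -1), lowLink(graph.size(), 0),
        onStack(graph.size(), false) {}

  // Collect all SCCs reachable from `root`. May be called repeatedly;
  // nodes discovered by an earlier call are skipped.
  void visit(Node root) {
    assert(root >= 0 && static_cast<std::size_t>(root) < graph.size());
    if (index[root] == -1)
      dfs(root);
  }

  SCCResult result;

private:
  void dfs(Node u) {
    index[u] = lowLink[u] = nextIndex++;
    stack.push_back(u);
    onStack[u] = true;
    bool selfLoop = false;
    for (Node v : graph[u]) {
      if (!shouldTraverse(u, v))
        continue; // edge is excluded from the graph
      if (v == u) {
        selfLoop = true; // detected during the DFS, one filter call per edge
        continue;
      }
      if (index[v] == -1) {
        dfs(v);
        lowLink[u] = std::min(lowLink[u], lowLink[v]);
      } else if (onStack[v]) {
        lowLink[u] = std::min(lowLink[u], index[v]);
      }
    }
    if (lowLink[u] != index[u])
      return; // u is not the root of an SCC
    // Pop the finished SCC off the stack.
    std::vector<Node> scc;
    Node w;
    do {
      w = stack.back();
      stack.pop_back();
      onStack[w] = false;
      scc.push_back(w);
    } while (w != u);
    result.cyclic.push_back(scc.size() > 1 || selfLoop);
    result.sccs.push_back(std::move(scc));
  }

  Graph graph;
  EdgeFilter shouldTraverse;
  std::vector<int> index;   // -1 = not yet discovered
  std::vector<int> lowLink;
  std::vector<bool> onStack;
  std::vector<Node> stack;
  int nextIndex = 0;
};
```

In this sketch, a filter that rejects the back edge of a loop (playing the role of a register) turns a cyclic SCC into a chain of single-node SCCs, which mirrors the register example above.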

I initially tried building this around the existing llvm::scc_iterator, but couldn't figure out a "nice" way to get the desired API without having to allocate a wrapper around each Operation *. Let me know if you can think of a way of doing this better, or if I'm missing existing infra that already solves this problem.

AI Disclosure: The code was written with AI assistance. The unit tests are entirely AI generated.

Assisted-by: Claude Code:Sonnet 4.6

@uenoku
Member

uenoku commented Apr 23, 2026

This looks great! I’ve encountered several situations where I had to write ad hoc SCC implementations, but they were always error-prone, so having a dedicated utility is excellent.

I’m on board with not using scc_iterator. FIRRTL's CheckCombLoop originally used scc_iterator but switched to an ad hoc implementation because the former wasn't a good fit for constructing SCCs for use-def chains.

One thing that would be nice to clarify in the documentation is the tolerance for operation mutation. I think it's reasonable to ban mutation while traversing the SCC and require users to store the SCC results elsewhere if they need to modify them.

I also appreciate the terminology consistency with MLIR's topological sort utils. I believe the filter is similar to isOperationReady, as those operations can always be scheduled.

@fzi-hielscher
Contributor Author

Thanks a lot, @uenoku. I'm glad to hear I'm not the only one who has been struggling with this. 😄

One thing that would be nice to clarify in the documentation is the tolerance for operation mutation. I think it's reasonable to ban mutation while traversing the SCC and require users to store the SCC results elsewhere if they need to modify them.

I've added it to the documentation. The SparseOpSCC class actually already stores all the results, so mutating the IR while iterating over the SCCs is safe. For the sake of simplicity I've stuffed everything into class members, which of course makes its instances pretty heavyweight. Technically, we could discard a lot of the internal state after all calls to visit have been done.

I also appreciate the terminology consistency with MLIR's topological sort utils. I believe the filter is similar to isOperationReady, as those operations can always be scheduled.

Thanks for pointing out the TopologicalSortUtils, I had not seen them yet. I've changed the filter so it can now be applied to individual edges, not just nodes/operations. However, I'm pedantically hesitant to call it isOperandReady. The result of that function currently works "the other way around" and the name wouldn't be fitting for reverse traversal. But I'm admittedly struggling to design that callback in a way that is consistent in both directions and not too confusing.

@uenoku
Member

uenoku commented Apr 25, 2026

I've added it to the documentation. The SparseOpSCC class actually already stores all the results, so mutating the IR while iterating over the SCCs is safe. For the sake of simplicity I've stuffed everything into class members, which of course makes its instances pretty heavyweight. Technically, we could discard a lot of the internal state after all calls to visit have been done.

Interesting, so lowLink etc is not accessed after SCC was constructed?

Comment thread include/circt/Support/SparseOpSCC.h
Comment thread include/circt/Support/SparseOpSCC.h Outdated
// The SparseOpSCC class internally stores the result of the SCC analysis
// and is only updated when visit(...) is called. It is not recommended
// to mutate the IR between visit calls. Calling visit invalidates all
// iterators. It is safe to mutate the IR while iterating. To reflect the
Member

Maybe it's nice to clarify what kind of mutation is safe here. Since the SCC result stores raw Operation *s, mutating operands/users after analysis seems fine, but erasing operations that may still appear in later SCC entries would leave dangling pointers?

Comment thread include/circt/Support/SparseOpSCC.h Outdated
@fzi-hielscher
Contributor Author

fzi-hielscher commented Apr 25, 2026

I've added it to the documentation. The SparseOpSCC class actually already stores all the results, so mutating the IR while iterating over the SCCs is safe. For the sake of simplicity I've stuffed everything into class members, which of course makes its instances pretty heavyweight. Technically, we could discard a lot of the internal state after all calls to visit have been done.

Interesting, so lowLink etc is not accessed after SCC was constructed?

Yes. We only really need the lowLink and index values within an individual visit call. The established reverse topological order remains stable after it has been added to the sccs vector. And in-between visits it would be sufficient to retain the set of visited nodes.

}
}

OpSCCFilter shouldTraverseFn;
Member

@uenoku uenoku Apr 26, 2026

Should this be std::function instead of function_ref? I feel the unit tests are avoiding the lifetime issue by explicitly creating a temporary variable.

Contributor Author

Indeed. Again, good catch. The filter was just a function argument and not owned by the class in an earlier version.
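To illustrate the lifetime concern: llvm::function_ref is a non-owning reference to a callable, so a member of that type dangles once the lambda it was constructed from is destroyed. The sketch below uses only the standard library to show the owning alternative; the class name OwningFilterHolder, the helper makeHolder, and the int-based edge type are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical class; name and int-based edge type are illustrative only.
class OwningFilterHolder {
public:
  using Filter = std::function<bool(int from, int to)>;

  // std::function copies (or moves) the callable into the member, so the
  // holder remains valid after the constructor argument has been destroyed.
  explicit OwningFilterHolder(Filter f) : shouldTraverseFn(std::move(f)) {
    assert(shouldTraverseFn && "filter must be callable");
  }

  bool shouldTraverse(int from, int to) const {
    return shouldTraverseFn(from, to);
  }

private:
  Filter shouldTraverseFn;
};

// The lambda below is a temporary that dies at the end of the return
// expression. A non-owning reference member would dangle afterwards; the
// std::function member keeps its own copy of the closure.
inline OwningFilterHolder makeHolder(int blockedTarget) {
  return OwningFilterHolder(
      [blockedTarget](int, int to) { return to != blockedTarget; });
}
```

The trade-off is that std::function may allocate and adds an indirect call, which is usually acceptable for a filter that is stored once per analysis instance.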


namespace detail {
/// Backing storage for a cyclic SCC (implementation detail).
using CyclicOpSCCStorage = llvm::SmallVector<mlir::Operation *, 4>;
Contributor Author

@fzi-hielscher fzi-hielscher Apr 27, 2026

From an API perspective a set might make more sense than a vector here. The operations are unique by construction and have no specified order. But it would push the burden of maintaining a deterministic iteration order onto the user. So... SmallSetVector, or let the users construct the set themselves?

Member

For me a vector seems better since it has a deterministic order? I think the majority of use cases just iterate over the operations in an SCC. I think we can also provide an O(1) "does this operation belong to the same SCC" API by looking at lowLink?

@fzi-hielscher
Contributor Author

Thanks for your feedback, @uenoku. I made some changes to the layout of the SparseOpSCC class. The temporary state of the Tarjan algorithm is now stored only in local variables. Instead, there is a member that maps all discovered operations to their SCC to enable a direct O(1) look-up of the SCC without exposing implementation details. I also removed the hasSelfLoop helper and instead detect self-loops as part of the DFS. This has the benefit of invoking the shouldTraverseFn only once per discovered edge.
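The look-up described above could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: SCCLookupSketch, addSCC, lookup, and inSameSCC are hypothetical names, nodes are ints instead of Operation *, and std::unordered_map (average O(1)) stands in for whatever map type the PR uses.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

using Node = int; // stand-in for mlir::Operation *

class SCCLookupSketch {
public:
  // Record a finished SCC and remember its index for every member.
  void addSCC(std::vector<Node> members) {
    std::size_t id = sccs.size();
    for (Node n : members) {
      bool inserted = sccOf.emplace(n, id).second;
      assert(inserted && "a node belongs to exactly one SCC");
      (void)inserted; // silence unused warning when NDEBUG is defined
    }
    sccs.push_back(std::move(members));
  }

  // Index of the SCC containing `n`, or nullopt if `n` was never discovered.
  std::optional<std::size_t> lookup(Node n) const {
    auto it = sccOf.find(n);
    if (it == sccOf.end())
      return std::nullopt;
    return it->second;
  }

  // Membership test without exposing lowLink or other Tarjan state.
  bool inSameSCC(Node a, Node b) const {
    auto ia = lookup(a), ib = lookup(b);
    return ia && ib && *ia == *ib;
  }

private:
  std::vector<std::vector<Node>> sccs;
  std::unordered_map<Node, std::size_t> sccOf;
};
```

Keeping the map separate from the per-visit Tarjan state means the index and lowLink arrays can live in local variables, as described in the comment above.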

I did a "self"-review pass over the AI generated unit tests. They look plausible to me and should provide a reasonable amount of coverage.

@fzi-hielscher fzi-hielscher marked this pull request as ready for review April 30, 2026 11:40
Member

@uenoku uenoku left a comment

Looks like I forgot to push the review button, sorry for the delay!

I wonder if we can keep OpSCC as the internal data representation and expose a wrapper struct instead, but otherwise this looks good to me.

OpSCC>(it),
cyclicSccs(cyclicSccs) {}

const llvm::SmallVectorImpl<CyclicOpSCCStorage> &cyclicSccs;
Member

Does ArrayRef work?

/// Note: void * must be placed first in the union so that the all-zero
/// (default-constructed) state identifies unambiguously as invalid, not as a
/// null Operation*.
using OpSCC = llvm::PointerUnion<void *, mlir::Operation *, CyclicOpSCC>;
Member

I'm slightly worried that this PointerUnion is exposed to users. I wonder if it would be clearer to have a wrapper struct for the user-facing API.
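One possible shape for such a wrapper is sketched below. It is only an illustration under stated assumptions: std::variant stands in for llvm::PointerUnion so the example compiles with the standard library alone, and Op, CyclicStorage, and OpSCCHandle are hypothetical names rather than types from the PR.

```cpp
#include <cassert>
#include <variant>
#include <vector>

struct Op { int id; };                   // stand-in for mlir::Operation
using CyclicStorage = std::vector<Op *>; // stand-in for CyclicOpSCCStorage

// Hypothetical user-facing handle that hides the tagged-union representation.
class OpSCCHandle {
public:
  OpSCCHandle() = default; // invalid handle (first alternative)
  explicit OpSCCHandle(Op *op) : impl(op) { assert(op); }
  explicit OpSCCHandle(const CyclicStorage *c) : impl(c) {
    assert(c && !c->empty());
  }

  bool isValid() const {
    return !std::holds_alternative<std::monostate>(impl);
  }
  bool isCyclic() const {
    return std::holds_alternative<const CyclicStorage *>(impl);
  }

  // Uniform access: an acyclic SCC is returned as a one-element list.
  std::vector<Op *> getOps() const {
    if (auto *op = std::get_if<Op *>(&impl))
      return {*op};
    if (auto *c = std::get_if<const CyclicStorage *>(&impl))
      return **c;
    return {};
  }

private:
  // std::monostate comes first so a default-constructed handle is invalid,
  // mirroring the "void * must be placed first" note on the PointerUnion.
  std::variant<std::monostate, Op *, const CyclicStorage *> impl;
};
```

With a handle like this, callers ask isCyclic() or getOps() and never see which pointer type is stored, so the internal representation can change without touching user code.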
