Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphs: fix isCyclic Digraph #3628

Merged
merged 2 commits into from
Mar 26, 2025

Conversation

taboege
Copy link
Contributor

@taboege taboege commented Jan 17, 2025

This fixes the detection of cycles in directed graphs (see issue #3626) by not relying on the numbers from DFS but by attempting a topological sort. It succeeds if and only if the graph is acyclic. This reuses existing code as much as possible.

Also added tests that document previous bugs which are now fixed.

This fixes the detection of cycles in directed graphs (see issue Macaulay2#3626) by
not relying on the numbers from DFS but by attempting a topological sort.
It succeeds if and only if the graph is acyclic. This reuses existing code
as much as possible.

Also added tests that document previous bugs which are now fixed.
@d-torrance
Copy link
Member

This change broke a test in the StatGraphs package:

/Users/runner/work/M2/M2/M2/Macaulay2/packages/StatGraphs.m2:1517:5-1523:3 error:
 --                 Graph => Graph{1 => {2, 4, 5}}
 --                                2 => {1, 3}
 --                                3 => {2, 4}
 --                                4 => {1, 3}
 --                                5 => {1}
 -- 
 -- o4 : MixedGraph
 -- 
 -- i5 : 	   assert(not isCyclic G)    
 -- stdio:6:17:(3): error: assertion failed
 -- 
stdio:1:5:(3): error: test(s) #13 of package StatGraphs failed.

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Yes, I have been looking at that but I'm confused on multiple levels.

(1) When I step through the code below with the failing test no. 13:

for i from 0 to n - 1 do
(
for j from 0 to n - 1 do
(if not submatrix(adjacencyMatrix D, toList(set(allComp_i)*set(vertices D)), toList(set(allComp_j)*set(vertices D)))==0 then adjMG_(i,j)=1 else adjMG_(i,j)=0
));

A 7x7 adjacencyMatrix D is indexed using node labels of the graph as indices, but the node labels are {1, 2, 3, 4, 5, 7, 8} which triggers an array out of bounds error that I don't see when just calling check(StatGraphs).

(2) When I fix the matrix indexing issue and continue stepping through the code, it runs through and produces a result which contradicts the assertion in the test. The isCyclic method in StatGraphs in this case constructs the following matrix:

1 1 1 1
1 1 1 1
0 0 0 0
0 0 0 0

As an adjacency matrix of a directed graph, it exhibits many cycles. Even before my commit, isCyclic Digraph would detect that this is cyclic. So I don't know what causes the result to differ.

(3) For all I know, the error could be in my commit, the StatGraphs code or the StatGraphs assertion. I would have to locate the definition of cycle for mixed graphs to narrow it down.

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Most of my confusion cleared right up after commenting. I overlooked this crucial line which causes the indexing problems not to appear:

G:=indexLabelGraph g;

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Now I understand how the test result differs. On test no. 13, the isCyclic in StatGraphs constructs the following adjacency matrix of a directed graph:

1 1 0 0
0 0 0 0
0 0 0 1
0 0 0 0

Up until my commit, this graph was reported as being acyclic. My change made it cyclic. I would argue that the self-loop at the first node is a cycle (meaning that either the StatGraphs code or the StatGraphs test assertion is wrong). But I have to do some reading to see what the definition of cycles in this class of graphs is.

@taboege
Copy link
Contributor Author

taboege commented Jan 27, 2025

I have heard from one of the developers of StatGraphs that their main source is the paper of Lauritzen and Sadeghi cited prominently in the documentation. With this definition, I believe that the assertion of the failing test (which is also the first example in the documentation of isCyclic) is wrong. The mixed graph has undirected edges 1 -- 2 -- 3 and a directed edge 3 -> 1 back. According to page 4 of Lauritzen–Sadeghi, this is a cycle.

On a tangent, I am also not sure if the algorithm that decides isCyclic MixedGraph is correct. At least it does not match its documentation:

A directed cycle is a cycle in the Digraph constructed from a mixed graph G by identifying all connected components on bidirected and undirected edges. Such a connected component contains either bidirected edges only or undirected edges only.

What the implementation does is use the connected components of the bidirected and the undirected parts. But these components might still have edges of other types between them. Or maybe I'm misunderstanding the sentence?

It would be most helpful if one of the developers of the package could comment on the definition of cyclic being used and the intent of the code @lukeamendola @luisgarciapuente @olgakuznetsova @harshitmotwani2015

@olgakuznetsova
Copy link
Contributor

olgakuznetsova commented Jan 28, 2025 via email

@taboege
Copy link
Contributor Author

taboege commented Jan 28, 2025

As for the other question, we ignore all edges that are not directed and only check for cycles created by the directed edges.

I don't understand. Is the purpose of isCyclic to check if the directed part of the mixed graph has a directed cycle, ignoring the bidirected and undirected parts completely? This is not the definition of cycle according to the mixed graphs paper. (But it would be compatible with the definition of cycles for less complex mixed graphs such as those modeling latent variables).

However, this is not what the code does, not even close to what it tries to do. The code identifies vertices that have (separately) undirected or bidirected edges between them and then considers directed edges between those classes. This part is in line with the documentation:

In the next example, there are no cycles inside the digraph of the mixed graph, but there is a directed cycle after you identify the vertices {1,2} and {3,4}.

Lastly, I want to note that the current implementation of isCyclic in StatGraphs uses an algorithm which does not always produce the correct result in the sense of Lauritzen and Sadeghi. I haven't found this algorithm anywhere in the literature and why it is correct (or which definition of "cycle" it relies on). Here is an example:

needsPackage "StatGraphs";
U = graph({2,3},{{2,3}});
D = digraph({1,2},{{1,2}});
B = bigraph({1,3},{{1,3}});
G = mixedGraph(U,D,B);
print isCyclic G --> false

The algorithm computes an auxiliary directed graph on the connected components of U and B, so the edge set of that auxiliary graph is {{1,3}, {2,3}} and those two nodes have a single directed edge between them since 1 -> 2 is a directed edge in the original graph. This digraph is not cyclic. However, the original mixed graph is cyclic according to Lauritzen and Sadeghi.

I'm still puzzled by what the intent behind isCyclic should be. If it should only check cyclicity of the digraph part, then both its implementation and its documentation are way off (and that is why test no. 13 is failing because of my change to the Graphs package). If it should check Lauritzen and Sadeghi's definition, then its implementation is wrong and the test assertion is wrong. Both can be fixed easily, I just don't know which behavior is intended.

@olgakuznetsova
Copy link
Contributor

olgakuznetsova commented Jan 28, 2025 via email

@tom111
Copy link
Contributor

tom111 commented Mar 26, 2025

Hello everyone. I've reviewed Tobias' change to the Graphs package and I think they are correct and it is crucial to fix issue #3626. In the issue there is an example of a clearly cyclic graph that is detected as a DAG which it is not.

The issue with the failing test in the StatGraphs package seems to boils down to deciding, if we want a directed graph with one vertex and a single directed edge from that vertex to itself be cyclic or not. In my opinion that graph is cyclic and the Graphs package should return that.

The most fundamental mathematical structure here is the DAG. A DAG cannot have directed self-loop. I think this is accepted in the terminology and widely used.

If we make this change, then the StatGraphs test now fails because it asserts that a graph with such a loop is acyclic which is not true. Whatever the StatGraphs package is doing, it would need to be fixed to take that into account. The algorithm in

isCyclic MixedGraph := Boolean => g -> (
does something non-obvious that appears only in a few papers while the notion of DAG appears very widely. We should have the correct notion of DAG in M2.

I vote for merging this and disabling the failing test for now.

Pending decision about the behavior of isCyclic(MixedGraph)
@d-torrance
Copy link
Member

Sounds good to me. I just pushed a commit skipping the affected test for now.

@d-torrance d-torrance merged commit b9c5fff into Macaulay2:development Mar 26, 2025
5 checks passed
@d-torrance
Copy link
Member

I've opened #3712 to keep track of the remaining issue with isCyclic MixedGraph.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants