Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing a (likely) bug in MapFusion. #1673

Draft
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

pratyai
Copy link
Collaborator

@pratyai pratyai commented Oct 4, 2024

When checking for the "array usage" criteria that can prevent map-fusion, check only within the current state. Otherwise, any "use" of the array globally (i.e., in the entire SDFG) will cancel the fusion.

@tbennun
Copy link
Collaborator

tbennun commented Oct 5, 2024

This should not be the correct behavior. Eliminating an access node that corresponds to a data container has to be done safely: if a data container is used in another state (via an access node with the same name), its memory must be preserved because it can be reused in a subsequent read (which is a case that happens often when conserving memory). If you want map fusion to work, you would need to use a unique array name that can be removed, or rename arrays that you know they will be overwritten later.

Checking the SDFG for other instances of the access node is the correct way to check for a safe removal of an access node:

>>> sdfg = dace.SDFG('test')
>>> state = sdfg.add_state()
>>> state2 = sdfg.add_state_after(state)
>>> state.add_access('A')
>>> state2.add_access('A')
# Checking access nodes:
>>> state.data_nodes()
[AccessNode (A)]
>>> sdfg.data_nodes()
[AccessNode (A), AccessNode (A)]

@pratyai
Copy link
Collaborator Author

pratyai commented Oct 5, 2024

I see that my test case (of applying fusion twice) has this problem: the corresponding transient access nodes in two copies have the same name. I was assuming that the transient access nodes are scoped only within the state, and states cannot rely on other states' transient access nodes.
Just to confirm, must the transient access nodes have globally unique names?

Even then, I think that MapFusion's current way sometimes unnecessarily chooses to avoid the fusion. Could we just not remove such access nodes and just proceed with the fusion where both versions would produce identical output? In other words, what if we allow the fusion without any access-node-removal, and then remove any redundant transient nodes?

[ By the way, I've been looking at this since only very recently, so chances are that I have some misunderstanding about access nodes. Apologies in advance, if that's indeed the case here. ]

@pratyai
Copy link
Collaborator Author

pratyai commented Oct 7, 2024

I've come up with the following example to demonstrate. I want the MapFusion to work on the following graph:

def f4():  
    g = dace.SDFG('f4')  
    g.add_array('A', [10], dace.float32)  
    g.add_array('B', [10], dace.float32)  
  
    st0 = g.add_state('st0')  
    # First map  
    wA_e, wA_x = st0.add_map('st0_wA', {'i': f"0:10"})  
    op0 = st0.add_tasklet('set_1', {}, {'Ai_out'}, 'Ai_out = 1')  
    st0.add_memlet_path(wA_e, op0, memlet=dace.Memlet())  
    wA = st0.add_write('A')  
    st0.add_memlet_path(op0, wA_x, wA, src_conn='Ai_out', memlet=dace.Memlet(data='A', subset='i'))  
    # Second map  
    wB_e, wB_x = st0.add_map('st0_wB', {'i': f"0:10"})  
    op1 = st0.add_tasklet('set_B', {'Ai_in'}, {'Bi_out'}, 'Bi_out = Ai_in')  
    st0.add_memlet_path(wA, wB_e, op1, dst_conn='Ai_in', memlet=dace.Memlet(data='A', subset='i'))  
    wB = st0.add_write('B')  
    st0.add_memlet_path(op1, wB_x, wB, src_conn='Bi_out', memlet=dace.Memlet(data='B', subset='i'))  
  
    st1 = g.add_state_after(st0, 'st1')  
    # First map  
    wA_e, wA_x = st1.add_map('st1_wA', {'i': f"0:10"})  
    op2 = st1.add_tasklet('set_2', {}, {'Ai_out'}, 'Ai_out = 2')  
    st1.add_memlet_path(wA_e, op2, memlet=dace.Memlet())  
    wA = st1.add_write('A')  
    st1.add_memlet_path(op2, wA_x, wA, src_conn='Ai_out', memlet=dace.Memlet(data='A', subset='i'))  
    g.validate()  
    return g

Which produces this SDFG: fusion_2state_before

This SDFG currently does not allow fusion inside state#1, since state#2 also contains an access to A. However, I would like to transform it to: fusion_2state_after_fix

I claim that:

  1. Fusion should not remove the access node A from state#1.
  2. State#2 can only rely on the data in A after state#1 has completed in entirety. So, the fusion should still produce the same data in A and B at the end of state#1, but should be allowed to transform computation within it.

@tbennun
Copy link
Collaborator

tbennun commented Oct 10, 2024

@philip-paul-mueller related to what you are working on?

@philip-paul-mueller
Copy link
Collaborator

Yes. This problem should be solved by the new map fusion.

@philip-paul-mueller
Copy link
Collaborator

The error you have looks familiar, I saw it when I rewrote MapFusion.
Back then I had the feeling that subgraph fusion, the component that fails, highly depends on how the graph produced by MapFusion looks.
My solution was to tweak MapFusion in such a way to avoid some fusion operation during auto optimization.
This is not a nice solution, but it worked.

@pratyai
I would also advice you to look at my PR that reimplements MapFusion.
Since I have currently other, more urgent, stuff to do you are more than welcomed to work on it.
If you are interested contact me in Mattermost.

@pratyai
Copy link
Collaborator Author

pratyai commented Oct 16, 2024

@philip-paul-mueller while I generally like your PR at a high level (including the choice of two versions of MapFusion), I don't think right now I have sufficient expertise of DaCe to convincingly push this change myself -- especially since some architecture-level concerns (e.g. two map fusions) are not entirely resolved, as far as I can tell. Perhaps in a few weeks I would be able to do so (hopefully).

@pratyai pratyai force-pushed the fix-map-fusion-two-state branch 3 times, most recently from c98af78 to 19525ef Compare October 16, 2024 13:54
like:
```c++
cpp.reshape_strides(Range([(0, 4, 1), (0, 5, 1)]), None, None, [2, 3, 5])
```
and crashes with an index error.
outside the loop. It seems to be a typo on `iedge`.
need to be tied to the class itself. Remove their unnecessary arguments.
map-fusion, check only within the current state. Otherwise, any "use" of
the array _globally_ (i.e., in the entire SDFG) will cancel the fusion.
@pratyai pratyai added the no-ci Do not run any CI or actions for this PR label Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-ci Do not run any CI or actions for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants