Support Customized DFG Sorting Strategy #146
Conversation
```cpp
// Two sorting strategies: pure topological order, or mixed ALAP + topo.
std::vector<std::pair<Operation *, int>> sorted_ops_with_levels;
if (sort_strategy_string_ref == "topological") {
```
Can you please remind me what's wrong with "mixed"? Do you plan to fix "mixed" once the time limit is no longer an issue?
For this case:

```mlir
module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura", dataflow_mode = "steering"} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
    %7 = neura.invariant %4, %2 : i64, i1 -> i64
    %8 = neura.invariant %3, %2 : i64, i1 -> i64
    %9 = neura.carry %5, %2, %0 : i64, i1, i64 -> i64
    %10 = neura.carry %6, %2, %1 : i64, i1, i64 -> i64
    %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1
    neura.ctrl_mov %11 -> %2 : i1 i1
    %12 = neura.false_steer %9, %11 : i64, i1 -> i64
    %13 = "neura.add"(%9, %9) : (i64, i64) -> i64
    neura.ctrl_mov %13 -> %0 : i64 i64
    %14 = "neura.add"(%10, %7) : (i64, i64) -> i64
    neura.ctrl_mov %14 -> %1 : i64 i64
    "neura.return"(%12) : (i64) -> ()
  }
}
```
Its topological order is:
```
[MapToAcceleratorPass] Topologically sorted op: %0 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %1 = neura.reserve : i64
[MapToAcceleratorPass] Topologically sorted op: %2 = neura.reserve : i1
[MapToAcceleratorPass] Topologically sorted op: %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
[MapToAcceleratorPass] Topologically sorted op: %9 = "neura.data_mov"(%3) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %7 = "neura.data_mov"(%4) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %11 = "neura.data_mov"(%5) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %10 = neura.invariant %9, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %8 = neura.invariant %7, %2 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
[MapToAcceleratorPass] Topologically sorted op: %16 = "neura.data_mov"(%10) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %25 = "neura.data_mov"(%8) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %21 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %22 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %18 = "neura.data_mov"(%12) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %24 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %23 = "neura.add"(%21, %22) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %26 = "neura.add"(%24, %25) : (i64, i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %23 -> %0 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %26 -> %1 : i64 i64
[MapToAcceleratorPass] Topologically sorted op: neura.ctrl_mov %17 -> %2 : i1 i1
[MapToAcceleratorPass] Topologically sorted op: %19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] Topologically sorted op: %20 = neura.false_steer %18, %19 : i64, i1 -> i64
[MapToAcceleratorPass] Topologically sorted op: %27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] Topologically sorted op: "neura.return"(%27) : (i64) -> ()
```
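For reference, the pure topological strategy amounts to Kahn's algorithm over the forward edges; the `ctrl_mov`-to-`reserve` back-edges have to be dropped first, otherwise the recurrence cycle leaves no ready op. A minimal sketch (not the pass's actual implementation, op indices are illustrative):

```python
from collections import defaultdict, deque

def topological_order(num_ops, forward_edges):
    """Kahn's algorithm: repeatedly emit ops whose producers are all emitted.
    Back-edges (e.g. ctrl_mov -> reserve) must be excluded from the edges."""
    indegree = [0] * num_ops
    users = defaultdict(list)
    for producer, consumer in forward_edges:
        users[producer].append(consumer)
        indegree[consumer] += 1
    ready = deque(op for op in range(num_ops) if indegree[op] == 0)
    order = []
    while ready:
        op = ready.popleft()
        order.append(op)
        for consumer in users[op]:
            indegree[consumer] -= 1
            if indegree[consumer] == 0:
                ready.append(consumer)
    return order

# Tiny DFG: 0 and 1 are constants, 2 consumes both, 3 consumes 2.
print(topological_order(4, [(0, 2), (1, 2), (2, 3)]))  # [0, 1, 2, 3]
```

By construction every forward producer appears before its consumers, which is exactly the property the mixed order loses below.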
Its mixed sorted order is:
```
[MapToAcceleratorPass] ALAP Bucket Level 0: 6 ops
%1 = neura.reserve : i64
%2 = neura.reserve : i1
%3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
%6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
%9 = "neura.data_mov"(%3) : (i64) -> i64
%13 = "neura.data_mov"(%6) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 1: 8 ops
%0 = neura.reserve : i64
%5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
%11 = "neura.data_mov"(%5) : (i64) -> i64
%10 = neura.invariant %9, %2 : i64, i1 -> i64
%14 = neura.carry %13, %2, %1 : i64, i1, i64 -> i64
%16 = "neura.data_mov"(%10) : (i64) -> i64
%24 = "neura.data_mov"(%14) : (i64) -> i64
%15 = "neura.data_mov"(%14) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 2: 9 ops
%4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
%7 = "neura.data_mov"(%4) : (i64) -> i64
%12 = neura.carry %11, %2, %0 : i64, i1, i64 -> i64
%21 = "neura.data_mov"(%12) : (i64) -> i64
%22 = "neura.data_mov"(%12) : (i64) -> i64
%18 = "neura.data_mov"(%12) : (i64) -> i64
%17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1
neura.ctrl_mov %17 -> %2 : i1 i1
%19 = "neura.data_mov"(%17) : (i1) -> i1
[MapToAcceleratorPass] ALAP Bucket Level 3: 6 ops
%8 = neura.invariant %7, %2 : i64, i1 -> i64
%25 = "neura.data_mov"(%8) : (i64) -> i64
%23 = "neura.add"(%21, %22) : (i64, i64) -> i64
neura.ctrl_mov %23 -> %0 : i64 i64
%20 = neura.false_steer %18, %19 : i64, i1 -> i64
%27 = "neura.data_mov"(%20) : (i64) -> i64
[MapToAcceleratorPass] ALAP Bucket Level 4: 3 ops
%26 = "neura.add"(%24, %25) : (i64, i64) -> i64
neura.ctrl_mov %26 -> %1 : i64 i64
"neura.return"(%27) : (i64) -> ()
```
Here `%8 = neura.invariant %7, %2 : i64, i1 -> i64` is a backward user of `%17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> : (i64, i64) -> i1`, but its ALAP level is higher than %17's. This makes it unable to satisfy the producer-consumer dependency check in
https://github.com/coredac/dataflow/blob/81d782e464fcb2a5db7caccede06c9da8a085bbf/lib/NeuraDialect/Mapping/mapping_util.cpp#L845
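The level inversion can be reproduced on a toy graph: ALAP buckets each op as late as its forward users allow, so an op with a short forward chain to a sink (like the invariant) lands in a later bucket than a recurrence producer (like the icmp) whose forward chain is long. A minimal sketch, assuming the check simply requires a backward user's level not to exceed its producer's (the actual check in mapping_util.cpp may differ; op indices are illustrative):

```python
from collections import defaultdict
from functools import lru_cache

def alap_levels(num_ops, forward_edges):
    """ALAP bucketing: level 0 is the earliest bucket; each op is placed
    as late as its earliest forward user allows (back-edges excluded)."""
    users = defaultdict(list)
    for producer, consumer in forward_edges:
        users[producer].append(consumer)

    @lru_cache(maxsize=None)
    def depth(op):  # longest forward path from op to a sink
        return 1 + max((depth(u) for u in users[op]), default=0)

    max_depth = max(depth(op) for op in range(num_ops))
    return {op: max_depth - depth(op) for op in range(num_ops)}

def backedge_violations(levels, back_edges):
    # A backward user bucketed LATER than its recurrence producer cannot
    # satisfy a producer-before-consumer check over the sorted buckets.
    return [(p, c) for p, c in back_edges if levels[c] > levels[p]]

# Toy version of the case above:
#   icmp(0) -> steer(1) -> mov(2) -> return(3)   (long forward chain)
#   invariant(4) -> add(5) -> ctrl_mov(6)        (short forward chain)
#   back-edge: icmp(0) -> invariant(4) via the loop-carried predicate
levels = alap_levels(7, [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6)])
print(levels[0], levels[4])                   # icmp: level 0, invariant: level 1
print(backedge_violations(levels, [(0, 4)]))  # [(0, 4)] -- check fails
```

One possible direction for a proper "mixed" design would be to clamp each backward user's level to its recurrence producer's level before bucketing, but that is a separate design discussion.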
Maybe it is not correctly recognized as a critical op? https://github.com/coredac/dataflow/blob/81d782e464fcb2a5db7caccede06c9da8a085bbf/lib/NeuraDialect/Mapping/mapping_util.cpp#L306-L311
File an issue and resolve it later?
Due to the time limit, I have only enabled the customized sorting strategy for steering-based dataflow IR mapping.
We can properly design the mixed sorting later; for now, I just use the topologically sorted ops for evaluation.