
add interpreter dataflow mode #105

Merged
tancheng merged 6 commits into main from interpreter-dataflow-mode
Aug 21, 2025
Conversation

Collaborator

@itemkelvin itemkelvin commented Aug 6, 2025

The dataflow execution process (from the provided code context) is as follows:

  1. Set up a dependency graph (value_users) tracking which operations depend on each value.
  2. Initialize a worklist. Operations with no dependencies (e.g., constants) are added to the worklist.
  3. Execute operations from the worklist:
  • Fetch an operation and validate that all its input operands in value_map are valid (via the predicate).
  • Execute the operation (e.g., neura.or, neura.sel) using operand values from value_map.
  • Store the result (with a combined validity predicate) back into value_map.
  • Add dependent operations (from value_users) to the worklist for subsequent processing.
  4. The loop ends when the worklist is empty.
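The worklist loop described above can be sketched as a small self-contained program. This is a simplified, MLIR-free sketch: the PredicatedData and Op structs and the run() helper are illustrative stand-ins for the interpreter's actual types, not its real API.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <vector>

// Stand-in for the interpreter's predicated values, mirroring !neura.data<T, i1>:
// each value carries data plus a validity predicate.
struct PredicatedData {
  double value = 0.0;
  bool predicate = false;
};

// Stand-in for an operation: reads some value ids, writes one value id.
struct Op {
  std::vector<int> inputs;  // Value ids this op reads.
  int output;               // Value id this op writes.
  std::function<double(const std::vector<double>&)> compute;
};

// Runs the worklist loop: ops whose operands are all valid execute and
// propagate to their users; ops with missing operands are skipped and
// retried once their producers fire.
std::map<int, PredicatedData> run(std::vector<Op>& ops) {
  std::map<int, PredicatedData> value_map;
  std::map<int, std::vector<Op*>> value_users;  // Value id -> consumer ops.
  std::deque<Op*> worklist;
  for (Op& op : ops) {
    for (int in : op.inputs) value_users[in].push_back(&op);
    if (op.inputs.empty()) worklist.push_back(&op);  // E.g., constants.
  }
  while (!worklist.empty()) {
    Op* op = worklist.front();
    worklist.pop_front();
    std::vector<double> args;
    bool ready = true;
    for (int in : op->inputs) {
      auto it = value_map.find(in);
      if (it == value_map.end() || !it->second.predicate) { ready = false; break; }
      args.push_back(it->second.value);
    }
    if (!ready) continue;  // Not all operands valid yet; a producer re-enqueues us.
    value_map[op->output] = {op->compute(args), true};
    for (Op* user : value_users[op->output]) worklist.push_back(user);
  }
  return value_map;
}
```

Constants seed the worklist, and each successful execution enqueues the users of the value it produced, which is the propagation step in point 3 above.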
```
FAIL: Neura Dialect Tests :: neura/interpreter/lower_and_interpret.mlir (30 of 31)
FAIL: Neura Dialect Tests :: neura/interpreter/lower_and_interpret_subf.mlir (31 of 31)
********************
Unresolved Tests (2):
  Neura Dialect Tests :: neura/interpreter/Output/lower_and_interpret.mlir.tmp-lowered-to-llvm.mlir
  Neura Dialect Tests :: neura/interpreter/Output/lower_and_interpret_subf.mlir.tmp-lowered-to-llvm.mlir
********************
Failed Tests (2):
  Neura Dialect Tests :: neura/interpreter/lower_and_interpret.mlir
  Neura Dialect Tests :: neura/interpreter/lower_and_interpret_subf.mlir

Testing Time: 0.23s

Total Discovered Tests: 31
  Passed    : 27 (87.10%)
  Unresolved:  2 (6.45%)
  Failed    :  2 (6.45%)

1 warning(s) in tests
```

There are still two test cases that fail, and they weren't written by me.

Contributor

tancheng commented Aug 6, 2025

Thanks @itemkelvin for the prototyping~!

  • Plz provide a brief explanation of your design in this PR's description. I am most interested in how dataflow is handled differently from ctrlflow, what additional stuff is needed, and whether it is possible to merge some common logic in the code.
  • Plz add a comment (descriptive style, third-person verb, ending with a period, i.e., // Explains sth as comment.) above each newly introduced func.
  • Plz use snake_case for all the variables.

I appreciate the effort~!

@tancheng tancheng added the new feature (New feature or request) label Aug 6, 2025
@itemkelvin
Collaborator Author

> Thanks @itemkelvin for the prototyping~!
>
> • Plz provide brief explanation about your design in this PR's description. I am most interested in how dataflow is handled different from ctrlflow, what additional stuff is needed, and is it possible to merge some common logic in the code.
> • Plz add comment (descriptive style, third person verb, and end with period, i.e., // Explains sth as comment.) above each newly introduced func.
> • Plz use snake_case for all the variables.
>
> I appreciate the effort~!

The dataflow execution process (from the provided code context) is as follows:

  1. Set up a dependency graph (value_users) tracking which operations depend on each value.
  2. Initialize a worklist. Operations with no dependencies (e.g., constants) are added to the worklist.
  3. Execute operations from the worklist:
  • Fetch an operation and validate that all its input operands in value_map are valid (via the predicate).
  • Execute the operation (e.g., neura.or, neura.sel) using operand values from value_map.
  • Store the result (with a combined validity predicate) back into value_map.
  • Add dependent operations (from value_users) to the worklist for subsequent processing.
  4. The loop ends when the worklist is empty.

Contributor

tancheng commented Aug 7, 2025

Thanks @itemkelvin for the summarization; I put your summary into the PR's description.

> Initialize a worklist. Operations with no dependencies (e.g., constants) are added to the worklist.

This sounds like a topological sort first, then putting the sorted stuff into a map. So can we reuse:

```cpp
std::vector<Operation *>
mlir::neura::getTopologicallySortedOps(Operation *func_op) {
  std::vector<Operation *> sorted_ops;
  llvm::DenseMap<Operation *, int> pending_deps;
  std::deque<Operation *> ready_queue;

  // Collects recurrence cycle ops.
  auto recurrence_cycles = collectRecurrenceCycles(func_op);
  llvm::DenseSet<Operation *> recurrence_ops;
  for (const auto &cycle : recurrence_cycles)
    for (Operation *op : cycle.operations)
      recurrence_ops.insert(op);

  // Counts unresolved dependencies for each op.
  func_op->walk([&](Operation *op) {
    if (op == func_op) {
      return;
    }
    int dep_count = 0;
    for (Value operand : op->getOperands()) {
      if (operand.getDefiningOp()) {
        ++dep_count;
      }
    }
    pending_deps[op] = dep_count;
    if (dep_count == 0) {
      // TODO: Prioritize recurrence ops. But cause compiled II regression.
      // https://github.com/coredac/dataflow/issues/59.
      if (recurrence_ops.contains(op)) {
        // ready_queue.push_front(op);
        ready_queue.push_back(op);
      } else {
        ready_queue.push_back(op);
      }
    }
  });

  // BFS-style topological sort with recurrence priority.
  while (!ready_queue.empty()) {
    Operation *op = ready_queue.front();
    ready_queue.pop_front();
    sorted_ops.push_back(op);
    for (Value result : op->getResults()) {
      for (Operation *user : result.getUsers()) {
        if (--pending_deps[user] == 0) {
          // TODO: Prioritize recurrence ops. But cause compiled II regression.
          // https://github.com/coredac/dataflow/issues/59.
          if (recurrence_ops.contains(user)) {
            // ready_queue.push_front(user);
            ready_queue.push_back(user);
          } else {
            ready_queue.push_back(user);
          }
        }
      }
    }
  }
  return sorted_ops;
}
```
i.e., create a util/op_util.cc file, move that getTopologicallySortedOps() into that file, and use it in your code.
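For illustration, the same BFS-style (Kahn) topological sort can be sketched without the MLIR dependencies. The topo_sort function and its int-based graph encoding here are hypothetical simplifications, not part of the PR.

```cpp
#include <deque>
#include <map>
#include <vector>

// Simplified, MLIR-free sketch of the BFS-style (Kahn) topological sort above:
// ops are ints 0..num_ops-1, and users[a] lists the ops that consume a's result.
std::vector<int> topo_sort(int num_ops,
                           const std::map<int, std::vector<int>>& users) {
  std::vector<int> sorted_ops;
  std::vector<int> pending_deps(num_ops, 0);  // Unresolved dependency counts.
  std::deque<int> ready_queue;
  // Counts unresolved dependencies for each op.
  for (const auto& [op, consumers] : users)
    for (int user : consumers) ++pending_deps[user];
  // Ops with no dependencies seed the ready queue.
  for (int op = 0; op < num_ops; ++op)
    if (pending_deps[op] == 0) ready_queue.push_back(op);
  // Pops ready ops and releases their users as dependencies resolve.
  while (!ready_queue.empty()) {
    int op = ready_queue.front();
    ready_queue.pop_front();
    sorted_ops.push_back(op);
    auto it = users.find(op);
    if (it == users.end()) continue;
    for (int user : it->second)
      if (--pending_deps[user] == 0) ready_queue.push_back(user);
  }
  return sorted_ops;
}
```

The recurrence-cycle prioritization in the real getTopologicallySortedOps() is omitted here; this sketch only shows the core ready-queue mechanics being suggested for reuse.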

value_map & value_users

These two are a little bit confusing. I would suggest renaming value_map to value_to_predicated_data_map, and value_users to value_to_users_map; how does this sound? We would then know what they are supposed to serve/do from the naming.

> The loop ends when the worklist is empty.

Can you explain when the worklist would be empty? From your description, "Add dependent operations (from value_users) to the worklist", I didn't see when we skip adding dependent users into the worklist, so it sounds like it would never end.

@itemkelvin
Collaborator Author

> Can you explain when the worklist would be empty? From your description, "Add dependent operations (from value_users) to the worklist", I didn't see when we skip adding dependent users into the worklist, so it sounds like it would never end.

In the executeOperation function, the interpreter determines whether an operation's result (including both the value and the predicate) has changed. Only when the result is actually updated (is_updated = 1) does it propagate to downstream users by adding them to the worklist.

For example, when executing ctrl_mov, if the input predicate is true and the target value is updated, the interpreter logs:

```
[neura-interpreter]  Executing neura.ctrl_mov(dataflow):
[neura-interpreter]  ├─ Source: 1.000000e+00 | 1
[neura-interpreter]  ├─ Target (after): 1.000000e+00 | 1 | is_updated=1
[neura-interpreter]  └─ Execution succeeded
[neura-interpreter]  Operation updated, propagating to users...
[neura-interpreter]  Added user to next work_list: %14 = "neura.phi"(%13, %1) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
[neura-interpreter]  Added user to next work_list: neura.ctrl_mov %18 -> %13 : !neura.data<i64, i1> !neura.data<i64, i1>
```

Otherwise, if there is no meaningful change, dependent operations are not added to the worklist:

```
[neura-interpreter]  Executing neura.ctrl_mov(dataflow):
[neura-interpreter]  ├─ Skip update: Source predicate invalid (pred=0)
[neura-interpreter]  ├─ Source: 1.000000e+01 | 0
[neura-interpreter]  ├─ Target (after): 1.000000e+01 | 1 | is_updated=0
[neura-interpreter]  └─ Execution succeeded
[neura-interpreter]  No update for ctrl_mov target: %5 = neura.reserve : !neura.data<i64, i1>
```

This selective propagation mechanism ensures that only meaningful changes trigger re-execution, which prevents unnecessary work and guarantees that the worklist eventually empties, allowing interpretation to terminate.
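The is_updated gating described above can be sketched as follows. Note that store_result and this PredicatedData struct are hypothetical stand-ins for the interpreter's write-back logic, shown only to make the termination argument concrete.

```cpp
#include <map>

// Stand-in for the interpreter's predicated values (value + validity predicate).
struct PredicatedData {
  double value = 0.0;
  bool predicate = false;
  bool operator==(const PredicatedData& other) const {
    return value == other.value && predicate == other.predicate;
  }
};

// Stores `result` for `value_id`; returns true (is_updated) only when the
// stored data actually changed, so the caller knows whether to enqueue users.
bool store_result(std::map<int, PredicatedData>& value_map, int value_id,
                  const PredicatedData& result) {
  auto it = value_map.find(value_id);
  if (it != value_map.end() && it->second == result) {
    return false;  // No meaningful change: users are not re-enqueued.
  }
  value_map[value_id] = result;
  return true;  // Changed: caller propagates to downstream users.
}
```

Because an unchanged result returns false and enqueues nothing, re-execution stops once values reach a fixed point, which is why the worklist eventually drains.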

Contributor

@tancheng tancheng left a comment

Thanks, LGTM :-)

Collaborator Author

itemkelvin commented Aug 21, 2025 via email

@tancheng
Contributor

> This queue only handles data-flow execution. Jeromer
>
> ------------------ Original message ------------------
> From: "Cheng"
> Sent: Friday, August 22, 2025, 00:00
> Subject: Re: [coredac/dataflow] add interpreter dataflow mode (PR #105)
>
> @tancheng commented on this pull request, in tools/neura-interpreter/neura-interpreter.cpp (on the block that adds users of affected values to next_pending_operation_queue): "We are also leveraging this queue for ctrl-flow execution, right? I am wondering how it could correctly handle if/else or certain ctrl flow, i.e., an operation/value has multiple users: would both if/else users be inserted, or can we correctly insert only the chosen path? Or will that scenario never occur, as ctrl-flow execution would have a br to make the user op be identified correctly?"

Thanks, plz reply to the pending comments in this PR, so that I can "resolve" them and merge this PR.

@itemkelvin
Collaborator Author

> Thanks, plz reply on the pending comments in this PR, so then I can "resolve" them, and merge this PR.

All pending comments have been replied to.

@tancheng tancheng merged commit 7502a45 into main Aug 21, 2025
1 check passed
ShangkunLi pushed a commit that referenced this pull request Mar 12, 2026

Labels

new feature (New feature or request)
