Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 5 additions & 7 deletions src/CGRANode.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -377,13 +377,11 @@ bool CGRANode::canOccupy(DFGNode* t_opt, int t_cycle, int t_II) {
}

bool CGRANode::isOccupied(int t_cycle, int t_II) {
for (int cycle=t_cycle; cycle<m_cycleBoundary; cycle+=t_II) {
for (pair<DFGNode*, int> p: *(m_dfgNodesWithOccupyStatus[cycle])) {
// If DVFS is supported, the entire tile is occupied before the current multi-cycle operation
// completes. Otherwise, the next operation can start before the current one completes.
if (p.second == START_PIPE_OCCUPY or p.second == SINGLE_OCCUPY or m_supportDVFS) {
return true;
}
for (pair<DFGNode*, int> p: *(m_dfgNodesWithOccupyStatus[t_II+(t_cycle)%t_II])){
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this is correct? Removing this would cause only checking one specific cycle. However, we need to check cycle, cycle + II, cycle + 2 * II, cycle + 3 * II, etc. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tancheng, thanks for the review!

You are completely right that modulo scheduling theoretically requires verifying all equivalent cycles.

I double-checked the implementation of setDFGNode and confirmed that m_dfgNodesWithOccupyStatus is only modified in that function. Crucially, the population logic there is strictly periodic:

// In setDFGNode:
for (int cycle = t_cycle % interval; cycle < m_cycleBoundary; cycle += interval) {
    // This loop ensures the occupancy status is identical for ALL modulo cycles.
    m_dfgNodesWithOccupyStatus[cycle]->push_back(...);
}

Since the data is already populated identically for every cycle + k*II, checking a single valid cycle (like t_II + t_cycle % t_II) is mathematically equivalent to checking the entire loop, but much faster.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean there is no need to check cycle + k*II? Could you go through the other kernels' mapping results to ensure the correctness?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then plz leave a comment there, mentioning materializing DFG node mapping across all cycles with II interval has already been done during setDFGNode().

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeowMJ WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeowMJ I have performed a regression test on 5 kernels (fir, conv, nonlinear, multicycle, dvfs) to verify the correctness.

Kernel Baseline II Optimized II Result Status Routing
fir 6 6 Pass Identical
conv 4 4 Pass Identical
nonlinear 2 2 Pass Identical
multicycle 4 4 Pass Identical
dvfs 4 4 Pass Identical

The optimization is strictly bit-exact. For all tested kernels, not only did we achieve the same Initiation Interval (II), but the final Placement and Routing layouts were also identical to the baseline.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tancheng, I have checked the logic of setDFGNode and m_dfgNodesWithOccupyStatus. This change is correct and helpful. We can even update all functions related to setDFGNode and m_dfgNodesWithOccupyStatus to a simpler implementation.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can even update all functions related to setDFGNode and m_dfgNodesWithOccupyStatus to a simpler implementation.

@MeowMJ can you please elaborate on this, and let @shiyunyao try your idea?

// If DVFS is supported, the entire tile is occupied before the current multi-cycle operation
// completes. Otherwise, the next operation can start before the current one completes.
if (p.second == START_PIPE_OCCUPY or p.second == SINGLE_OCCUPY or m_supportDVFS) {
return true;
}
}
return false;
Expand Down