Fix canReachLocInTime and Customize backtrack-config to Reduce II #195

guosran · 2025-11-25T13:01:28Z

Summary

Reduced compiled II for the nested_loop mapping from 17 to 11.

Key change

Fixed the reachability pre-check so it accounts for register-based waiting. This aligns canReachLocInTime with the actual routing logic used by tryRouteDataMove.
Changed backtrack-config to customized (5,3).
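
To illustrate the idea behind the first change, here is a minimal, self-contained sketch of a reachability check that allows waiting. It is not the code in mapping_util.cpp: `Grid`, `linkFree`, and `canReachInTime` are hypothetical names, tiles are plain integer ids, and link occupancy is modeled as a set of (from, to, time) slots. The search runs a BFS over (tile, time) states, where each step either crosses a free link or holds the value in a local register for one cycle.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the architecture model (an assumption, not the
// project's real data structure).
struct Grid {
  std::vector<std::vector<int>> links;        // links[t]: tiles one hop from t
  std::set<std::tuple<int, int, int>> busy;   // occupied (from, to, time) slots

  bool linkFree(int from, int to, int time) const {
    return busy.count(std::make_tuple(from, to, time)) == 0;
  }
};

// Returns true if a value leaving `src` at `start` can be on `dst` no later
// than `deadline`. With allow_wait == false only link hops are explored, so a
// link that is busy right now blocks the path even if it is free one cycle
// later.
bool canReachInTime(const Grid &g, int src, int start, int dst, int deadline,
                    bool allow_wait) {
  assert(src >= 0 && src < static_cast<int>(g.links.size()));
  if (start > deadline) return false;

  std::queue<std::pair<int, int>> frontier;   // (tile, time) states
  std::set<std::pair<int, int>> seen;
  frontier.push(std::make_pair(src, start));
  seen.insert(std::make_pair(src, start));

  while (!frontier.empty()) {
    const int tile = frontier.front().first;
    const int time = frontier.front().second;
    frontier.pop();

    if (tile == dst) return true;             // arrived within the deadline
    if (time >= deadline) continue;           // no time left to move further

    // Option 1: cross a link that is free in the current time step.
    for (int next : g.links[tile]) {
      if (g.linkFree(tile, next, time) &&
          seen.insert(std::make_pair(next, time + 1)).second) {
        frontier.push(std::make_pair(next, time + 1));
      }
    }
    // Option 2: wait one cycle in a register on the same tile.
    if (allow_wait && seen.insert(std::make_pair(tile, time + 1)).second) {
      frontier.push(std::make_pair(tile, time + 1));
    }
  }
  return false;
}
```

On a three-tile line 0-1-2 with link 0→1 busy at time 0, the link-only search reports the destination as unreachable, while the search with waiting finds the path that waits one cycle and then takes two hops. A pre-check without the waiting move can therefore reject placements that the router could actually realize.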

Results:

nested_loop mapping: with the reachability fix and backtrack-config=customized (5,3), compiled_ii drops from 17 to 11 (rec_mii=9, res_mii=6).
(5,3) is a good compromise. Larger backtrack settings increase compile time without further lowering II on this benchmark.
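
For context on these numbers: in modulo scheduling, the minimum II is conventionally the larger of the recurrence bound and the resource bound. Assuming this mapper follows that convention, the small sketch below (with hypothetical helper names `minimumII` and `iiGap`) shows how far the compiled II sits above that lower bound.

```cpp
#include <algorithm>

// Conventional lower bound on the initiation interval (assumed to apply to
// this mapper): MII = max(rec_mii, res_mii).
int minimumII(int rec_mii, int res_mii) {
  return std::max(rec_mii, res_mii);
}

// How many cycles the achieved II is above the lower bound.
int iiGap(int compiled_ii, int rec_mii, int res_mii) {
  return compiled_ii - minimumII(rec_mii, res_mii);
}
```

With rec_mii=9 and res_mii=6 the bound is 9, so a compiled II of 17 leaves a gap of 8 cycles and a compiled II of 11 leaves a gap of 2.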

ShangkunLi · 2025-11-26T00:48:09Z

Modifying this logic is supposed to improve all the other mapping results, but why did you only change the mapping result of a nested loop? This may not pass the llvm-lit test.

guosran · 2025-11-26T02:59:36Z

Modifying this logic is supposed to improve all the other mapping results, but why did you only change the mapping result of a nested loop? This may not pass the llvm-lit test.

Done fixing.

guosran · 2025-11-27T04:22:20Z

test/controflow_fuse/perfect_nested/perfect_nested.mlir

 // CTRL2DATA-NEXT:   }

-// MAPPING:      func.func @_Z10bert_node1PA1_A1_A1_A1_A128_bPA1_A128_S1_(%arg0: memref<?x1x1x1x1x128xi8>, %arg1: memref<?x1x128x1x1x128xi8>) attributes {accelerator = "neura", dataflow_mode = "predicate", llvm.linkage = #llvm.linkage<external>, mapping_info = {compiled_ii = 10 : i32, mapping_mode = "spatial-temporal", mapping_strategy = "heuristic", rec_mii = 8 : i32, res_mii = 2 : i32, x_tiles = 4 : i32, y_tiles = 4 : i32}} {
+// MAPPING:      func.func @_Z10bert_node1PA1_A1_A1_A1_A128_bPA1_A128_S1_(%arg0: memref<?x1x1x1x1x128xi8>, %arg1: memref<?x1x128x1x1x128xi8>) attributes {accelerator = "neura", dataflow_mode = "predicate", llvm.linkage = #llvm.linkage<external>, mapping_info = {compiled_ii = 11 : i32, mapping_mode = "spatial-temporal", mapping_strategy = "heuristic", rec_mii = 8 : i32, res_mii = 2 : i32, x_tiles = 4 : i32, y_tiles = 4 : i32}} {


The compiled II is abnormally larger here; I will check the reason.

tancheng · 2025-11-27T14:14:11Z

lib/NeuraDialect/Mapping/mapping_util.cpp


-      // Makes sure the link is not occupied.
-      if (!mapping_state.isAvailableAcrossTime(current_loc_out_link)) {
+      // Check if link is available at current time step.


// Check -> // Checks

tancheng · 2025-12-16T16:47:39Z

Any update on this PR?

Copilot

Pull request overview

This PR improves the spatial-temporal mapping by fixing a reachability pre-check bug and adjusting backtracking parameters. The core fix ensures canReachLocInTime properly accounts for register-based waiting, aligning it with the actual routing logic. The backtrack-config is changed from "simple" to "customized" (5,3 by default).

Fixed canReachLocInTime to use BFS with register-based waiting support
Updated backtrack-config from "simple" to "customized" in test configurations
Updated test expectations to reflect improved compiled II values across multiple benchmarks

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
lib/NeuraDialect/Mapping/mapping_util.cpp	Refactored `canReachLocInTime` BFS to consider both link traversal and register-based waiting, matching the logic in `tryRouteDataMove`
test/c2llvm2mlir/nested_loop/test.mlir	Changed backtrack-config to customized; updated expected compiled_ii from 17 to 11
test/controflow_fuse/perfect_nested/perfect_nested.mlir	Changed backtrack-config to customized; updated expected compiled_ii from 10 to 11
test/mapping_quality/branch_for.mlir	Updated test comments and YAML/ASM checks for structural stability; updated expected compiled_ii from 5 to 4
test/neura/fusion/test.mlir	Updated expected compiled_ii from 14 to 13
test/neura/for_loop/relu_test.mlir	Updated expected mapping locations to reflect improved routing paths
test/e2e/bicg/bicg_kernel.mlir	Updated expected register assignments and instruction timestamps in YAML and ASM checks
.gitignore	Removed trailing "# misc" comment line

Comments suppressed due to low confidence (1)

test/controflow_fuse/perfect_nested/perfect_nested.mlir:197

The compiled_ii increased from 10 to 11 in this test, which appears to be a regression rather than an improvement. The PR description mentions that the nested_loop mapping improved from 17 to 11, but this perfect_nested test shows the opposite direction. Please verify if this regression is expected or if there's an issue with the changes. If this is an expected tradeoff, it should be documented in the PR description.

// MAPPING:      func.func @_Z10bert_node1PA1_A1_A1_A1_A128_bPA1_A128_S1_(%arg0: memref<?x1x1x1x1x128xi8>, %arg1: memref<?x1x128x1x1x128xi8>) attributes {accelerator = "neura", dataflow_mode = "predicate", llvm.linkage = #llvm.linkage<external>, mapping_info = {compiled_ii = 10 : i32, mapping_mode = "spatial-temporal", mapping_strategy = "heuristic", rec_mii = 8 : i32, res_mii = 2 : i32, x_tiles = 4 : i32, y_tiles = 4 : i32}} {

guosran · 2025-12-18T04:26:07Z

Any update on this PR?

Just updated the calculateAward function to resolve the unexpected compiled_ii increment.

guosran · 2025-12-18T04:31:48Z

Identified some minor problems to be fixed.

tancheng · 2025-12-18T04:35:50Z

lib/NeuraDialect/Mapping/mapping_util.cpp

    }
-    int award = 2 * mapping_state.getII();
+
+    // === Tile-based award (independent of time) ===


Can you summarize what you changed and the corresponding effect? No need for a concrete example, just a description.

Redesign the award calculation algorithm to fix potential negative award
values and improve mapping quality.

Changes:

Split award into tile-based and time-based components

Add proximity bonus to producers (always non-negative)

Add proximity bonus to backward users for better recurrence routing

Grant critical ops additional II bonus and routing flexibility bonus

Use time_bonus = latest_end_time_step - t for earlier scheduling

I am now trying different sets of parameters for the award formula and will update later.
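
The split described in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the PR's calculateAward: the `Tile` struct, the helper names, the Manhattan-distance proximity measure, the `max_distance` clamp, and the unit weights are all hypothetical. Only the formula `time_bonus = latest_end_time_step - t` and the extra II bonus for critical ops come from the description.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

struct Tile {
  int x;
  int y;
};

int manhattan(Tile a, Tile b) {
  return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}

// Tile-based part (independent of time): the closer the candidate tile is to
// each producer, the larger the bonus. Clamping at zero keeps every term
// non-negative, so this part can never push the award below zero.
int tileAward(Tile candidate, const std::vector<Tile> &producers,
              int max_distance) {
  int award = 0;
  for (const Tile &producer : producers) {
    award += std::max(0, max_distance - manhattan(candidate, producer));
  }
  return award;
}

// Time-based part: earlier time steps inside the window score higher.
int timeAward(int t, int latest_end_time_step) {
  return latest_end_time_step - t;
}

// Total award; critical ops get an additional bonus equal to the II
// (the weight of 1 on each term is an assumption).
int totalAward(Tile candidate, const std::vector<Tile> &producers,
               int max_distance, int t, int latest_end_time_step, int ii,
               bool is_critical) {
  int award = tileAward(candidate, producers, max_distance) +
              timeAward(t, latest_end_time_step);
  if (is_critical) award += ii;
  return award;
}
```

As long as candidates are only evaluated for time steps `t` no later than `latest_end_time_step`, both components stay non-negative, which matches the stated goal of avoiding negative award values.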

How is latest_end_time_step determined?

I am working with different sets of parameters for the award formula

Can we do this in another PR? Or it has to be done here?

You may refer to the following code:

    int latest_end_time_step = earliest_start_time_step + mapping_state.getII();
    ...
    for (Operation *user : backward_users) {
      ...
      latest_end_time_step = std::min(latest_end_time_step,
                                      backward_user_loc.time_step + mapping_state.getII());
      ...
    }
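
The elided snippet above can be read as a simple bound computation. The standalone sketch below restates it with plain integers, assuming the mapped time steps of the backward users are already collected in a vector; the function name `latestEndTimeStep` and its parameters are hypothetical and do not appear in the PR.

```cpp
#include <algorithm>
#include <vector>

// Start from a window of one II after the earliest start, then tighten it so
// the op finishes no later than one II after each already-mapped backward
// user.
int latestEndTimeStep(int earliest_start_time_step, int ii,
                      const std::vector<int> &backward_user_time_steps) {
  int latest_end_time_step = earliest_start_time_step + ii;
  for (int user_time_step : backward_user_time_steps) {
    latest_end_time_step =
        std::min(latest_end_time_step, user_time_step + ii);
  }
  return latest_end_time_step;
}
```

For example, with an earliest start of 4 and II of 11 the window ends at 15 when there are no backward users, and shrinks to 13 if a backward user is mapped at time step 2.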

I have pushed the new changes to PR #215.

guosran · 2025-12-18T14:53:11Z

Progress: updated the award function to take register allocation into account, and regenerated the test files with further improved II.
I will finish updating the test files tomorrow.

guosran · 2025-12-19T14:26:47Z

Further changes are in PR #215.

guosran · 2025-12-19T14:38:15Z

Since updating all the test files for each PR is very time-consuming, could we merge all the changes together in PR #215? Thanks :)

tancheng reviewed Nov 27, 2025

guosran commented Nov 27, 2025

guosran marked this pull request as draft November 27, 2025 04:29

tancheng approved these changes Nov 27, 2025

guosran marked this pull request as ready for review December 18, 2025 04:15

Copilot AI review requested due to automatic review settings December 18, 2025 04:15

Copilot started reviewing on behalf of guosran December 18, 2025 04:16

Copilot AI reviewed Dec 18, 2025

guosran marked this pull request as draft December 18, 2025 04:31

tancheng approved these changes Dec 18, 2025

Fix: canReachLocInTime accounts for register waits and update backtrack-config to 'customized' (3be9e46)

guosran force-pushed the fix/canReach-and-backtrack branch 3 times, most recently from d7d52e3 to 3be9e46, December 19, 2025 14:18

guosran mentioned this pull request Dec 19, 2025

Fix canReachLocInTime to account for register waits and refactor calculateAward #215

Merged

guosran closed this Dec 19, 2025
