Skip to content

refactor(codegen): Replace repetitive N-ary op registration with table-driven approach#274

Merged
Hzfengsy merged 1 commit intohw-native-sys:mainfrom
lwDavid:refactor
Feb 26, 2026
Merged

refactor(codegen): Replace repetitive N-ary op registration with table-driven approach#274
Hzfengsy merged 1 commit intohw-native-sys:mainfrom
lwDavid:refactor

Conversation

@lwDavid
Copy link
Contributor

@lwDavid lwDavid commented Feb 26, 2026

refactor(codegen): Replace repetitive N-ary op registration with table-driven approach

Merge 5 identical MakeUnary/Binary/Ternary/Quaternary/QuinaryCodegenPTO
helper functions into a single MakeNaryCodegenPTO parameterized by arity.
Replace ~60 repetitive REGISTER_BACKEND_OP calls with a SimpleOpEntry table
and loop-based registration, reducing the file from ~970 to ~560 lines.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @lwDavid, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the codegen backend by streamlining the registration of N-ary operations. The change consolidates multiple specialized helper functions into a single, parameterized function and adopts a table-driven approach for operation registration. This significantly reduces boilerplate code, improving the clarity and maintainability of the codebase, and making it easier to add or modify operations in the future.

Highlights

  • Consolidated N-ary Helper Functions: Merged five specialized helper functions (MakeUnaryCodegenPTO, MakeBinaryCodegenPTO, etc.) into a single, generalized MakeNaryCodegenPTO function that accepts an arity parameter.
  • Table-Driven Operation Registration: Replaced approximately 60 repetitive REGISTER_BACKEND_OP calls with a structured table (kSimpleOps) and a loop-based registration mechanism, significantly reducing boilerplate.
  • Code Size Reduction: Reduced the file size from approximately 970 lines to 560 lines, enhancing code conciseness and maintainability.
Changelog
  • src/backend/910B_PTO/backend_910b_pto_ops.cpp
    • Replaced five specific MakeXCodegenPTO helper functions with a single MakeNaryCodegenPTO function.
    • Introduced a SimpleOpEntry struct and kSimpleOps array to define N-ary operations and their arities.
    • Implemented a loop-based registration mechanism in RegisterSimpleOps to register numerous backend operations from the kSimpleOps table.
    • Removed approximately 60 individual REGISTER_BACKEND_OP calls that were previously used for N-ary operations.
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

Walkthrough

Consolidates per-arity PTO codegen helpers into a single table-driven registration and emission path. Adds MakeNaryCodegenPTO, a SimpleOpEntry/kSimpleOps table, and RegisterSimpleOps to auto-register many simple PTO ops; retains specialized paths for complex ops.

Changes

Cohort / File(s) Summary
PTO Codegen Unification
src/backend/910B_PTO/backend_910b_pto_ops.cpp
Removed per-arity helpers (MakeUnary/Binary/Ternary/Quaternary/QuinaryCodegenPTO). Added MakeNaryCodegenPTO, MakeStoreFPCodegenPTO, SimpleOpEntry, kSimpleOps table, and RegisterSimpleOps with kSimpleOpsRegistered to auto-register many simple ops via a loop. Kept specialized emission for complex ops (block.load/store, gather/scatter, matrix/tile, sorts), adjusted a few explicit registrations and left TODOs for multi-output sorts.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • Hzfengsy

Poem

🐇 I hopped through helpers, one by one,
Tuned them into a table—now unified, done.
Loop-registered ops in tidy array,
A rabbit's small dance — code basks in the day. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main refactoring effort: consolidating repetitive N-ary operation registration helpers into a single table-driven approach.
Description check ✅ Passed The description directly relates to the changeset, clearly explaining the consolidation of five helper functions, replacement of ~60 registration calls with a table-driven approach, and reduction of file size.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent refactoring that significantly improves the codebase. By replacing dozens of repetitive REGISTER_BACKEND_OP macro calls with a single table-driven approach, you've greatly reduced code duplication and made the system more maintainable. The new MakeNaryCodegenPTO function and the kSimpleOps table are clear and effective. I have one minor suggestion to improve the readability of the static registration.

}
}

static bool kSimpleOpsRegistered = (RegisterSimpleOps(), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While using the comma operator to trigger the registration function during static initialization works, it can be a bit obscure. An immediately-invoked lambda expression (IILE) would make the intent clearer and is a more modern C++ idiom for this pattern. Also, marking the variable const is a good practice since it's only written to once.

Suggested change
static bool kSimpleOpsRegistered = (RegisterSimpleOps(), true);
static const bool kSimpleOpsRegistered = [] { RegisterSimpleOps(); return true; }();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backend/910B_PTO/backend_910b_pto_ops.cpp (1)

230-231: ⚠️ Potential issue | 🟡 Minor

Fix missing bracket in error message.

The error message is missing the opening bracket [ before the operation name. All other helper functions use "Operation:[" format consistently.

Proposed fix
-  CHECK(op->args_.size() == 1) << "Operation:" << pto_op_name << "] requires 1 argument, but got "
+  CHECK(op->args_.size() == 1) << "Operation:[" << pto_op_name << "] requires 1 argument, but got "
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp` around lines 230 - 231, The
CHECK message is missing the opening '[' before the operation name; update the
CHECK that validates op->args_.size() to use the same "Operation:[" format as
other helpers by inserting a '[' before pto_op_name in the formatted string (the
CHECK statement referencing op->args_.size() and pto_op_name).
🧹 Nitpick comments (1)
src/backend/910B_PTO/backend_910b_pto_ops.cpp (1)

199-224: Sorting operations emit incomplete codegen.

The MakeSort32CodegenPTO and MakeMrgSortCodegenPTO helpers only emit the operation name without ins()/outs() clauses, as indicated by the commented-out code. The TODO notes this is due to multiple outputs not being addressed yet.

Would you like me to open an issue to track implementing proper multi-output support for sorting operations?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp` around lines 199 - 224,
MakeSort32CodegenPTO and MakeMrgSortCodegenPTO currently emit only the PTO name;
fix each to emit a full pto invocation including ins(...) and outs(...) by using
the codegen helpers instead of leaving commented lines. For MakeSort32CodegenPTO
use codegen.GetExprAsCode(op->args_[0]) to build the ins list and
codegen.GetCurrentResultTarget() (or iterate op->outputs_ if Sort32 has multiple
outputs) to build outs, then call codegen.Emit with the assembled string
"pto_op_name + ins_clause + outs_clause". For MakeMrgSortCodegenPTO use
codegen.GetExprAsCode on both op->args_[0] and op->args_[1] for inputs, and
similarly collect outputs from GetCurrentResultTarget() or op->outputs_ to form
the outs(...) clause, then emit the full PTO. Ensure the CHECKs and comments
reflect expected arg/output counts and handle multiple outputs by iterating
op->outputs_ when present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp`:
- Around line 230-231: The CHECK message is missing the opening '[' before the
operation name; update the CHECK that validates op->args_.size() to use the same
"Operation:[" format as other helpers by inserting a '[' before pto_op_name in
the formatted string (the CHECK statement referencing op->args_.size() and
pto_op_name).

---

Nitpick comments:
In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp`:
- Around line 199-224: MakeSort32CodegenPTO and MakeMrgSortCodegenPTO currently
emit only the PTO name; fix each to emit a full pto invocation including
ins(...) and outs(...) by using the codegen helpers instead of leaving commented
lines. For MakeSort32CodegenPTO use codegen.GetExprAsCode(op->args_[0]) to build
the ins list and codegen.GetCurrentResultTarget() (or iterate op->outputs_ if
Sort32 has multiple outputs) to build outs, then call codegen.Emit with the
assembled string "pto_op_name + ins_clause + outs_clause". For
MakeMrgSortCodegenPTO use codegen.GetExprAsCode on both op->args_[0] and
op->args_[1] for inputs, and similarly collect outputs from
GetCurrentResultTarget() or op->outputs_ to form the outs(...) clause, then emit
the full PTO. Ensure the CHECKs and comments reflect expected arg/output counts
and handle multiple outputs by iterating op->outputs_ when present.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13bb385 and 15c21a4.

📒 Files selected for processing (1)
  • src/backend/910B_PTO/backend_910b_pto_ops.cpp

@lwDavid
Copy link
Contributor Author

lwDavid commented Feb 26, 2026

/gemini review

@lwDavid
Copy link
Contributor Author

lwDavid commented Feb 26, 2026

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent refactoring that significantly improves the maintainability of the operator registration code. Replacing the repetitive helper functions and macro calls with a single generic function and a table-driven approach greatly reduces boilerplate and makes it much easier to add or modify simple N-ary operations in the future. The code is cleaner and more concise. The minor suggestion to use C++14 init-captures is kept as it does not contradict any existing rules and offers a modern C++ idiom.

Comment on lines +463 to +470
std::string pto_op = entry.pto_op_name;
size_t arity = entry.arity;
Backend910B_PTO::Instance()
.RegisterOp(entry.op_name)
.set_pipe(PipeType::V)
.f_codegen([pto_op, arity](const CallPtr& op, codegen::CodegenBase& codegen) {
return MakeNaryCodegenPTO(pto_op, arity, op, codegen);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For improved conciseness and to leverage modern C++ features, you can use C++14's init-captures in the lambda. This avoids creating separate local variables (pto_op, arity) just for the capture, making the intent clearer within the lambda itself.

    Backend910B_PTO::Instance()
        .RegisterOp(entry.op_name)
        .set_pipe(PipeType::V)
        .f_codegen([pto_op = std::string(entry.pto_op_name), arity = entry.arity](
                       const CallPtr& op, codegen::CodegenBase& codegen) {
          return MakeNaryCodegenPTO(pto_op, arity, op, codegen);
        });

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/backend/910B_PTO/backend_910b_pto_ops.cpp (2)

461-477: Add a duplicate-op guard in RegisterSimpleOps.

Given the table size, accidental duplicate op_name entries are easy to introduce. A fail-fast check here prevents silent conflicts.

♻️ Suggested patch
+#include <unordered_set>
...
 static void RegisterSimpleOps() {
+  std::unordered_set<std::string> seen;
   for (const auto& entry : kSimpleOps) {
+    CHECK(seen.insert(entry.op_name).second)
+        << "Duplicate simple op registration: " << entry.op_name;
     std::string pto_op = entry.pto_op_name;
     size_t arity = entry.arity;
     Backend910B_PTO::Instance()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp` around lines 461 - 477,
RegisterSimpleOps currently blindly registers every entry from kSimpleOps which
can silently allow duplicate op_name entries; add a fail-fast duplicate-op guard
by keeping a local std::unordered_set<std::string> seen; before calling
Backend910B_PTO::Instance().RegisterOp(entry.op_name) check if entry.op_name is
already in seen and if so throw or abort with a clear message including the
duplicated op_name, otherwise insert into seen and proceed to call RegisterOp
and set_pipe/f_codegen as before (affects RegisterSimpleOps and initialization
via kSimpleOpsRegistered).

543-555: Please track the sort multi-output TODO with a follow-up issue.

The TODOs at Line 543 and Line 550 indicate a known functional gap for sort ops. Linking an issue in this PR would reduce the risk of it being lost.

If you want, I can draft a concise issue template with acceptance checks for block.sort32 and block.mrgsort.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp` around lines 543 - 555, Add a
follow-up issue for the unhandled multi-output behavior for sorting ops and
update the TODO comments beside the REGISTER_BACKEND_OP entries for
"block.sort32" and "block.mrgsort" to reference that issue; specifically, create
an issue describing required work for MakeSort32CodegenPTO and
MakeMrgSortCodegenPTO to support multiple outputs (include acceptance criteria:
reproduce failing case, implement multi-output handling, unit tests for
block.sort32 and block.mrgsort), then edit the TODO lines that mention sorting
multi-output to append the issue number or URL so future readers can track the
task.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp`:
- Around line 525-529: The code in MakeCmpsCodegenPTO is reading the wrong kwarg
name: replace the call to GetKwarg<int>("mode") with GetKwarg<int>("cmp_type")
so it matches the block.cmps operation attribute and the CCE backend; update any
variable/documentation in MakeCmpsCodegenPTO that assumes the name "mode" to use
"cmp_type" consistently (refer to the MakeCmpsCodegenPTO function and its use in
the REGISTER_BACKEND_OP for Backend910B_PTO).

---

Nitpick comments:
In `@src/backend/910B_PTO/backend_910b_pto_ops.cpp`:
- Around line 461-477: RegisterSimpleOps currently blindly registers every entry
from kSimpleOps which can silently allow duplicate op_name entries; add a
fail-fast duplicate-op guard by keeping a local std::unordered_set<std::string>
seen; before calling Backend910B_PTO::Instance().RegisterOp(entry.op_name) check
if entry.op_name is already in seen and if so throw or abort with a clear
message including the duplicated op_name, otherwise insert into seen and proceed
to call RegisterOp and set_pipe/f_codegen as before (affects RegisterSimpleOps
and initialization via kSimpleOpsRegistered).
- Around line 543-555: Add a follow-up issue for the unhandled multi-output
behavior for sorting ops and update the TODO comments beside the
REGISTER_BACKEND_OP entries for "block.sort32" and "block.mrgsort" to reference
that issue; specifically, create an issue describing required work for
MakeSort32CodegenPTO and MakeMrgSortCodegenPTO to support multiple outputs
(include acceptance criteria: reproduce failing case, implement multi-output
handling, unit tests for block.sort32 and block.mrgsort), then edit the TODO
lines that mention sorting multi-output to append the issue number or URL so
future readers can track the task.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15c21a4 and e21b31e.

📒 Files selected for processing (1)
  • src/backend/910B_PTO/backend_910b_pto_ops.cpp

@lwDavid
Copy link
Contributor Author

lwDavid commented Feb 26, 2026

@Hzfengsy Request review. Proposed a refactoring solution.

@Hzfengsy Hzfengsy merged commit 06929a7 into hw-native-sys:main Feb 26, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants