[WIP] feat(codegen): Add runtime stride-based tensor offset computation and… #197

Open
YunjiQin wants to merge 1 commit into hw-native-sys:main from YunjiQin:codegen_v2

Conversation

@YunjiQin
Contributor

Summary

  • Add Tensor struct pointer tracking in CodeContext
  • Add runtime stride-based tensor offset computation for block.load, block.store, block.l0c_store codegen
  • Implement codegen for tensor.dim operations
  • Support dynamic strides (-1) in GenerateStrideType
  • Update tests to reflect dynamic stride behavior

… codegen for tensor.dim

Implements dynamic stride-based offset computation for CCE codegen,
replacing compile-time row-major stride calculation with runtime
stride access from Tensor struct.

Key changes:
- Add Tensor struct pointer tracking in CodeContext
- Update block.load, block.store, block.l0c_store to use runtime strides
- Add codegen function for tensor.dim op
- Change GenerateStrideType to emit dynamic strides (-1)
- Update tests to reflect dynamic stride behavior
@gemini-code-assist
Contributor

Summary of Changes

Hello @YunjiQin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the CCE code generation framework by introducing robust support for dynamic tensor operations. It enables the system to handle variable tensor shapes and memory layouts more effectively at runtime through stride-based offset computations and the ability to query tensor dimensions. This change improves the flexibility and adaptability of the generated code for various tensor processing scenarios.

Highlights

  • Dynamic Stride-Based Tensor Offset Computation: Implemented runtime stride-based offset calculation for block.load, block.store, and block.l0c_store operations, enabling more flexible tensor memory access.
  • Tensor Struct Pointer Tracking: Introduced a mechanism within CodeContext to track and manage Tensor struct pointers, which encapsulate buffer address, shape, and stride information.
  • Tensor Dimension Query Codegen: Added code generation support for the tensor.dim operation, allowing runtime querying of tensor dimensions.
  • Dynamic Stride Generation: Modified GenerateStrideType to produce dynamic stride placeholders (-1) for runtime resolution, rather than fixed compile-time values.
Changelog
  • include/pypto/codegen/cce/cce_codegen.h
    • Added declarations for GetTensorStruct and RegisterOutputTensorStruct methods.
    • Modified GenerateGlobalTensorTypeDeclaration to accept an optional tensor_struct_ptr.
  • include/pypto/codegen/cce/code_context.h
    • Declared RegisterTensorStruct and GetTensorStruct methods.
    • Added tensor_to_struct_pointer_ map to track tensor struct pointers.
  • include/pypto/codegen/cce/type_converter.h
    • Removed the private CalculateRowMajorStrides helper method.
    • Updated the GenerateStrideType signature.
  • src/backend/910B_CCE/backend_910b_cce_ops.cpp
    • Introduced a static helper function ComputeStrideBasedOffset for dynamic offset calculation.
    • Refactored MakeBlockLoadCodegenCCE, MakeBlockStoreCodegenCCE, and MakeBlockL0CStoreCodegenCCE to use the new stride-based offset computation and Tensor struct access.
    • Added RegisterOutputTensorStruct calls in MakeBlockStoreCodegenCCE and MakeBlockL0CStoreCodegenCCE.
    • Implemented MakeTensorDimCodegenCCE and registered it for the tensor.dim operation.
  • src/codegen/cce/cce_codegen.cpp
    • Updated GeneratePrologue to pass the tensor struct pointer during global tensor declaration.
    • Modified VisitStmt_ for IfStmt and ForStmt to ensure tensor struct pointer mappings are inherited.
    • Implemented GetTensorStruct and RegisterOutputTensorStruct methods.
    • Updated GenerateGlobalTensorTypeDeclaration to initialize GlobalTensor with dynamic strides from the tensor_struct_ptr.
  • src/codegen/cce/code_context.cpp
    • Implemented RegisterTensorStruct and GetTensorStruct methods.
    • Cleared the tensor_to_struct_pointer_ map in the Clear method.
  • src/codegen/cce/type_converter.cpp
    • Removed the implementation of CalculateRowMajorStrides.
    • Modified GenerateStrideType to generate -1 for dynamic strides instead of computed values.
  • tests/ut/codegen/test_type_converter.py
    • Updated test_generate_stride to assert expected dynamic stride patterns (-1) for various tensor shapes.

@gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant feature for runtime stride-based tensor offset computation, which is crucial for handling dynamic tensor shapes in the CCE codegen. The changes are well-organized across the codegen engine, backend operator implementations, and context management. The implementation of tensor.dim and the updates to block.load/store to use the new dynamic offset calculation are solid. My review has identified a few minor areas for improvement, including removing unused variables/parameters, correcting documentation to match implementation, and addressing a potential signed/unsigned comparison issue. All comments align with existing guidelines or are not covered by specific rules, and thus no modifications or removals were necessary.

Comment on lines +45 to +46
static std::string ComputeStrideBasedOffset(codegen::CCECodegen& codegen, const std::string& tensor_var_name,
                                            ir::MakeTuplePtr offsets, const ir::TensorTypePtr& tensor_type) {

medium

The tensor_type parameter is unused within this function. It's good practice to remove unused parameters to keep the function signature clean and improve code clarity.

Suggested change
- static std::string ComputeStrideBasedOffset(codegen::CCECodegen& codegen, const std::string& tensor_var_name,
-                                             ir::MakeTuplePtr offsets, const ir::TensorTypePtr& tensor_type) {
+ static std::string ComputeStrideBasedOffset(codegen::CCECodegen& codegen, const std::string& tensor_var_name,
+                                             ir::MakeTuplePtr offsets) {

static std::string MakeTensorDimCodegenCCE(const ir::CallPtr& op, codegen::CodegenBase& codegen_base) {
    auto& codegen = dynamic_cast<codegen::CCECodegen&>(codegen_base);
    std::string target_var = codegen.GetCurrentResultTarget();
    std::string input_var = codegen.GetExprAsCode(op->args_[0]);

medium

The variable input_var is initialized but never used. It should be removed to avoid confusion and clean up the code.

}
if (tensor_struct_ptr.has_value()) {
    global_instance << ", {}, {";
    for (int i = 0; i < shape_dims.size(); i++) {

medium

The loop counter i is of type int while shape_dims.size() returns size_t. This can lead to signed/unsigned comparison warnings from the compiler and is generally not safe. It's best practice to use size_t for loop counters that iterate over container sizes.

Suggested change
- for (int i = 0; i < shape_dims.size(); i++) {
+ for (size_t i = 0; i < shape_dims.size(); i++) {

Comment on lines +101 to +103
* Returns the Tensor struct pointer name that should be used for accessing
* buffer address and stride information. If no mapping exists, returns the
* input tensor_var_name itself (for compatibility).

medium

The documentation for GetTensorStruct is inconsistent with its implementation. The documentation states that it returns the input tensor_var_name if no mapping exists, but the implementation in code_context.cpp throws an error using CHECK. The documentation should be updated to match the implementation's fail-fast behavior, which is safer.

Suggested change
- * Returns the Tensor struct pointer name that should be used for accessing
- * buffer address and stride information. If no mapping exists, returns the
- * input tensor_var_name itself (for compatibility).
+ * Returns the Tensor struct pointer name that should be used for accessing
+ * buffer address and stride information. Throws an error if no mapping exists.
