docs(qnn_backend): update AOT execution flow documentation#628

Merged
chenghuaWang merged 2 commits into UbiquitousLearning:main from oreomaker:qnn-aot
Feb 3, 2026

Conversation

@oreomaker (Collaborator) commented Feb 3, 2026

Please check Guidelines for Contributing.

Summary by CodeRabbit

  • Documentation
    • Updated AOT execution workflow documentation to reflect changes in export formats and compilation steps for quantized neural network backends.

@coderabbitai bot (Contributor) commented Feb 3, 2026

📝 Walkthrough

Documentation update to the AOT execution guide reflecting changes in the model export and compilation workflow. The three-stage flow description now documents use of the .safetensors and .mllm file formats instead of .mir, with mllm-convertor converting between formats during the Model Quantization and Export step.

Changes

Cohort / File(s): Documentation — docs/qnn_backend/aot_execute.rst
Summary: Updated the three-stage flow descriptions: Model Quantization now exports to .safetensors, with subsequent conversion to .mllm via mllm-convertor; Offline Compilation loads the .mllm file instead of .mir.
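The three-stage flow summarized above could be exercised as a host-side pipeline along these lines. This is a sketch of the stage ordering only: every command and file name is a placeholder assumption (only the mllm-convertor tool name appears in the PR summary), so placeholder files stand in for the real tool invocations, which are shown as comments.

```shell
# Sketch of the three-stage AOT flow using placeholder artifacts.
# The commented-out invocations are assumptions, not the project's real CLI.
set -e

# Stage 1: quantize and export on the host (hypothetical Python script)
#   python quantize_and_export.py --out model.safetensors
touch model.safetensors

# Stage 1b: convert the exported weights to the .mllm container
#   mllm-convertor model.safetensors model.mllm
touch model.mllm

# Stage 2: offline compilation — a C++ driver loads the .mllm file,
# invokes the QNN toolchain, and emits a QNN Context Binary
#   ./qnn_aot_compile model.mllm --out qnn_context.bin
touch qnn_context.bin

# Stage 3 would deploy qnn_context.bin to the device for execution.
ls -1 model.safetensors model.mllm qnn_context.bin
```

The point of the sketch is the hand-off between stages: the Python side only needs to emit standard .safetensors weights, and the format conversion is isolated in a separate convertor step rather than baked into the exporter.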

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 A format found its way through gates,
From mir to safetensor, it awaits,
Convertor works its magic spell,
Now mllm's story we shall tell,
Documentation shines so bright! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Description check ⚠️ Warning — The pull request description contains only the repository's contribution template boilerplate with a link to the guidelines, and provides no substantive information about the changes, rationale, or impact. Resolution: add a meaningful description explaining what was updated in the AOT execution flow documentation, why the changes were necessary, and any relevant context for reviewers.

✅ Passed checks (2 passed)
  • Title check ✅ Passed — The title clearly and specifically summarizes the main change: updating the AOT execution flow documentation for the qnn_backend component.
  • Docstring Coverage ✅ Passed — No functions found in the changed files; skipping the docstring coverage check.


@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docs/qnn_backend/aot_execute.rst`:
- Line 21: Replace the incorrect singular extension ".safetensor" with the
correct plural ".safetensors" in the sentence that describes exporting the
quantized model (the line mentioning "export it to ``.safetensor`` file");
ensure this matches the other occurrences of ".safetensors" in the document for
consistency.


Context from docs/qnn_backend/aot_execute.rst — previous text:

1. **Model Quantization and Export (Python)**: On the host machine, a Python script is used to quantize the pre-trained floating-point model and export it to the MLLM IR (``.mir``) format.
2. **Offline Compilation (C++)**: On the host machine, a C++ compiler program loads the ``.mir`` file, invokes the QNN toolchain for model compilation, graph optimization, and quantization parameter adjustment, and finally generates a QNN Context Binary.

Updated text under review:

1. **Model Quantization and Export (Python)**: On the host machine, a Python script is used to quantize the pre-trained floating-point model and export it to ``.safetensor`` file. The ``.safetensor`` is then converted to ``.mllm`` file using mllm-convertor.

⚠️ Potential issue | 🟡 Minor

Fix file extension inconsistency.

The file extension should be .safetensors (plural) to match the standard format name and maintain consistency with the rest of the document (lines 33, 42, 44, 49 all use .safetensors).

📝 Proposed fix
-1.  **Model Quantization and Export (Python)**: On the host machine, a Python script is used to quantize the pre-trained floating-point model and export it to ``.safetensor`` file. The ``.safetensor`` is then converted to ``.mllm`` file using mllm-convertor.
+1.  **Model Quantization and Export (Python)**: On the host machine, a Python script is used to quantize the pre-trained floating-point model and export it to ``.safetensors`` file. The ``.safetensors`` is then converted to ``.mllm`` file using mllm-convertor.
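The extension inconsistency flagged in the review is also easy to guard against in export tooling. A minimal sketch — the helper name is hypothetical and not part of this project:

```python
from pathlib import Path


def normalize_safetensors_suffix(path: str) -> str:
    """Map the non-standard '.safetensor' suffix to the standard '.safetensors'.

    Leaves paths that already use '.safetensors' (or any other suffix) untouched.
    """
    p = Path(path)
    if p.suffix == ".safetensor":
        p = p.with_suffix(".safetensors")
    return str(p)


print(normalize_safetensors_suffix("model.safetensor"))   # model.safetensors
print(normalize_safetensors_suffix("model.safetensors"))  # model.safetensors
```

A check like this in the export script would have made the singular/plural slip a hard error (or silent fix) instead of a documentation-review catch.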

@chenghuaWang (Collaborator) left a comment


LGTM

@chenghuaWang chenghuaWang merged commit 2a5a950 into UbiquitousLearning:main Feb 3, 2026
2 checks passed
@oreomaker oreomaker deleted the qnn-aot branch February 5, 2026 14:36
