Add Run Android LlamaDemo with QNN backend #16011
Hi @luffy-yu! Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.
If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
Pull request overview
This PR adds comprehensive documentation for building and running the Android LlamaDemo application with Qualcomm's QNN backend support. Previously, the Android demo only supported XNNPACK, and this guide fills that gap by providing step-by-step instructions verified on Samsung S23 hardware.
Key changes:
- Added complete build and deployment guide for QNN-enabled Android LlamaDemo
- Included verification steps to confirm correct AAR and APK configuration
- Documented common issues and their solutions for QNN Android deployment
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
Copilot
AI
Dec 1, 2025
The external link points to 'meta-pytorch/executorch-examples' but the PR is in the 'pytorch/executorch' repository. Verify this is the correct repository reference for the LlamaDemo, as it may confuse users if the link is incorrect or if the demo has moved.
### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
### Run [Android LlamaDemo](https://github.com/pytorch/executorch/tree/main/examples/llm/android/LlamaDemo) with QNN backend
> **Note:** If the LlamaDemo is not present in the `pytorch/executorch` repository, you can find it in the [meta-pytorch/executorch-examples](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) repository.
# Build the AAR
cd $EXECUTORCH_ROOT
export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
./scripts/build_android_library.sh
Copilot
AI
Dec 1, 2025
The script invocation lacks context about required environment variables or prerequisites. Consider adding a note about any required environment setup (e.g., Android NDK path, QNN SDK path) before running this script, or reference existing documentation that covers these prerequisites.
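A pre-flight check is one way to make those prerequisites explicit. The sketch below is not from the PR: `require_env` is a hypothetical helper, and the variable names `ANDROID_NDK` and `QNN_SDK_ROOT` mentioned in the usage note are assumptions about what the build script expects, so confirm them against the Qualcomm backend setup docs.

```shell
# Hypothetical pre-flight helper (not part of ExecuTorch): report every
# variable in the argument list that is unset or empty, and fail if any is.
# The function body runs in a subshell so its working variables do not leak.
require_env() (
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
)
```

Possible usage, with the assumed variable names: `require_env EXECUTORCH_ROOT ANDROID_NDK QNN_SDK_ROOT && ./scripts/build_android_library.sh`. Reporting all missing variables at once saves a round trip compared with stopping at the first one.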
adb push model.pte /data/local/tmp/llama
adb push tokenizer.bin /data/local/tmp/llama
Copilot
AI
Dec 1, 2025
The documentation doesn't specify where to obtain 'model.pte' and 'tokenizer.bin'. Add a reference to instructions for generating or downloading these required files, especially since the QNN backend requires specific model compilation steps.
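Independently of where the files come from, the guide could guard the push step so that a missing export is caught on the host rather than on the device. A minimal sketch, assuming the two file names used in the snippet above; `check_artifacts` is a hypothetical helper and does not validate that the `.pte` was compiled for QNN.

```shell
# Hypothetical helper: confirm that model.pte and tokenizer.bin exist and are
# non-empty in the given directory before they are pushed to the device.
check_artifacts() (
  dir=$1
  status=0
  for f in model.pte tokenizer.bin; do
    if [ ! -s "$dir/$f" ]; then
      echo "error: $dir/$f is missing or empty" >&2
      status=1
    fi
  done
  [ "$status" -eq 0 ]
)
```

For example, `check_artifacts . && adb push model.pte /data/local/tmp/llama` only pushes when both files are present in the current directory.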
docs/source/backends-qualcomm.md
Outdated
# Check for debug strings in the AAR
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "YOUR DEBUG INFO"
Copilot
AI
Dec 1, 2025
The placeholder 'YOUR DEBUG INFO' is ambiguous and doesn't provide actionable guidance. Either provide a concrete example of what debug string to search for, or explain what type of debug information users should add to their code for verification purposes.
# Check for debug strings in the AAR
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "YOUR DEBUG INFO"
# Check for your custom debug string in the AAR.
# For example, if you added a debug string like "MyCustomDebugString" to your code,
unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
strings | grep "MyCustomDebugString"
# Replace "MyCustomDebugString" with the actual debug string you added to your code.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
**Solution**:
1. Update `build.gradle.kts` with matching QNN runtime version
Copilot
AI
Dec 1, 2025
The note appears after the 'Before' label but should appear before showing any code examples. Moving it above line 395 would make the documentation flow more clearly and prevent confusion about which versions to use.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
```

Expected size for QNN 2.37.0: ~2,465,440 bytes
Copilot
AI
Dec 1, 2025
[nitpick] The expected file size is overly specific and may vary slightly between builds or minor version updates. Consider changing this to a range (e.g., '~2.4-2.5 MB') to make the verification more resilient to minor variations.
Expected size for QNN 2.37.0: ~2,465,440 bytes
Expected size for QNN 2.37.0: ~2.4–2.5 MB (2,400,000–2,500,000 bytes)
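A range check like the one suggested can also be scripted instead of compared by eye. The sketch below assumes the usual `unzip -l` listing layout, where the first column is the uncompressed length and the last column is the entry name; `entry_size` and `size_in_range` are hypothetical helpers, and the bounds are the ones from the suggestion above.

```shell
# Hypothetical helper: read `unzip -l` output on stdin and print the length
# (first column) of the first entry whose name contains the given substring.
entry_size() {
  awk -v name="$1" 'index($NF, name) { print $1; exit }'
}

# Succeeds when SIZE lies within [MIN, MAX] (all values in bytes).
# usage: size_in_range SIZE MIN MAX
size_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}
```

For example: `size=$(unzip -l app/build/outputs/apk/debug/app-debug.apk | entry_size libQnnHtp.so)` followed by `size_in_range "$size" 2400000 2500000`. Reading the listing from stdin keeps the parsing step testable without an APK.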
**Solution**:
1. Update `build.gradle.kts` with matching QNN runtime version

> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
Copilot
AI
Dec 1, 2025
[nitpick] The note about version numbers being examples appears after mentioning 'version X' and 'version Y' in the previous text but before showing the actual version numbers. This note would be more effective if placed immediately after the code examples (after line 403) where the specific versions are shown, or reference should be made to it in the text above where X and Y are mentioned.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
```

**After**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
```
Copilot
AI
Dec 1, 2025
The numbered list item '1.' is followed by a note and code examples, but there's no item '2.' at the same indentation level until line 405. The note and code examples should be indented or formatted as sub-items of item 1 to maintain proper list structure.
> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
**Before**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
```
**After**:
```kotlin
implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
```
   > **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
   **Before**:
   ```kotlin
   implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
   ```
   **After**:
   ```kotlin
   implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
   ```
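Since the point of this step is that the runtime version in Gradle matches the QNN SDK in use, the pinned version can be checked mechanically. A rough sketch, assuming the dependency is declared on one line in the form shown above; `gradle_qnn_version` and `qnn_version_matches` are hypothetical helpers, and the `app/build.gradle.kts` path in the usage note is an assumption about the demo's layout.

```shell
# Hypothetical helper: read a Gradle build file on stdin and print the version
# pinned for com.qualcomm.qti:qnn-runtime (first match only).
gradle_qnn_version() {
  sed -n 's/.*com\.qualcomm\.qti:qnn-runtime:\([0-9][0-9.]*\)".*/\1/p' | head -n 1
}

# Succeeds only if the version pinned on stdin equals the expected one.
# usage: qnn_version_matches EXPECTED < build-file
qnn_version_matches() {
  [ "$(gradle_qnn_version)" = "$1" ]
}
```

For example, `qnn_version_matches 2.37.0 < app/build.gradle.kts` exits non-zero when the file still pins `2.33.0`. The expected version has to come from your own QNN SDK; the sketch does not try to detect it.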
Motivation
`qnn_llama_runner` approach is provided though.
Summary
Test plan