From 29f233a4f379965c328fade9b868fab767a8bf9b Mon Sep 17 00:00:00 2001
From: Liuchuan Yu
Date: Sun, 30 Nov 2025 21:59:40 -0500
Subject: [PATCH 1/7] Add Run Android LlamaDemo with QNN backend

---
 docs/source/backends-qualcomm.md | 131 +++++++++++++++++++++++++++++++
 1 file changed, 131 insertions(+)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index 6c5397f02be..716e504279b 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -288,6 +288,137 @@ After the above command, pre-processed inputs and outputs are put in `$EXECUTORC
 The command-line arguments are written in [utils.py](https://github.com/pytorch/executorch/blob/main/examples/qualcomm/utils.py#L139).
 The model, inputs, and output location are passed to `qnn_executorch_runner` by `--model_path`, `--input_list_path`, and `--output_folder_path`.
 
+### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
+
+`$DEMO_APP` refers to the root of executorch android demo root, i.e., the directory containing `build.gradle.kts`.
+
+***Step 1***: Rebuild ExecuTorch AAR
+
+```bash
+# Build the AAR
+cd $EXECUTORCH_ROOT
+export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
+./scripts/build_android_library.sh
+```
+
+***Step 2***: Copy AAR to Android Project
+
+```bash
+cp $EXECUTORCH_ROOT/aar-out/executorch.aar \
+ $DEMO_APP/app/libs/executorch.aar
+```
+
+***Step 3***: Build Android APK
+
+```bash
+cd $DEMO_APP
+./gradlew clean assembleDebug -PuseLocalAar=true
+```
+
+***Step 4***: Install on Device
+
+```bash
+adb install -r app/build/outputs/apk/debug/app-debug.apk
+```
+
+***Step 5***: Push model
+
+```bash
+adb shell mkdir -p /data/local/tmp/llama
+adb push model.pte /data/local/tmp/llama
+adb push tokenizer.bin /data/local/tmp/llama
+```
+
+***Step 5***: Run the Llama Demo
+
+- Open the App on Android
+- Select `QUALCOMM` backend
+- Select `model.pte` Model
+- Select `tokenizer.bin` Tokenizer
+- Select Model Type
+- Click LOAD MODEL
+- It should show `Successfully loaded model.`
+
+
+#### Verification Steps
+
+***Step 1***. Verify AAR Contains Your Changes
+
+```bash
+# Check for debug strings in the AAR
+unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
+ strings | grep "YOUR DEBUG INFO"
+```
+
+If found, your changes are in the AAR.
+
+***Step 2***. Verify APK Contains Correct Libraries
+
+```bash
+# Check QNN library version in APK
+cd $DEMO_APP
+unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
+```
+
+Expected size for QNN 2.37.0: ~2,465,440 bytes
+
+***Step 3***. Monitor Logs During Model Loading
+
+```bash
+adb logcat -c
+adb logcat | grep -E "ExecuTorch"
+```
+
+#### Common Issues and Solutions
+
+##### Issue 1: Error 18 (InvalidArgument)
+
+**Cause**: Wrong parameter order in Runner constructor or missing QNN config
+
+**Solution**: Check `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/llama/runner/runner.h` for the correct constructor signature.
+
+##### Issue 2: Error 1 (Internal) with QNN API Version Mismatch
+
+**Symptoms**:
+```
+W [Qnn ExecuTorch]: Qnn API version X.XX.X is mismatched
+E [Qnn ExecuTorch]: Using newer context binary on old SDK
+E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
+```
+
+**Cause**: Model compiled with QNN SDK version X but APK uses QNN runtime version Y
+
+**Solution**:
+1. Update `build.gradle.kts` with matching QNN runtime version
+
+**Before**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+```
+
+**After**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+```
+
+2. Or recompile model with matching QNN SDK version
+
+##### Issue 3: Native Code Changes Not Applied
+
+**Symptoms**: Debug logs don't appear, behavior doesn't change
+
+**Cause**: Gradle using Maven dependency instead of local AAR
+
+**Solution**: Always build with `-PuseLocalAar=true` flag
+
+##### Issue 4: Logs Not Appearing
+
+**Cause**: Wrong logging tag filter
+
+**Solution**: QNN uses "ExecuTorch" tag:
+```bash
+adb logcat | grep "ExecuTorch"
+```
 
 ## Supported model list
 
From 2c39ea101ae2b17a9e58d256a0f23d81fc44f0e6 Mon Sep 17 00:00:00 2001
From: Liuchuan Yu <38197351+luffy-yu@users.noreply.github.com>
Date: Sun, 30 Nov 2025 22:11:42 -0500
Subject: [PATCH 2/7] Fix ***Step 6***: Run the Llama Demo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/source/backends-qualcomm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index 716e504279b..479343defa2 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -329,7 +329,7 @@ adb push model.pte /data/local/tmp/llama
 adb push tokenizer.bin /data/local/tmp/llama
 ```
 
-***Step 5***: Run the Llama Demo
+***Step 6***: Run the Llama Demo
 
 - Open the App on Android
 - Select `QUALCOMM` backend
From 5ba71962143a9e212739eb6bd5230fea1573fa39 Mon Sep 17 00:00:00 2001
From: Liuchuan Yu <38197351+luffy-yu@users.noreply.github.com>
Date: Sun, 30 Nov 2025 22:12:38 -0500
Subject: [PATCH 3/7] Update $DEMO_APP definition

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/source/backends-qualcomm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index 479343defa2..56b240fb094 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -290,7 +290,7 @@ The model, inputs, and output location are passed to `qnn_executorch_runner` by
 
 ### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
 
-`$DEMO_APP` refers to the root of executorch android demo root, i.e., the directory containing `build.gradle.kts`.
+`$DEMO_APP` refers to the root of the executorch android demo, i.e., the directory containing `build.gradle.kts`.
 
 ***Step 1***: Rebuild ExecuTorch AAR
 
From c67fca4c23ece2a32c9a13fe634e3a3e0591f19b Mon Sep 17 00:00:00 2001
From: Liuchuan Yu <38197351+luffy-yu@users.noreply.github.com>
Date: Sun, 30 Nov 2025 22:14:02 -0500
Subject: [PATCH 4/7] Update version numbers example

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/source/backends-qualcomm.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index 56b240fb094..4f95ddc97c1 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -391,6 +391,7 @@ E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
 **Solution**:
 1. Update `build.gradle.kts` with matching QNN runtime version
 
+> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
 **Before**:
 ```kotlin
 implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
From 975c3718b25a28a530fc73b58008f885ab53f58a Mon Sep 17 00:00:00 2001
From: Liuchuan Yu <38197351+luffy-yu@users.noreply.github.com>
Date: Sun, 30 Nov 2025 22:14:39 -0500
Subject: [PATCH 5/7] Update grep "QNN"

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/source/backends-qualcomm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index 4f95ddc97c1..efb6ae74c30 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -347,7 +347,7 @@ adb push tokenizer.bin /data/local/tmp/llama
 ```bash
 # Check for debug strings in the AAR
 unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
- strings | grep "YOUR DEBUG INFO"
+ strings | grep "QNN" # Replace "QNN" with your actual debug string if needed
 ```
 
 If found, your changes are in the AAR.
From 3731048d8086f246d19f627e3223b3efd5611601 Mon Sep 17 00:00:00 2001
From: Liuchuan Yu <38197351+luffy-yu@users.noreply.github.com>
Date: Sun, 30 Nov 2025 22:15:58 -0500
Subject: [PATCH 6/7] Replace X.XX.X with 2.33.0

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/source/backends-qualcomm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index efb6ae74c30..abfce62953c 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -381,7 +381,7 @@ adb logcat | grep -E "ExecuTorch"
 
 **Symptoms**:
 ```
-W [Qnn ExecuTorch]: Qnn API version X.XX.X is mismatched
+W [Qnn ExecuTorch]: Qnn API version 2.33.0 is mismatched
 E [Qnn ExecuTorch]: Using newer context binary on old SDK
 E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
 ```
From 38652953aab9094e5b58fdfca91551080e0009ab Mon Sep 17 00:00:00 2001
From: Liuchuan Yu
Date: Tue, 2 Dec 2025 22:47:09 -0500
Subject: [PATCH 7/7] Format with indentations

---
 docs/source/backends-qualcomm.md | 65 ++++++++++++++++++--------------
 1 file changed, 36 insertions(+), 29 deletions(-)

diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
index abfce62953c..b6d357aac2f 100644
--- a/docs/source/backends-qualcomm.md
+++ b/docs/source/backends-qualcomm.md
@@ -373,53 +373,60 @@ adb logcat | grep -E "ExecuTorch"
 
 ##### Issue 1: Error 18 (InvalidArgument)
 
-**Cause**: Wrong parameter order in Runner constructor or missing QNN config
+- **Cause**: Wrong parameter order in Runner constructor or missing QNN config
 
-**Solution**: Check `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/llama/runner/runner.h` for the correct constructor signature.
+- **Solution**: Check `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/llama/runner/runner.h` for the correct constructor signature.
 
 ##### Issue 2: Error 1 (Internal) with QNN API Version Mismatch
 
-**Symptoms**:
-```
-W [Qnn ExecuTorch]: Qnn API version 2.33.0 is mismatched
-E [Qnn ExecuTorch]: Using newer context binary on old SDK
-E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
-```
+- **Symptoms**:
 
-**Cause**: Model compiled with QNN SDK version X but APK uses QNN runtime version Y
+ ```
+ W [Qnn ExecuTorch]: Qnn API version 2.33.0 is mismatched
+ E [Qnn ExecuTorch]: Using newer context binary on old SDK
+ E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
+ ```
 
-**Solution**:
-1. Update `build.gradle.kts` with matching QNN runtime version
+- **Cause**: Model compiled with QNN SDK version X but APK uses QNN runtime version Y
 
-> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
-**Before**:
-```kotlin
-implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
-```
+- **Solution**:
+ - Update `build.gradle.kts` with matching QNN runtime version
 
-**After**:
-```kotlin
-implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
-```
+ > **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
 
-2. Or recompile model with matching QNN SDK version
+ **Before**:
+ ```kotlin
+ implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+ ```
+
+ **After**:
+ ```kotlin
+ implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+ ```
+
+ - Or recompile model with matching QNN SDK version
 
 ##### Issue 3: Native Code Changes Not Applied
 
-**Symptoms**: Debug logs don't appear, behavior doesn't change
+- **Symptoms**:
+ - Debug logs don't appear
+ - Behavior doesn't change
 
-**Cause**: Gradle using Maven dependency instead of local AAR
+- **Cause**:
+ - Gradle using Maven dependency instead of local AAR
 
-**Solution**: Always build with `-PuseLocalAar=true` flag
+- **Solution**:
+ - Always build with `-PuseLocalAar=true` flag
 
 ##### Issue 4: Logs Not Appearing
 
-**Cause**: Wrong logging tag filter
+- **Cause**: Wrong logging tag filter
 
-**Solution**: QNN uses "ExecuTorch" tag:
-```bash
-adb logcat | grep "ExecuTorch"
-```
+- **Solution**: QNN uses "ExecuTorch" tag:
+
+ ```bash
+ adb logcat | grep "ExecuTorch"
+ ```
 
 ## Supported model list
 
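
Possible follow-up to Issue 2 (not part of this patch series): the runtime/SDK version mismatch could be caught before the APK is installed. The sketch below is only an illustration under stated assumptions. The function names `qnn_runtime_version` and `check_qnn_version` are hypothetical, the dependency is assumed to be declared as `com.qualcomm.qti:qnn-runtime:<version>` exactly as in the Before/After snippets of the patch, and the QNN SDK version used to compile the model has to be supplied by hand.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of ExecuTorch or the LlamaDemo app.

# Print the version of the first qnn-runtime dependency found in a Gradle file.
# Assumes the coordinate format shown in the patch:
#   implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
qnn_runtime_version() {
  sed -n 's/.*com\.qualcomm\.qti:qnn-runtime:\([0-9][0-9.]*\)".*/\1/p' "$1" | head -n 1
}

# Compare the runtime version declared in a Gradle file ($1) with the QNN SDK
# version the model was compiled with ($2).
# Returns 0 on match, 1 on mismatch, 2 if no qnn-runtime dependency is found.
check_qnn_version() {
  local gradle_file="$1" sdk_version="$2" runtime_version
  runtime_version="$(qnn_runtime_version "$gradle_file")"
  if [ -z "$runtime_version" ]; then
    echo "no qnn-runtime dependency found in $gradle_file"
    return 2
  fi
  if [ "$runtime_version" = "$sdk_version" ]; then
    echo "OK: runtime $runtime_version matches SDK $sdk_version"
  else
    echo "MISMATCH: runtime $runtime_version, model compiled with SDK $sdk_version"
    return 1
  fi
}
```

A call would look like `check_qnn_version "$DEMO_APP/app/build.gradle.kts" 2.37.0`, where the file path is an assumption: pass whichever `build.gradle.kts` actually declares the dependency. The comparison is a plain string match, so it would also flag patch-level differences that may in practice be compatible.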