Add Run Android LlamaDemo with QNN backend

luffy-yu · luffy-yu · commit 29f233a4f379 · 2025-11-30T21:59:40.000-05:00
diff --git a/docs/source/backends-qualcomm.md b/docs/source/backends-qualcomm.md
@@ -288,6 +288,137 @@ After the above command, pre-processed inputs and outputs are put in `$EXECUTORC
 The command-line arguments are written in [utils.py](https://github.com/pytorch/executorch/blob/main/examples/qualcomm/utils.py#L139).
 The model, inputs, and output location are passed to `qnn_executorch_runner` by `--model_path`, `--input_list_path`, and `--output_folder_path`.
 
+### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
+
+`$DEMO_APP` refers to the root of executorch android demo root, i.e., the directory containing `build.gradle.kts`.
+
+***Step 1***: Rebuild ExecuTorch AAR
+
+```bash
+# Build the AAR
+cd $EXECUTORCH_ROOT
+export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
+./scripts/build_android_library.sh
+```
+
+***Step 2***: Copy AAR to Android Project
+
+```bash
+cp $EXECUTORCH_ROOT/aar-out/executorch.aar \
+   $DEMO_APP/app/libs/executorch.aar
+```
+
+***Step 3***: Build Android APK
+
+```bash
+cd $DEMO_APP
+./gradlew clean assembleDebug -PuseLocalAar=true
+```
+
+***Step 4***: Install on Device
+
+```bash
+adb install -r app/build/outputs/apk/debug/app-debug.apk
+```
+
+***Step 5***: Push model
+
+```bash
+adb shell mkdir -p /data/local/tmp/llama
+adb push model.pte /data/local/tmp/llama
+adb push tokenizer.bin /data/local/tmp/llama
+```
+
+***Step 5***: Run the Llama Demo
+
+- Open the App on Android
+- Select `QUALCOMM` backend
+- Select `model.pte` Model
+- Select `tokenizer.bin` Tokenizer
+- Select Model Type
+- Click LOAD MODEL
+- It should show `Successfully loaded model.`
+
+
+#### Verification Steps
+
+***Step 1***. Verify AAR Contains Your Changes
+
+```bash
+# Check for debug strings in the AAR
+unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
+  strings | grep "YOUR DEBUG INFO"
+```
+
+If found, your changes are in the AAR.
+
+***Step 2***. Verify APK Contains Correct Libraries
+
+```bash
+# Check QNN library version in APK
+cd $DEMO_APP
+unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
+```
+
+Expected size for QNN 2.37.0: ~2,465,440 bytes
+
+***Step 3***. Monitor Logs During Model Loading
+
+```bash
+adb logcat -c
+adb logcat | grep -E "ExecuTorch"
+```
+
+#### Common Issues and Solutions
+
+##### Issue 1: Error 18 (InvalidArgument)
+
+**Cause**: Wrong parameter order in Runner constructor or missing QNN config
+
+**Solution**: Check `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/llama/runner/runner.h` for the correct constructor signature.
+
+##### Issue 2: Error 1 (Internal) with QNN API Version Mismatch
+
+**Symptoms**:
+```
+W [Qnn ExecuTorch]: Qnn API version X.XX.X is mismatched
+E [Qnn ExecuTorch]: Using newer context binary on old SDK
+E [Qnn ExecuTorch]: Can't create context from binary. Error 5000
+```
+
+**Cause**: Model compiled with QNN SDK version X but APK uses QNN runtime version Y
+
+**Solution**: 
+1. Update `build.gradle.kts` with matching QNN runtime version
+
+**Before**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+```
+
+**After**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+```
+
+2. Or recompile model with matching QNN SDK version
+
+##### Issue 3: Native Code Changes Not Applied
+
+**Symptoms**: Debug logs don't appear, behavior doesn't change
+
+**Cause**: Gradle using Maven dependency instead of local AAR
+
+**Solution**: Always build with `-PuseLocalAar=true` flag
+
+##### Issue 4: Logs Not Appearing
+
+**Cause**: Wrong logging tag filter
+
+**Solution**: QNN uses "ExecuTorch" tag:
+```bash
+adb logcat | grep "ExecuTorch"
+```
 
 ## Supported model list