Add Run Android LlamaDemo with QNN backend #16011

luffy-yu · 2025-12-01T03:07:58Z

Motivation

The current Android LlamaDemo only supports XNNPACK.
A QNN-backend Android demo is missing, although the qnn_llama_runner approach is provided.

Summary

Add guidance for building and running the Android LlamaDemo with the QNN backend.
It has been verified on a Samsung S23 (SoC SM8550).

Test plan

This PR has been tested on LlamaDemo-Executorch-QNN.

The APK file and the pre-built model can be found in its README.

This PR primarily borrows its documentation from QNN_ANDROID_FIX_SUMMARY.

pytorch-bot · 2025-12-01T03:08:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16011

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla · 2025-12-01T03:08:03Z

Hi @luffy-yu!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Copilot

Pull request overview

This PR adds comprehensive documentation for building and running the Android LlamaDemo application with Qualcomm's QNN backend support. Previously, the Android demo only supported XNNPACK, and this guide fills that gap by providing step-by-step instructions verified on Samsung S23 hardware.

Key changes:

Added complete build and deployment guide for QNN-enabled Android LlamaDemo
Included verification steps to confirm correct AAR and APK configuration
Documented common issues and their solutions for QNN Android deployment

docs/source/backends-qualcomm.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

docs/source/backends-qualcomm.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.

Copilot · 2025-12-01T03:13:22Z

docs/source/backends-qualcomm.md

+### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
+


The external link points to 'meta-pytorch/executorch-examples' but the PR is in the 'pytorch/executorch' repository. Verify this is the correct repository reference for the LlamaDemo, as it may confuse users if the link is incorrect or if the demo has moved.

Suggested change

-### Run [Android LlamaDemo](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) with QNN backend
+### Run [Android LlamaDemo](https://github.com/pytorch/executorch/tree/main/examples/llm/android/LlamaDemo) with QNN backend
+> **Note:** If the LlamaDemo is not present in the `pytorch/executorch` repository, you can find it in the [meta-pytorch/executorch-examples](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo) repository.

Copilot · 2025-12-01T03:13:22Z

docs/source/backends-qualcomm.md

+# Build the AAR
+cd $EXECUTORCH_ROOT
+export BUILD_AAR_DIR=$EXECUTORCH_ROOT/aar-out
+./scripts/build_android_library.sh


The script invocation lacks context about required environment variables or prerequisites. Consider adding a note about any required environment setup (e.g., Android NDK path, QNN SDK path) before running this script, or reference existing documentation that covers these prerequisites.
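One way to act on this comment is a small pre-flight check run before the build script. The sketch below is only an illustration: the helper name `check_build_env` is hypothetical, and the variable names `ANDROID_NDK` and `QNN_SDK_ROOT` are assumptions that should be confirmed against `scripts/build_android_library.sh` itself.

```shell
# Hypothetical helper: report every named environment variable that is unset or empty.
check_build_env() {
  cbe_missing=0
  for cbe_name in "$@"; do
    # Indirect lookup via eval keeps this portable to POSIX sh.
    eval "cbe_value=\${$cbe_name:-}"
    if [ -z "$cbe_value" ]; then
      echo "missing environment variable: $cbe_name" >&2
      cbe_missing=1
    fi
  done
  return "$cbe_missing"
}

# Example call; ANDROID_NDK and QNN_SDK_ROOT are assumed names, not confirmed by this PR.
# check_build_env EXECUTORCH_ROOT ANDROID_NDK QNN_SDK_ROOT && ./scripts/build_android_library.sh
```

The function lists all missing variables in one pass instead of stopping at the first, so a single run shows everything that still needs to be exported.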

Copilot · 2025-12-01T03:13:23Z

docs/source/backends-qualcomm.md

+adb push model.pte /data/local/tmp/llama
+adb push tokenizer.bin /data/local/tmp/llama


The documentation doesn't specify where to obtain 'model.pte' and 'tokenizer.bin'. Add a reference to instructions for generating or downloading these required files, especially since the QNN backend requires specific model compilation steps.
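Until the guide states where these files come from, a guard like the following can at least stop the `adb push` step from running against missing artifacts. It is a minimal sketch: the helper name `check_artifacts` is hypothetical, and it says nothing about how `model.pte` or `tokenizer.bin` are produced.

```shell
# Hypothetical helper: verify the two files the demo expects exist and are non-empty.
check_artifacts() {
  ca_dir=${1:-.}
  ca_status=0
  for ca_file in model.pte tokenizer.bin; do
    if [ ! -s "$ca_dir/$ca_file" ]; then
      echo "not found or empty: $ca_dir/$ca_file" >&2
      ca_status=1
    fi
  done
  return "$ca_status"
}

# Example call, reusing the device path from the snippet under review:
# check_artifacts . && adb push model.pte /data/local/tmp/llama && adb push tokenizer.bin /data/local/tmp/llama
```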

Copilot · 2025-12-01T03:13:23Z

docs/source/backends-qualcomm.md

+# Check for debug strings in the AAR
+unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
+  strings | grep "YOUR DEBUG INFO"


The placeholder 'YOUR DEBUG INFO' is ambiguous and doesn't provide actionable guidance. Either provide a concrete example of what debug string to search for, or explain what type of debug information users should add to their code for verification purposes.

Suggested change

-# Check for debug strings in the AAR
-unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
-  strings | grep "YOUR DEBUG INFO"
+# Check for your custom debug string in the AAR.
+# For example, if you added a debug string like "MyCustomDebugString" to your code,
+unzip -p $DEMO_APP/app/libs/executorch.aar jni/arm64-v8a/libexecutorch.so | \
+  strings | grep "MyCustomDebugString"
+# Replace "MyCustomDebugString" with the actual debug string you added to your code.
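The same check can be wrapped so that it yields a pass/fail exit status for scripting. This sketch swaps the `strings | grep` pipeline for a direct fixed-string `grep` on the extracted library; the helper name `has_marker` and the `/tmp` path are hypothetical, and `MyCustomDebugString` remains a placeholder for a string you added yourself.

```shell
# Hypothetical helper: succeed if FILE ($1) contains the literal marker string ($2).
has_marker() {
  # -F treats the marker as a fixed string; -q prints nothing and only sets the exit status.
  LC_ALL=C grep -q -F -- "$2" "$1"
}

# Example call against the library extracted from the AAR:
# unzip -p "$DEMO_APP/app/libs/executorch.aar" jni/arm64-v8a/libexecutorch.so > /tmp/libexecutorch.so
# has_marker /tmp/libexecutorch.so "MyCustomDebugString" && echo "marker present"
```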

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Copilot · 2025-12-01T03:14:36Z

docs/source/backends-qualcomm.md

+
+**Solution**: 
+1. Update `build.gradle.kts` with matching QNN runtime version
+


The note appears after the 'Before' label but should appear before showing any code examples. Moving it above line 395 would make the documentation flow more clearly and prevent confusion about which versions to use.

Suggested change

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Copilot · 2025-12-01T03:15:16Z

docs/source/backends-qualcomm.md

+unzip -l app/build/outputs/apk/debug/app-debug.apk | grep "libQnnHtp.so"
+```
+
+Expected size for QNN 2.37.0: ~2,465,440 bytes


[nitpick] The expected file size is overly specific and may vary slightly between builds or minor version updates. Consider changing this to a range (e.g., '~2.4-2.5 MB') to make the verification more resilient to minor variations.

Suggested change

-Expected size for QNN 2.37.0: ~2,465,440 bytes
+Expected size for QNN 2.37.0: ~2.4–2.5 MB (2,400,000–2,500,000 bytes)
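A range check of this kind can be scripted against the `unzip -l` listing, whose first column is the uncompressed size in bytes. The sketch below is illustrative only: the helper names `entry_size` and `size_in_range` are hypothetical, and the bounds are the ones proposed in this comment for QNN 2.37.0.

```shell
# Hypothetical helper: read `unzip -l` output on stdin and print the size in bytes
# (first column) of the first entry whose name ends with the given suffix.
entry_size() {
  awk -v suffix="$1" '
    { n = length($NF); s = length(suffix) }
    !found && n >= s && substr($NF, n - s + 1) == suffix { print $1; found = 1 }
  '
}

# Hypothetical helper: succeed if MIN <= SIZE <= MAX (all integers, in bytes).
size_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Example call using the range proposed in the review comment:
# size=$(unzip -l app/build/outputs/apk/debug/app-debug.apk | entry_size libQnnHtp.so)
# size_in_range "$size" 2400000 2500000 && echo "libQnnHtp.so size looks plausible"
```

A size inside the range only suggests the expected library version was packaged; it is not proof of an exact version match.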

Copilot · 2025-12-01T03:15:16Z

docs/source/backends-qualcomm.md

+**Solution**: 
+1. Update `build.gradle.kts` with matching QNN runtime version
+
+> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.


[nitpick] The note about version numbers being examples appears after mentioning 'version X' and 'version Y' in the previous text but before showing the actual version numbers. This note would be more effective if placed immediately after the code examples (after line 403) where the specific versions are shown, or reference should be made to it in the text above where X and Y are mentioned.
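Whichever placement the note ends up with, the underlying check (does the `qnn-runtime` dependency match the QNN SDK in use?) can be sketched as a small script. The helper name `gradle_qnn_version` is hypothetical, the relative path to `build.gradle.kts` is an assumption, and `2.37.0` is just the example version from the doc.

```shell
# Hypothetical helper: print the version of the first qnn-runtime dependency found on stdin.
gradle_qnn_version() {
  sed -n 's/.*com\.qualcomm\.qti:qnn-runtime:\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example call; how the SDK version is determined is left to the reader:
# app_version=$(gradle_qnn_version < build.gradle.kts)
# [ "$app_version" = "2.37.0" ] || echo "qnn-runtime $app_version does not match SDK 2.37.0" >&2
```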

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Copilot · 2025-12-01T03:16:28Z

docs/source/backends-qualcomm.md

+> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
+**Before**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+```
+
+**After**:
+```kotlin
+implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+```


The numbered list item '1.' is followed by a note and code examples, but there's no item '2.' at the same indentation level until line 405. The note and code examples should be indented or formatted as sub-items of item 1 to maintain proper list structure.

Suggested change

-> **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
-**Before**:
-```kotlin
-implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
-```
-
-**After**:
-```kotlin
-implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
-```
+   > **Note:** The version numbers below (`2.33.0` and `2.37.0`) are examples only. Please check for the latest compatible QNN runtime version or match your QNN SDK version to avoid API mismatches.
+   **Before**:
+   ```kotlin
+   implementation("com.qualcomm.qti:qnn-runtime:2.33.0")
+   ```
+
+   **After**:
+   ```kotlin
+   implementation("com.qualcomm.qti:qnn-runtime:2.37.0")
+   ```

Add Run Android LlamaDemo with QNN backend

29f233a

Copilot AI review requested due to automatic review settings December 1, 2025 03:07

luffy-yu requested a review from mergennachin as a code owner December 1, 2025 03:07

Copilot AI reviewed Dec 1, 2025

Fix ***Step 6***: Run the Llama Demo

2c39ea1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 1, 2025 03:11

meta-cla bot added the CLA Signed label Dec 1, 2025

Copilot AI reviewed Dec 1, 2025

Update $DEMO_APP definition

5ba7196

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 1, 2025 03:12

Copilot AI reviewed Dec 1, 2025

Update version numbers example

c67fca4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 1, 2025 03:14

Copilot AI reviewed Dec 1, 2025

Update grep "QNN"

975c371

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 1, 2025 03:14

Copilot AI reviewed Dec 1, 2025

Replace X.XX.X with 2.33.0

3731048

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings December 1, 2025 03:16

Copilot AI reviewed Dec 1, 2025

luffy-yu mentioned this pull request Dec 1, 2025

Executorch with QNN AI Engine Backend #15387

Open

cccclai requested review from chenweng-quic, haowhsu-quic, shewu-quic and winskuo-quic December 1, 2025 05:44
