Merged
3 changes: 2 additions & 1 deletion README-ZH.md
@@ -17,8 +17,9 @@ mllm

## Latest News

- [2026 Feb 03] 🔥🔥🔥 MLLM Qnn AOT now supports full-graph execution on the NPU! [Quick Start](https://ubiquitouslearning.github.io/mllm/qnn_backend/aot_execute.html), [Technical Report](https://chenghuawang.github.io/News/2026-01-29-mllm-qnn-aot-support/)
- [2025 Nov 27] Android Demo update: stable streaming inference for Qwen3 and DeepSeek-OCR on Android via a new In-App Go server architecture.
- [2025 Nov 23] 🔥🔥🔥 MLLM v2 released!
- [2025 Nov 23] MLLM v2 released!
- [2025 Aug 28] Support for MLLM V1 is ending soon. Before deprecation, V1 will integrate the following feature: GPT-OSS. MLLM will then move to V2 (available on the V2 branch). V2 will bring brand-new capabilities:

- A more Pythonic model authoring approach with eager execution
3 changes: 2 additions & 1 deletion README.md
@@ -17,8 +17,9 @@ mllm

## Latest News

- [2026 Feb 03] 🔥🔥🔥 MLLM Qnn AOT Support for Full Graph Execution on NPU! [Quick Start](https://ubiquitouslearning.github.io/mllm/qnn_backend/aot_execute.html), [Technical Report](https://chenghuawang.github.io/News/2026-01-29-mllm-qnn-aot-support-en/)
- [2025 Nov 27] Android Demo Update: Enabled stable Qwen3 and DeepSeek-OCR streaming on Android via a novel In-App Go Server Architecture.
- [2025 Nov 23] 🔥🔥🔥 MLLM v2 released!
- [2025 Nov 23] MLLM v2 released!
- [2025 Aug 28] Support for MLLM V1 is ending soon. Before its retirement, V1 will integrate the following feature: GPT-OSS. MLLM will then transition to V2, which can be viewed on the V2 branch. V2 will include brand-new capabilities:
- A more Pythonic model authoring approach with eager execution
- Compilation support for easier NPU integration
2 changes: 2 additions & 0 deletions docs/qnn_backend/aot_execute.rst
@@ -89,6 +89,8 @@ Taking ``qwen3_qnn_aot`` as an example, the detailed steps are as follows.
-m /path/to/output/qwen3_1.7b.mllm \
-c ./examples/qwen3_qnn_aot/config_1.7B.json \
--aot_config ./examples/qwen3_qnn_aot/qnn_aot_cfg_1.7B.json
# Optional; defaults to /opt/qcom/aistack/qairt/2.41.0.251128/lib/x86_64-linux-clang/
# --qnn_env_path /path/to/qnn_sdk


This program reads the ``.mllm`` model file and the quantization recipe, and finally generates a QNN context binary file named ``qwen3-1.7B-lpbq-sha.bin``. This file contains all the information needed to execute inference on the target device.
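The diff hunk above omits the command's first line, so the compiler executable's name is not visible here. As a minimal sketch of how the flags fit together, the snippet below composes the full invocation from variables and prints it; the binary name ``mllm-qnn-aot-compile`` is a hypothetical placeholder, not the tool's confirmed name.

```shell
# Hypothetical sketch: the executable name is assumed (the real name is
# outside this diff hunk); the flags and paths come from the docs above.
MLLM_MODEL=/path/to/output/qwen3_1.7b.mllm
CFG=./examples/qwen3_qnn_aot/config_1.7B.json
AOT_CFG=./examples/qwen3_qnn_aot/qnn_aot_cfg_1.7B.json
# Optional; only needed when the QNN SDK is not at the default location.
QNN_SDK=/opt/qcom/aistack/qairt/2.41.0.251128/lib/x86_64-linux-clang/

# Build the command string rather than executing it, since the tool
# itself is not available in this environment.
CMD="mllm-qnn-aot-compile -m $MLLM_MODEL -c $CFG --aot_config $AOT_CFG --qnn_env_path $QNN_SDK"
echo "$CMD"
```

On success the compiler is expected to emit the ``qwen3-1.7B-lpbq-sha.bin`` QNN context binary described above, which is then pushed to the target device for inference.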