Personal note from the project owner
I am not a software developer and do not know programming languages professionally. This project has been built purely through vibe coding: iterative testing and continuous refinement with AI assistance.
ImaGen is an open-source Android app for fully on-device image generation using Stable Diffusion 1.5 with Qualcomm QNN NPU acceleration.
No cloud. No accounts. No data leaves the device.
Get the latest signed APK from Releases.
| Area | Status |
|---|---|
| SD 1.5 / QNN generation | ✅ Stable |
| Face Refine pipeline | ✅ Stable |
| Highres Fix / Upscale | ✅ Stable |
| Hilt DI + Architecture | ✅ Active (KSP 2.3.6 + Hilt 2.56) |
| CI / Release workflow | ✅ Active (instrumented tests, artifact reuse) |
| SDXL-Turbo on-device | 🟡 Research — 8/8 QNN graphs validated on SM8550 |
- txt2img — generate images from text prompts with prompt weights `(word:1.5)`
- img2img — refine existing images with denoise control
- Inpainting — paint a mask and regenerate selected regions
- Upscaling x2/x4 — ONNX-based post-generation upscale
- Face Precision — automatic face detection + SD refinement with quality gate (Adetailer-style)
- Error recovery — categorized failure causes with actionable suggestions
- Highres Fix — second-pass img2img at 512×512 for improved detail
- Live preview — SSE streaming of intermediate steps during generation
- 20 QNN models — built-in catalog from xororz/sd-qnn
- Fast mode — reduced-step preset for quick iterations
- Gallery — persistent history with metadata and search
- Scheduler selection — DPM++ 2M Karras, Euler Ancestral
- Infinite pager — swipe for variations with random seed
- Batch generation — queue multiple generations
- 16 languages — English, Spanish, Portuguese, German, French, Russian, Italian, Turkish, Polish, Arabic, Japanese, Indonesian, Korean, Persian, Hebrew, Ukrainian
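The `(word:1.5)` prompt-weight syntax mentioned above can be illustrated with a small parser. This is a hedged sketch, not ImaGen's actual implementation; `WeightedToken` and `parsePromptWeights` are hypothetical names:

```kotlin
// Illustrative parser for SD-style prompt weights like "(sunset:1.5)".
// Unweighted text defaults to weight 1.0. Hypothetical helper, not the app's code.
data class WeightedToken(val text: String, val weight: Float)

fun parsePromptWeights(prompt: String): List<WeightedToken> {
    // Matches "(text:number)" groups, e.g. "(sunset:1.5)"
    val weighted = Regex("""\(([^():]+):([0-9]*\.?[0-9]+)\)""")
    val tokens = mutableListOf<WeightedToken>()
    var cursor = 0
    for (match in weighted.findAll(prompt)) {
        // Plain text before the weighted group keeps the default weight 1.0
        val before = prompt.substring(cursor, match.range.first).trim()
        if (before.isNotEmpty()) tokens.add(WeightedToken(before, 1.0f))
        tokens.add(WeightedToken(match.groupValues[1], match.groupValues[2].toFloat()))
        cursor = match.range.last + 1
    }
    val tail = prompt.substring(cursor).trim()
    if (tail.isNotEmpty()) tokens.add(WeightedToken(tail, 1.0f))
    return tokens
}
```

For example, `parsePromptWeights("a (red:1.4) fox")` yields three tokens, with only `red` carrying an elevated weight.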
| Chipset | NPU Backend |
|---|---|
| Snapdragon 8 Gen 1 (SM8450) | QNN HTP |
| Snapdragon 8 Gen 2 (SM8550) | QNN HTP |
| Snapdragon 8 Gen 3 (SM8650) | QNN HTP |
| Snapdragon 8 Elite (SM8750) | QNN HTP |
MNN (CPU/GPU) backend is available in code for non-Qualcomm devices.
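The chipset table implies a simple dispatch rule: QNN HTP on supported Snapdragons, MNN everywhere else. A minimal sketch of that selection, assuming the SoC model string is available at runtime (the enum and function names are illustrative, not ImaGen's actual API):

```kotlin
// Hypothetical backend chooser keyed on the SoC model identifier.
enum class Backend { QNN_HTP, MNN }

fun selectBackend(socModel: String): Backend = when (socModel) {
    // Snapdragon 8 Gen 1 / Gen 2 / Gen 3 / Elite, per the supported-chipset table
    "SM8450", "SM8550", "SM8650", "SM8750" -> Backend.QNN_HTP
    // CPU/GPU fallback for non-Qualcomm devices
    else -> Backend.MNN
}
```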
```mermaid
flowchart TB
    UI["Compose UI\nScreen + Form + Carousel + Gallery"]
    VM["ViewModel\nImageGeneratorViewModel"]
    ORCH["ImageGenerationOrchestrator"]
    SD["StableDiffusionHelper\n(thin facade)"]
    HTTP["SdHttpClient\nHTTP transport"]
    REG["SdModelRegistry\nModel discovery"]
    UP["UpscaleOrchestrator\nONNX Runtime"]
    FACE["FaceRestorationEngine\nFaceDetector + FacePatchProcessor"]
    SVC["SDBackendService"]
    PM["SdProcessManager\nProcess lifecycle"]
    LOC["SdModelLocator"]
    NATIVE["libstable_diffusion_core.so\nQNN / MNN"]
    UI --> VM
    VM --> ORCH
    VM --> UP
    VM --> FACE
    ORCH --> SD
    SD --> HTTP
    SD --> REG
    HTTP -->|"HTTP :8081"| SVC
    SVC --> PM
    SVC --> LOC
    PM -->|ProcessBuilder| NATIVE
```
- Language: Kotlin · JVM 11
- UI: Jetpack Compose · Material 3
- DI: Hilt 2.56 + KSP 2.3.6
- Build: Gradle KTS · compileSdk 36 · minSdk 27
- Backend: local-dream native process
- Acceleration: Qualcomm QNN (NPU) · MNN (CPU/GPU)
- Upscaling: ONNX Runtime
- Networking: OkHttp 5.x (localhost only)
- CI: GitHub Actions (build + detekt + unit tests + instrumented tests + Jacoco coverage gate + APK size tracking)
The research/ directory contains an active investigation to bring SDXL-Turbo (1024×1024) to on-device NPU inference. Key milestones:
- ✅ ONNX export of all 4 SDXL-Turbo components (~12.9 GB)
- ✅ QNN conversion with quantization (13 GB → 5.7 GB)
- ✅ 8/8 QNN context binaries validated on-device (Samsung S23 / SM8550)
- ✅ 6-graph UNet split to work within HTP firmware limits (max 2 GB/buffer)
- ✅ C++ inference engine scaffolding complete (6 components, host tests passing)
- 🔲 End-to-end on-device test pending
See `research/docs/STATE.md` for full details.
The research/ directory is a reproducible knowledge base for converting any SDXL-class model to on-device QNN inference on Android — including documented fixes for SDK bugs that are not described in any Qualcomm issue tracker, StackOverflow answer, or public paper.
If you are trying to run diffusion models on a Qualcomm NPU, these are the obstacles you will hit:
| Bug / Constraint | Symptom | Fix in this repo |
|---|---|---|
| `qnn-onnx-converter` Split op → `split_index` overflow (v2.32, v2.39) | Conversion fails with large negative index values (-78M, -516M) | `scripts/sdxl-qnn/fix_split_to_slice.py` — rewrites all Split → Slice sequences |
| NumPy 2.x ABI break with `libPyIrGraph.so` | Silent output corruption — no crash, no warning, just wrong values | Pin `numpy<2` in the converter venv |
| LayerNorm `src_op.attribute` read in SDK v2.41 | Obscure converter error on text encoder graphs | Direct attribute read; skip the helper abstraction |
| 23 GroupNorm Reshape issues in VAE decoder | Distorted images or runtime shape errors | Reshape fix script with correct shapes |
| HTP firmware Alloc2 limit: max 2 GB per contiguous buffer | UNet context binary rejected on-device with `socModel: 0`, no explanation | 6-graph UNet split — no single graph exceeds 1.56 GB |
| VTCM 8 MB < skip tensors 10 MB in `unet_d` | Silent quality degradation or execution failure | `unet_d` recomputes `conv_in` + `down_blocks_0` internally |
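The Alloc2 limit is a hard per-buffer cap, which is why the UNet had to be split. A trivial sketch of the pre-flight check behind the 6-graph split (names and sizes are illustrative):

```kotlin
// Illustrative pre-flight check for the HTP Alloc2 constraint:
// no single contiguous context buffer may exceed 2 GB.
val ALLOC2_LIMIT_BYTES: Long = 2L * 1024 * 1024 * 1024

fun fitsHtpAlloc2(graphSizesBytes: List<Long>): Boolean =
    graphSizesBytes.all { it < ALLOC2_LIMIT_BYTES }
```

With the split in place, each of the 6 UNet graphs stays at or under ~1.56 GB and passes this check; the original monolithic UNet would not.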
Artifacts:
- `research/docs/SDXL-QNN.ipynb` — Colab notebook: ONNX export → QNN IR → context binaries → on-device validation
- `research/docs/COLAB_SESSION_PROMPT.md` — reproducible session bootstrap (env, deps, paths)
- `scripts/sdxl-qnn/` — 10+ automation scripts for each conversion step
- `research/sdxl_engine/` — C++ inference engine (6 components: Engine, Scheduler, TextEncoder, Tokenizer, UNetRunner, VaeDecoder) with host tests passing
- `research/docs/STATE.md` — 15 documented sessions with root cause and fix for each blocker
This documentation is independent of the ImaGen app — if you are working on any QNN diffusion pipeline, the bugs and fixes are directly reusable.
```shell
git clone https://github.com/Zyach/ImaGen.git
cd ImaGen
./gradlew assembleRelease
```

Requirements: Android SDK, JDK 21+
Optional `local.properties` for gated HuggingFace models:

```properties
HF_TOKEN=hf_your_token_here
```

- All inference runs locally on the device
- No account required
- No telemetry or analytics
- Network access is only used for model downloads from HuggingFace
| Document | Purpose |
|---|---|
| `docs/STATE.md` | Current project state and priorities |
| `docs/ROADMAP.md` | Strategy, tracks, and milestones |
| `docs/backlog-operativo.md` | Actionable task backlog |
| `research/docs/STATE.md` | SDXL QNN research state |
| `CHANGELOG.md` | Release history |
| `docs/INFORME_CONSULTORIA_20260411.md` | Technical audit report |
| `CONTRIBUTING.md` | Contribution guidelines |
Contributions are welcome! See CONTRIBUTING.md for build instructions, architecture overview, code conventions, and PR guidelines.
Please keep changes aligned with:
- Image generation only — no chat, RAG, TTS, or unrelated features
- On-device workflows
- Android product quality
- Maintainability over feature sprawl
- local-dream — native SD backend
- Qualcomm AI Stack — QNN runtime
- MNN — mobile inference framework
- ONNX Runtime — upscaling inference
- UltraFace — face detection
MIT. See LICENSE.