Personal note from the project owner
I am not a software developer and do not know programming languages professionally. This project has been built purely through vibe coding: iterative testing and continuous refinement with AI assistance.
ImaGen is an open-source Android app for fully on-device image generation using Stable Diffusion 1.5 with Qualcomm QNN NPU acceleration.
No cloud. No accounts. No data leaves the device.
Get the latest signed APK from Releases.
| Area | Status |
|---|---|
| SD 1.5 / QNN generation | ✅ Stable |
| Face Refine pipeline | ✅ Stable |
| Highres Fix / Upscale | ✅ Stable |
| Hilt DI + Architecture | ✅ Active (KSP 2.3.6 + Hilt 2.56) |
| CI / Release workflow | ✅ Active (instrumented tests, artifact reuse) |
| SDXL-Turbo on-device | 🟡 Research — 8/8 QNN graphs validated on SM8550 |
- txt2img — generate images from text prompts with prompt weights `(word:1.5)`
- img2img — refine existing images with denoise control
- Inpainting — paint a mask and regenerate selected regions
- Upscaling x2/x4 — ONNX-based post-generation upscale
- Face Precision — automatic face detection + SD refinement with quality gate (Adetailer-style)
- Error recovery — categorized failure causes with actionable suggestions
- Highres Fix — second-pass img2img at 512×512 for improved detail
- Live preview — SSE streaming of intermediate steps during generation
- 20 QNN models — built-in catalog from xororz/sd-qnn
- Fast mode — reduced-step preset for quick iterations
- Gallery — persistent history with metadata and search
- Scheduler selection — DPM++ 2M Karras, Euler Ancestral
- Infinite pager — swipe for variations with random seed
- Batch generation — queue multiple generations
- 16 languages — English, Spanish, Portuguese, German, French, Russian, Italian, Turkish, Polish, Arabic, Japanese, Indonesian, Korean, Persian, Hebrew, Ukrainian
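The `(word:1.5)` prompt-weight syntax mentioned above can be illustrated with a small parser. This is a hedged sketch, not ImaGen's actual implementation; `WeightedToken` and `parsePromptWeights` are hypothetical names:

```kotlin
// Illustrative parser for SD-style prompt weights like "(sunset:1.5)".
// Unweighted text defaults to weight 1.0. Hypothetical helper, not the app's code.
data class WeightedToken(val text: String, val weight: Float)

fun parsePromptWeights(prompt: String): List<WeightedToken> {
    // Matches "(text:number)" groups, e.g. "(sunset:1.5)"
    val weighted = Regex("""\(([^():]+):([0-9]*\.?[0-9]+)\)""")
    val tokens = mutableListOf<WeightedToken>()
    var cursor = 0
    for (match in weighted.findAll(prompt)) {
        // Plain text before the weighted group keeps the default weight 1.0
        val before = prompt.substring(cursor, match.range.first).trim()
        if (before.isNotEmpty()) tokens.add(WeightedToken(before, 1.0f))
        tokens.add(WeightedToken(match.groupValues[1], match.groupValues[2].toFloat()))
        cursor = match.range.last + 1
    }
    val tail = prompt.substring(cursor).trim()
    if (tail.isNotEmpty()) tokens.add(WeightedToken(tail, 1.0f))
    return tokens
}
```

For example, `parsePromptWeights("a (red:1.4) fox")` yields three tokens, with only `red` carrying an elevated weight.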
| Chipset | NPU Backend |
|---|---|
| Snapdragon 8 Gen 1 (SM8450) | QNN HTP |
| Snapdragon 8 Gen 2 (SM8550) | QNN HTP |
| Snapdragon 8 Gen 3 (SM8650) | QNN HTP |
| Snapdragon 8 Elite (SM8750) | QNN HTP |
MNN (CPU/GPU) backend is available in code for non-Qualcomm devices.
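The chipset table implies a simple dispatch rule: QNN HTP on supported Snapdragons, MNN everywhere else. A minimal sketch of that selection, assuming the SoC model string is available at runtime (the enum and function names are illustrative, not ImaGen's actual API):

```kotlin
// Hypothetical backend chooser keyed on the SoC model identifier.
enum class Backend { QNN_HTP, MNN }

fun selectBackend(socModel: String): Backend = when (socModel) {
    // Snapdragon 8 Gen 1 / Gen 2 / Gen 3 / Elite, per the supported-chipset table
    "SM8450", "SM8550", "SM8650", "SM8750" -> Backend.QNN_HTP
    // CPU/GPU fallback for non-Qualcomm devices
    else -> Backend.MNN
}
```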
```mermaid
flowchart TB
    UI["Compose UI\nScreen + Form + Carousel + Gallery"]
    VM["ViewModel\nImageGeneratorViewModel"]
    ORCH["ImageGenerationOrchestrator"]
    SD["StableDiffusionHelper\n(thin facade)"]
    HTTP["SdHttpClient\nHTTP transport"]
    REG["SdModelRegistry\nModel discovery"]
    UP["UpscaleOrchestrator\nONNX Runtime"]
    FACE["FaceRestorationEngine\nFaceDetector + FacePatchProcessor"]
    SVC["SDBackendService"]
    PM["SdProcessManager\nProcess lifecycle"]
    LOC["SdModelLocator"]
    NATIVE["libstable_diffusion_core.so\nQNN / MNN"]
    UI --> VM
    VM --> ORCH
    VM --> UP
    VM --> FACE
    ORCH --> SD
    SD --> HTTP
    SD --> REG
    HTTP -->|"HTTP :8081"| SVC
    SVC --> PM
    SVC --> LOC
    PM -->|ProcessBuilder| NATIVE
```
- Language: Kotlin · JVM 11
- UI: Jetpack Compose · Material 3
- DI: Hilt 2.56 + KSP 2.3.6
- Build: Gradle KTS · compileSdk 36 · minSdk 27
- Backend: local-dream native process
- Acceleration: Qualcomm QNN (NPU) · MNN (CPU/GPU)
- Upscaling: ONNX Runtime
- Networking: OkHttp 5.x (localhost only)
- CI: GitHub Actions (build + detekt + unit tests + instrumented tests + Jacoco coverage gate + APK size tracking)
The research/ directory contains an active investigation to bring SDXL-Turbo (1024×1024) to on-device NPU inference. Key milestones:
- ✅ ONNX export of all 4 SDXL-Turbo components (~12.9 GB)
- ✅ QNN conversion with quantization (13 GB → 5.7 GB)
- ✅ 8/8 QNN context binaries validated on-device (Samsung S23 / SM8550)
- ✅ 6-graph UNet split to work within HTP firmware limits (max 2 GB/buffer)
- ✅ C++ inference engine scaffolding complete (6 components, host tests passing)
- 🔲 End-to-end on-device test pending
See `research/docs/STATE.md` for full details.
The research/ directory is a reproducible knowledge base for converting any SDXL-class model to on-device QNN inference on Android — including documented fixes for SDK bugs that are not described in any Qualcomm issue tracker, StackOverflow answer, or public paper.
If you are trying to run diffusion models on a Qualcomm NPU, these are the obstacles you will hit:
| Bug / Constraint | Symptom | Fix in this repo |
|---|---|---|
| `qnn-onnx-converter` Split op → `split_index` overflow (v2.32, v2.39) | Conversion fails with large negative index values (-78M, -516M) | `scripts/sdxl-qnn/fix_split_to_slice.py` — rewrites all Split → Slice sequences |
| NumPy 2.x ABI break with `libPyIrGraph.so` | Silent output corruption — no crash, no warning, just wrong values | Pin `numpy<2` in the converter venv |
| LayerNorm `src_op.attribute` read in SDK v2.41 | Obscure converter error on text encoder graphs | Direct attribute read; skip the helper abstraction |
| 23 GroupNorm Reshape issues in VAE decoder | Distorted images or runtime shape errors | Reshape fix script with correct shapes |
| HTP firmware Alloc2 limit: max 2 GB per contiguous buffer | UNet context binary rejected on-device with `socModel: 0`, no explanation | 6-graph UNet split — no single graph exceeds 1.56 GB |
| VTCM 8 MB < skip tensors 10 MB in `unet_d` | Silent quality degradation or execution failure | `unet_d` recomputes `conv_in` + `down_blocks_0` internally |
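The Alloc2 limit is a hard per-buffer cap, which is why the UNet had to be split. A trivial sketch of the pre-flight check behind the 6-graph split (names and sizes are illustrative):

```kotlin
// Illustrative pre-flight check for the HTP Alloc2 constraint:
// no single contiguous context buffer may exceed 2 GB.
val ALLOC2_LIMIT_BYTES: Long = 2L * 1024 * 1024 * 1024

fun fitsHtpAlloc2(graphSizesBytes: List<Long>): Boolean =
    graphSizesBytes.all { it < ALLOC2_LIMIT_BYTES }
```

With the split in place, each of the 6 UNet graphs stays at or under ~1.56 GB and passes this check; the original monolithic UNet would not.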
Artifacts:
- `research/docs/SDXL-QNN.ipynb` — Colab notebook: ONNX export → QNN IR → context binaries → on-device validation
- `research/docs/COLAB_SESSION_PROMPT.md` — reproducible session bootstrap (env, deps, paths)
- `scripts/sdxl-qnn/` — 10+ automation scripts for each conversion step
- `research/sdxl_engine/` — C++ inference engine (6 components: Engine, Scheduler, TextEncoder, Tokenizer, UNetRunner, VaeDecoder) with host tests passing
- `research/docs/STATE.md` — 15 documented sessions with root cause and fix for each blocker
This documentation is independent of the ImaGen app — if you are working on any QNN diffusion pipeline, the bugs and fixes are directly reusable.
```shell
git clone https://github.com/Zyach/ImaGen.git
cd ImaGen
./gradlew assembleRelease
```

Requirements: Android SDK, JDK 21+
Optional `local.properties` for gated HuggingFace models:

```properties
HF_TOKEN=hf_your_token_here
```

- All inference runs locally on the device
- No account required
- No telemetry or analytics
- Network access is only used for model downloads from HuggingFace
| Document | Purpose |
|---|---|
| `docs/STATE.md` | Current project state and priorities |
| `docs/ROADMAP.md` | Strategy, tracks, and milestones |
| `docs/backlog-operativo.md` | Actionable task backlog |
| `research/docs/STATE.md` | SDXL QNN research state |
| `CHANGELOG.md` | Release history |
| `docs/INFORME_CONSULTORIA_20260411.md` | Technical audit report |
| `CONTRIBUTING.md` | Contribution guidelines |
Contributions are welcome! See CONTRIBUTING.md for build instructions, architecture overview, code conventions, and PR guidelines.
Please keep changes aligned with:
- Image generation only — no chat, RAG, TTS, or unrelated features
- On-device workflows
- Android product quality
- Maintainability over feature sprawl
- local-dream — native SD backend
- Qualcomm AI Stack — QNN runtime
- MNN — mobile inference framework
- ONNX Runtime — upscaling inference
- UltraFace — face detection
MIT. See LICENSE.