
# ImaGen


## Personal note from the project owner

I am not a software developer and I do not know programming languages professionally. This project has been built purely through vibe coding, iterative testing, and continuous refinement with AI assistance.

ImaGen is an open-source Android app for fully on-device image generation using Stable Diffusion 1.5 with Qualcomm QNN NPU acceleration.

No cloud. No accounts. No data leaves the device.

## Download

Get the latest signed APK from Releases.

## Project Status

| Area | Status |
| --- | --- |
| SD 1.5 / QNN generation | ✅ Stable |
| Face Refine pipeline | ✅ Stable |
| Highres Fix / Upscale | ✅ Stable |
| Hilt DI + Architecture | ✅ Active (KSP 2.3.6 + Hilt 2.56) |
| CI / Release workflow | ✅ Active (instrumented tests, artifact reuse) |
| SDXL-Turbo on-device | 🟡 Research — 8/8 QNN graphs validated on SM8550 |

## Features

- **txt2img** — generate images from text prompts with prompt weights (`word:1.5`)
- **img2img** — refine existing images with denoise control
- **Inpainting** — paint a mask and regenerate selected regions
- **Upscaling x2/x4** — ONNX-based post-generation upscale
- **Face Precision** — automatic face detection + SD refinement with quality gate (Adetailer-style)
- **Error recovery** — categorized failure causes with actionable suggestions
- **Highres Fix** — second-pass img2img at 512×512 for improved detail
- **Live preview** — SSE streaming of intermediate steps during generation
- **20 QNN models** — built-in catalog from xororz/sd-qnn
- **Fast mode** — reduced-step preset for quick iterations
- **Gallery** — persistent history with metadata and search
- **Scheduler selection** — DPM++ 2M Karras, Euler Ancestral
- **Infinite pager** — swipe for variations with random seed
- **Batch generation** — queue multiple generations
- **16 languages** — English, Spanish, Portuguese, German, French, Russian, Italian, Turkish, Polish, Arabic, Japanese, Indonesian, Korean, Persian, Hebrew, Ukrainian
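The `(word:1.5)` prompt-weight syntax maps spans of text to emphasis multipliers. As an illustrative sketch only, here is how the common `(text:weight)` form can be split into weighted segments; the regex and function below are assumptions for demonstration, not ImaGen's actual parser:

```python
import re

# Matches the "(text:weight)" form, e.g. "(golden hour:1.5)".
# Illustrative only; the app's real tokenizer may handle nesting and escapes.
WEIGHT_RE = re.compile(r"\((?P<text>[^:()]+):(?P<weight>[\d.]+)\)")

def parse_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (segment, weight) pairs; unweighted text gets 1.0."""
    parts: list[tuple[str, float]] = []
    pos = 0
    for m in WEIGHT_RE.finditer(prompt):
        before = prompt[pos:m.start()].strip()
        if before:
            parts.append((before, 1.0))
        parts.append((m.group("text").strip(), float(m.group("weight"))))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:
        parts.append((tail, 1.0))
    return parts
```

The resulting per-segment weights would then scale the corresponding text-encoder embeddings before denoising.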

## Supported Devices

| Chipset | NPU Backend |
| --- | --- |
| Snapdragon 8 Gen 1 (SM8450) | QNN HTP |
| Snapdragon 8 Gen 2 (SM8550) | QNN HTP |
| Snapdragon 8 Gen 3 (SM8650) | QNN HTP |
| Snapdragon 8 Elite (SM8750) | QNN HTP |

An MNN (CPU/GPU) backend is available in code for non-Qualcomm devices.

## Architecture

```mermaid
flowchart TB
    UI["Compose UI\nScreen + Form + Carousel + Gallery"]
    VM["ViewModel\nImageGeneratorViewModel"]
    ORCH["ImageGenerationOrchestrator"]
    SD["StableDiffusionHelper\n(thin facade)"]
    HTTP["SdHttpClient\nHTTP transport"]
    REG["SdModelRegistry\nModel discovery"]
    UP["UpscaleOrchestrator\nONNX Runtime"]
    FACE["FaceRestorationEngine\nFaceDetector + FacePatchProcessor"]
    SVC["SDBackendService"]
    PM["SdProcessManager\nProcess lifecycle"]
    LOC["SdModelLocator"]
    NATIVE["libstable_diffusion_core.so\nQNN / MNN"]

    UI --> VM
    VM --> ORCH
    VM --> UP
    VM --> FACE
    ORCH --> SD
    SD --> HTTP
    SD --> REG
    HTTP -->|"HTTP :8081"| SVC
    SVC --> PM
    SVC --> LOC
    PM -->|ProcessBuilder| NATIVE
```
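The `HTTP :8081` hop between `SdHttpClient` and `SDBackendService` also carries the live-preview stream described under Features. As a generic sketch, assuming the backend follows standard Server-Sent Events framing (this is not ImaGen's client code), decoding `data:` lines into events looks like:

```python
def parse_sse(text: str) -> list[str]:
    """Collect the payload of each SSE event from raw stream text.

    Per the SSE format, events are separated by a blank line and each
    "data:" line contributes one payload line. Generic parser for
    illustration; field names other than "data" are ignored here.
    """
    events: list[str] = []
    data_lines: list[str] = []
    for line in text.splitlines():
        if line.startswith("data:"):
            data_lines.append(line[5:].lstrip())
        elif line == "" and data_lines:
            # Blank line terminates the current event.
            events.append("\n".join(data_lines))
            data_lines = []
    if data_lines:  # flush an unterminated trailing event
        events.append("\n".join(data_lines))
    return events
```

In a real client the events would arrive incrementally over a kept-open HTTP response rather than as one string.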

## Tech Stack

- Language: Kotlin · JVM 11
- UI: Jetpack Compose · Material 3
- DI: Hilt (Dagger) 2.56 + KSP 2.3.6
- Build: Gradle KTS · compileSdk 36 · minSdk 27
- Backend: local-dream native process
- Acceleration: Qualcomm QNN (NPU) · MNN (CPU/GPU)
- Upscaling: ONNX Runtime
- Networking: OkHttp 5.x (localhost only)
- CI: GitHub Actions (build + detekt + unit tests + instrumented tests + JaCoCo coverage gate + APK size tracking)

## SDXL Research

The research/ directory contains an active investigation into bringing SDXL-Turbo (1024×1024) to on-device NPU inference. Key milestones:

- ✅ ONNX export of all 4 SDXL-Turbo components (~12.9 GB)
- ✅ QNN conversion with quantization (13 GB → 5.7 GB)
- ✅ 8/8 QNN context binaries validated on-device (Samsung S23 / SM8550)
- ✅ 6-graph UNet split to work within HTP firmware limits (max 2 GB/buffer)
- ✅ C++ inference engine scaffolding complete (6 components, host tests passing)
- 🔲 End-to-end on-device test pending

See research/docs/STATE.md for full details.
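The 6-graph UNet split exists because HTP firmware rejects any context whose contiguous buffer exceeds 2 GB. The underlying packing idea can be sketched as a greedy partition; the sizes below are made up for illustration, and the real split in research/ is hand-tuned around data dependencies as well:

```python
def split_graphs(layer_sizes_gb: list[float], limit_gb: float = 2.0) -> list[list[float]]:
    """Greedily pack consecutive layers into graphs, keeping each graph
    under the per-buffer limit. Illustrative only: the actual UNet split
    also has to respect graph boundaries and VTCM pressure.
    """
    graphs: list[list[float]] = [[]]
    total = 0.0
    for size in layer_sizes_gb:
        if size > limit_gb:
            raise ValueError(f"single layer of {size} GB cannot fit in one buffer")
        if total + size > limit_gb:
            # Current graph would exceed the limit; start a new one.
            graphs.append([])
            total = 0.0
        graphs[-1].append(size)
        total += size
    return graphs
```

With hypothetical layer sizes, `split_graphs([0.8, 0.9, 0.7, 1.2, 0.5])` yields three graphs, each summing to under 2 GB.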

## Reproducible QNN Conversion Pipeline

The research/ directory is a reproducible knowledge base for converting any SDXL-class model to on-device QNN inference on Android — including documented fixes for SDK bugs that are not described in any Qualcomm issue tracker, StackOverflow answer, or public paper.

If you are trying to run diffusion models on a Qualcomm NPU, these are the obstacles you will hit:

| Bug / Constraint | Symptom | Fix in this repo |
| --- | --- | --- |
| qnn-onnx-converter Split op → split_index overflow (v2.32, v2.39) | Conversion fails with large negative index values (-78M, -516M) | scripts/sdxl-qnn/fix_split_to_slice.py — rewrites all Split → Slice sequences |
| NumPy 2.x ABI break with libPyIrGraph.so | Silent output corruption — no crash, no warning, just wrong values | Pin numpy<2 in the converter venv |
| LayerNorm src_op.attribute read in SDK v2.41 | Obscure converter error on text encoder graphs | Direct attribute read; skip the helper abstraction |
| 23 GroupNorm Reshape issues in VAE decoder | Distorted images or runtime shape errors | Reshape fix script with correct shapes |
| HTP firmware Alloc2 limit: max 2 GB per contiguous buffer | UNet context binary rejected on-device with socModel: 0, no explanation | 6-graph UNet split — no single graph exceeds 1.56 GB |
| VTCM 8 MB < skip tensors 10 MB in unet_d | Silent quality degradation or execution failure | unet_d recomputes conv_in + down_blocks_0 internally |
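The first fix sidesteps the converter's signed-index overflow by eliminating Split nodes entirely. The core transformation, computing per-output Slice ranges from the Split sizes, can be sketched as follows; this is a simplified model of what scripts/sdxl-qnn/fix_split_to_slice.py does, and the real script rewrites ONNX protobuf nodes rather than plain lists:

```python
def split_to_slices(split_sizes: list[int]) -> list[tuple[int, int]]:
    """Turn a Split op's per-output sizes into (start, end) Slice ranges
    along the split axis. Each Split output becomes one Slice node with
    these bounds, which the converter handles without index overflow.
    """
    slices: list[tuple[int, int]] = []
    start = 0
    for size in split_sizes:
        slices.append((start, start + size))
        start += size
    return slices
```

For example, a Split with sizes `[2, 3, 5]` becomes three Slices over `[0:2]`, `[2:5]`, and `[5:10]`.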

Artifacts:

This documentation is independent of the ImaGen app — if you are working on any QNN diffusion pipeline, the bugs and fixes are directly reusable.

## Build from Source

```bash
git clone https://github.com/Zyach/ImaGen.git
cd ImaGen
./gradlew assembleRelease
```

Requirements: Android SDK, JDK 21+

Optional local.properties entry for gated HuggingFace models:

```properties
HF_TOKEN=hf_your_token_here
```

## Privacy

- All inference runs locally on the device
- No account required
- No telemetry or analytics
- Network access is only used for model downloads from HuggingFace

## Documentation

| Document | Purpose |
| --- | --- |
| docs/STATE.md | Current project state and priorities |
| docs/ROADMAP.md | Strategy, tracks, and milestones |
| docs/backlog-operativo.md | Actionable task backlog |
| research/docs/STATE.md | SDXL QNN research state |
| CHANGELOG.md | Release history |
| docs/INFORME_CONSULTORIA_20260411.md | Technical audit report |
| CONTRIBUTING.md | Contribution guidelines |

## Contributing

Contributions are welcome! See CONTRIBUTING.md for build instructions, architecture overview, code conventions, and PR guidelines.

Please keep changes aligned with:

- Image generation only — no chat, RAG, TTS, or unrelated features
- On-device workflows
- Android product quality
- Maintainability over feature sprawl

## Acknowledgments

## License

MIT. See LICENSE.
