ima2-gen

Read in other languages: 한국어 · 日本語 · 简体中文

ima2-gen is a local image generation studio for people who want the ChatGPT/Codex image workflow in a small desktop-like web app.

Run it with npx, sign in with Codex OAuth, type a prompt, and keep iterating with history, references, style sheets, and node branches. No OpenAI API key is required for image generation in the default path.

ima2-gen classic generation screen with prompt composer, generated image, compact model label, and result metadata.

Quick Start

npx ima2-gen serve

Then open http://localhost:3333.

If Codex is not logged in yet:

npx @openai/codex login
npx ima2-gen serve

You can also install it globally:

npm install -g ima2-gen
ima2 serve

What It Does

  • Classic mode: generate, edit, reuse the current image, paste references, and continue from history.
  • Node mode: branch a good image into multiple directions without losing the original.
  • Local gallery: keep generated assets on your machine with session-aware history.
  • Reference images: drag, drop, paste, and attach up to 5 references; large images are compressed before upload.
  • Style sheets: extract and reuse a visual direction across classic and node prompts.
  • Observable jobs: active and recent jobs are tracked with safe logs and request IDs.

OAuth Only For Image Generation

Image generation currently runs through the local Codex/ChatGPT OAuth path.

API keys may still be detected for auxiliary developer features such as billing checks or style-sheet extraction, but generation routes reject provider: "api" with APIKEY_DISABLED.

If the settings page says "Configured but disabled", an API key exists in env/config but image generation still uses OAuth.
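As a rough illustration of the behavior described above, a generation route could guard on the requested provider like this. This is a minimal sketch, not the app's actual code: `GenerationRequest`, `GuardResult`, `checkProvider`, and the error message text are hypothetical names. Only the `provider: "api"` value and the `APIKEY_DISABLED` code come from this README.

```typescript
// Hypothetical sketch: type and function names are illustrative assumptions,
// not taken from the ima2-gen source.
type Provider = "oauth" | "api";

interface GenerationRequest {
  prompt: string;
  provider?: Provider; // assumed to default to "oauth" when omitted
}

type GuardResult =
  | { ok: true; provider: "oauth" }
  | { ok: false; code: "APIKEY_DISABLED"; message: string };

// Reject API-key generation up front; everything else goes through OAuth.
function checkProvider(req: GenerationRequest): GuardResult {
  const provider = req.provider ?? "oauth";
  if (provider === "api") {
    return {
      ok: false,
      code: "APIKEY_DISABLED",
      message: "API-key image generation is disabled; use OAuth.",
    };
  }
  return { ok: true, provider: "oauth" };
}
```

A caller would check `ok` first and surface `code` to the user when the request is rejected.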

Settings workspace showing OAuth active and API key configured but disabled.

Model Guidance

Start with gpt-5.4 when you want the safest balanced image workflow.

  • gpt-5.4 — recommended balanced choice.
  • gpt-5.4-mini — current app default and faster draft model.
  • gpt-5.5 — strongest quality option when your Codex CLI/OAuth backend supports it. It may use more quota, expose different tool capabilities, or require updating Codex CLI before it works reliably.

The app also exposes quality (low, medium, high) and moderation (auto, low) controls.

Workflows

Classic Mode

Use Classic when you want one strong result quickly.

  1. Write a prompt.
  2. Attach or paste references if needed.
  3. Pick model, quality, size, format, and moderation.
  4. Generate, copy, download, or continue from the result.

Node Mode

Use Node mode when you want to explore branches.

Node mode with connected generated cards and compact per-node metadata.

Each node keeps its own prompt and result. Root nodes can attach local references; child nodes use the parent image as their source. Completed jobs are matched back to nodes by request ID, so finished results can be recovered after a reload or a graph version conflict.
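The request-ID matching described above can be pictured with a small sketch. The shapes here (`GraphNode`, `CompletedJob`, `attachCompletedJobs`, and their field names) are assumptions made for illustration; the real graph and job records may look different.

```typescript
// Hypothetical shapes: field names are assumptions, not the app's real schema.
interface GraphNode {
  id: string;
  prompt: string;
  requestId?: string; // set when a generation job was started for this node
  imageUrl?: string;  // set once a result is attached
}

interface CompletedJob {
  requestId: string;
  imageUrl: string;
}

// Attach finished results to nodes that are still waiting for one.
// Nodes that already have an image, or never started a job, are left untouched.
function attachCompletedJobs(nodes: GraphNode[], jobs: CompletedJob[]): GraphNode[] {
  const byRequestId = new Map<string, CompletedJob>();
  for (const job of jobs) {
    byRequestId.set(job.requestId, job);
  }
  return nodes.map((node) => {
    if (node.imageUrl !== undefined || node.requestId === undefined) {
      return node;
    }
    const job = byRequestId.get(node.requestId);
    return job ? { ...node, imageUrl: job.imageUrl } : node;
  });
}
```

Because the lookup key is the request ID rather than the node's position in the graph, the same function works after a reload, when the in-memory graph has been rebuilt from storage.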

Settings And Style Sheets

The settings workspace keeps account, model, appearance, and language controls away from the generation sidebar.

Settings workspace with account navigation and generation model controls.

Style sheets let you capture a reusable visual direction.

Style sheet editor with medium, composition, mood, subject, palette, and negative fields.

CLI Commands

Server

| Command | Description |
| --- | --- |
| ima2 serve | Start the local web server |
| ima2 setup | Reconfigure saved auth |
| ima2 status | Show config and OAuth status |
| ima2 doctor | Diagnose Node, package, config, and auth |
| ima2 open | Open the web UI |
| ima2 reset | Remove saved config |

Client

These commands require a running ima2 serve instance.

| Command | Description |
| --- | --- |
| ima2 gen <prompt> | Generate from the CLI |
| ima2 edit <file> --prompt <text> | Edit an existing image |
| ima2 ls | List local history |
| ima2 show <name> | Reveal a generated asset |
| ima2 ps | List active jobs |
| ima2 ping | Health-check the running server |

The server advertises its port at ~/.ima2/server.json. Override discovery with --server <url> or IMA2_SERVER=http://localhost:3333.
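One way to picture the discovery order is the sketch below. It rests on several assumptions: the `--server` flag is taken to win over `IMA2_SERVER`, and `~/.ima2/server.json` is taken to contain a numeric `port` field. Neither detail is confirmed by this README, and `DiscoveryInput` and `resolveServerUrl` are hypothetical names.

```typescript
// Hypothetical sketch of client-side server discovery.
// Assumptions: flag > env > advertised file, and server.json holds { "port": number }.
interface DiscoveryInput {
  serverFlag?: string;                      // value passed to --server, if any
  env: Record<string, string | undefined>;  // e.g. process.env
  advertisedJson?: string;                  // raw text of ~/.ima2/server.json, if readable
}

function resolveServerUrl(input: DiscoveryInput): string | undefined {
  if (input.serverFlag) {
    return input.serverFlag;
  }
  const fromEnv = input.env["IMA2_SERVER"];
  if (fromEnv) {
    return fromEnv;
  }
  if (input.advertisedJson) {
    try {
      const parsed: unknown = JSON.parse(input.advertisedJson);
      if (typeof parsed === "object" && parsed !== null && "port" in parsed) {
        const port = (parsed as { port: unknown }).port;
        if (typeof port === "number" && Number.isInteger(port)) {
          return `http://localhost:${port}`;
        }
      }
    } catch {
      // Malformed file: fall through and report that no server was found.
    }
  }
  return undefined;
}
```

Passing the environment and file contents in as arguments keeps the function easy to test without touching the real home directory.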

Configuration

Config priority:

environment variables > ~/.ima2/config.json > built-in defaults

| Variable | Default | Description |
| --- | --- | --- |
| IMA2_PORT / PORT | 3333 | Web server port |
| IMA2_OAUTH_PROXY_PORT / OAUTH_PORT | 10531 | OAuth proxy port |
| IMA2_SERVER | | CLI target override |
| IMA2_CONFIG_DIR | ~/.ima2 | Config and SQLite location |
| IMA2_GENERATED_DIR | ~/.ima2/generated | Generated image directory |
| IMA2_NO_OAUTH_PROXY | | Set to 1 to disable the auto-started OAuth proxy |
| IMA2_INFLIGHT_TERMINAL_TTL_MS | 30000 | How long recent terminal jobs are retained for debug views, in milliseconds |
| OPENAI_API_KEY | | API key for supported auxiliary paths, not image generation |
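The priority order can be sketched for the web server port as follows. This is an illustration under assumptions: the `port` key in `~/.ima2/config.json` is a hypothetical name, and `IMA2_PORT` is assumed to take precedence over `PORT` (the Troubleshooting advice to start with `IMA2_PORT=3333` when a stray `PORT` is inherited suggests this, but the README does not state it outright).

```typescript
// Hypothetical sketch of "environment variables > config file > built-in defaults".
type Env = Record<string, string | undefined>;

interface FileConfig {
  port?: number; // assumed key name inside ~/.ima2/config.json
}

const DEFAULT_PORT = 3333;

// Accept only whole numbers in the valid TCP port range; anything else is ignored.
function parsePort(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") {
    return undefined;
  }
  const n = Number(value);
  return Number.isInteger(n) && n > 0 && n <= 65535 ? n : undefined;
}

function resolvePort(env: Env, file: FileConfig): number {
  return (
    parsePort(env["IMA2_PORT"]) ??
    parsePort(env["PORT"]) ??
    file.port ??
    DEFAULT_PORT
  );
}
```

With this ordering, an inherited `PORT=3457` overrides both the config file and the default, which matches the symptom described under Troubleshooting.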

API Reference

The endpoint list moved to docs/API.md so this README can stay focused on first-run use.

Troubleshooting

ima2 ping says the server is unreachable

Start ima2 serve, then check ~/.ima2/server.json. You can also run ima2 ping --server http://localhost:3333.

OAuth login does not work

Run npx @openai/codex login, confirm with ima2 status, then restart ima2 serve.

Images fail with APIKEY_DISABLED

Use OAuth for generation. API-key image generation is intentionally disabled in this build.

A large reference image fails

The app compresses large JPEG/PNG references before upload. If a file still fails, convert it to JPEG or PNG at a lower resolution and try again. HEIC/HEIF files are not supported by the browser path.
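If you script uploads, a pre-flight check along these lines can catch the two documented limits before anything is sent: the 5-reference cap listed under What It Does and the missing HEIC/HEIF support. This is a hypothetical helper, not part of the app. `ReferenceFile` and `checkReferences` are made-up names, and detecting the format by file extension is a simplification.

```typescript
// Hypothetical pre-flight check for reference images.
// Detecting HEIC/HEIF by file extension is a simplification for illustration.
interface ReferenceFile {
  name: string;
}

const MAX_REFERENCES = 5;
const UNSUPPORTED_EXTENSIONS = [".heic", ".heif"];

// Returns a list of human-readable problems; an empty list means the set looks fine.
function checkReferences(files: ReferenceFile[]): string[] {
  const problems: string[] = [];
  if (files.length > MAX_REFERENCES) {
    problems.push(`Too many references: ${files.length} (limit ${MAX_REFERENCES})`);
  }
  for (const file of files) {
    const lower = file.name.toLowerCase();
    if (UNSUPPORTED_EXTENSIONS.some((ext) => lower.endsWith(ext))) {
      problems.push(`${file.name}: HEIC/HEIF is not supported; convert to JPEG or PNG`);
    }
  }
  return problems;
}
```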

gpt-5.5 fails but other models work

Update Codex CLI first, then retry. If it still fails, your account or backend route may not expose the same image capability or quota for gpt-5.5 yet; use gpt-5.4 as the stable fallback.

The port is unexpectedly 3457

Your shell may have inherited PORT=3457 from another local tool. Run unset PORT or start with IMA2_PORT=3333 ima2 serve.

Development

git clone https://github.com/lidge-jun/ima2-gen.git
cd ima2-gen
npm install
npm run dev
npm test
npm run build

npm run dev builds the UI and starts server.js with --watch. Node mode is part of the packaged UI by default.

License

MIT
