**New file:** `services/chatmock-shim/INSTALL-PLAN.md` (+187 lines)

# Install plan — `openai-oauth` + `ChatMock` on `root@2.24.201.210` (Docker)

**Target host:** `srv1589219.hstgr.cloud` / `2.24.201.210`
**State today:** Ubuntu 24.04.4, Docker 29.4.0 + Compose v5.1.2, only `traefik:latest` running (host network, HTTP→HTTPS redirect, Let's Encrypt HTTP-01 resolver named `letsencrypt`, `--providers.docker.exposedbydefault=false`). Ports 22/80/443 in use; 7.8 GB RAM, 83 GB disk free.

---

## 0. Read-this-first — security & policy

Both projects expose an **unauthenticated OpenAI-compatible endpoint** backed by *your personal ChatGPT OAuth tokens* (the same `auth.json` Codex uses). Anyone who can reach the endpoint gets to spend your ChatGPT rate limits.

- `openai-oauth`'s own README: *"Use only for personal, local experimentation on trusted machines; **do not run as a hosted service**, do not share access, do not pool or redistribute tokens."* Running it on a public VPS behind a domain is exactly what that line warns against and risks OpenAI rate-limiting or suspending the account.
- `ChatMock` uses the same tokens; the README is silent but the risk is identical.

**Recommendation (baked into the plan below):** do not expose `/v1/*` to the public internet unauthenticated. Pick **one** of:
1. **Private-only** — bind to `127.0.0.1`, access via SSH tunnel / Tailscale / WireGuard. *(Safest, recommended.)*
2. **Public URL + Traefik basic-auth / IP allowlist middleware** — fine for solo use, still against `openai-oauth`'s stated policy.
3. **Public + no auth** — not recommended; flagged for explicit sign-off only.

Please tell me which tier you want before I execute.

---

## 1. Scope question for you

`openai-oauth` and `ChatMock` do **the same thing** (localhost proxy → `chatgpt.com/backend-api/codex/responses` using your Codex OAuth). Choose one of:

- **A. Both side-by-side** (different ports, share the same `~/.codex/auth.json` via bind-mount). Useful for A/B testing.
- **B. ChatMock only** — it already ships a working `Dockerfile` + `docker-compose.yml` + login flow. Least work.
- **C. openai-oauth only** — newer, Bun/TS, no upstream Docker support (we'd author it).

Default in this plan = **A (both)**, with a note on what to drop if you pick B or C.

---

## 2. Layout on the VPS

```
/opt/chatgpt-proxies/
├── auth/                    # shared ChatGPT OAuth credentials
│   └── auth.json            # created by the login step below
├── chatmock/
│   ├── docker-compose.yml   # from upstream, lightly edited
│   └── .env                 # from .env.example
└── openai-oauth/
    ├── Dockerfile           # authored by us (see §4)
    ├── docker-compose.yml   # authored by us
    └── .dockerignore
```

One shared `auth.json` is mounted read-only into both containers at the path each expects (`/data/auth.json` for ChatMock via `CHATGPT_LOCAL_HOME=/data`; `/root/.codex/auth.json` for openai-oauth, overridable with `--oauth-file`).
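
Before wiring the shared mount into both containers, it helps to sanity-check the file. A minimal sketch, assuming the Codex-style layout keeps its credentials under a top-level `tokens` object (the field names here are an assumption, not a documented schema):

```python
import json
from pathlib import Path

# Assumed field names; the real Codex auth.json schema is not formally documented.
REQUIRED_TOKEN_FIELDS = ("access_token", "refresh_token")

def validate_auth(data: dict) -> list[str]:
    """Return problems found in parsed auth.json contents (empty list = looks usable)."""
    tokens = data.get("tokens") or {}
    return [f"missing tokens.{f}" for f in REQUIRED_TOKEN_FIELDS if not tokens.get(f)]

def check_auth_file(path: str) -> list[str]:
    """Load an auth.json from disk and validate it."""
    p = Path(path)
    if not p.exists():
        return [f"{path} does not exist"]
    try:
        return validate_auth(json.loads(p.read_text()))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
```

Running `check_auth_file("/opt/chatgpt-proxies/auth/auth.json")` after the login step should return an empty list.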

---

## 3. ChatMock (upstream Docker assets exist)

Upstream ships `Dockerfile`, `docker-compose.yml`, `DOCKER.md`, `.env.example`. Plan:

1. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock`
2. `cp .env.example .env`; set `VERBOSE=false`. Either keep `CHATMOCK_IMAGE=storagetime/chatmock:latest` (prebuilt) or switch to `build: .` to build from the repo's own Dockerfile; I recommend `build: .` rather than trusting the third-party `storagetime/*` Docker Hub image.
3. **Login (one-time, interactive):**
`docker compose run --rm --service-ports chatmock-login login`
→ prints an auth URL; open it in a browser, complete the ChatGPT login, then paste the redirect URL back into the terminal. Tokens are written into the `chatmock_data` volume.
4. Optionally migrate the saved token to the shared bind-mount so `openai-oauth` can reuse it:
`docker run --rm -v chatmock_data:/src -v /opt/chatgpt-proxies/auth:/dst alpine cp /src/auth.json /dst/`
5. Switch the main service to use the bind-mount instead of the named volume (edit compose):
```yaml
    volumes:
      - /opt/chatgpt-proxies/auth:/data
      - ./prompt.md:/app/prompt.md:ro
```
6. Don't publish `8000:8000` on `0.0.0.0`. Replace with `127.0.0.1:8000:8000` (private-only) **or** drop the `ports:` block and let Traefik route to it on the default bridge using labels (see §5).
7. `docker compose up -d chatmock` and verify `curl http://127.0.0.1:8000/v1/models`.
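
For scripting that last check (e.g. from a cron probe), the `/v1/models` body is the standard OpenAI list envelope; a small parser sketch:

```python
import json

def model_ids(models_payload: str) -> list[str]:
    """Extract model ids from an OpenAI-style GET /v1/models response body."""
    data = json.loads(models_payload)
    return [m["id"] for m in data.get("data", [])]
```

Pipe the curl output into this (or fetch it with `urllib`) and assert the list is non-empty.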

---

## 4. openai-oauth (no upstream Docker; we author it)

Repo is a Bun/TS monorepo (`bun@1.2.18`, `turbo`, `tsup`). CLI entry: `packages/openai-oauth/src/cli.ts`, built to `dist/cli.js`, defaults to binding `127.0.0.1:10531` and reading OAuth from `~/.codex/auth.json` (overridable via `--oauth-file`, `--host`, `--port`).

**Dockerfile (authored by us, sketch):**
```dockerfile
FROM oven/bun:1.2.18-alpine AS build
WORKDIR /src
COPY . .
RUN bun install --frozen-lockfile && bun run build

FROM oven/bun:1.2.18-alpine
WORKDIR /app
COPY --from=build /src/packages/openai-oauth/dist ./dist
COPY --from=build /src/packages/openai-oauth/package.json ./package.json
COPY --from=build /src/node_modules ./node_modules
EXPOSE 10531
ENTRYPOINT ["bun", "dist/cli.js"]
CMD ["--host", "0.0.0.0", "--port", "10531", "--oauth-file", "/auth/auth.json"]
```

**docker-compose.yml:**
```yaml
services:
  openai-oauth:
    build: .
    container_name: openai-oauth
    restart: unless-stopped
    volumes:
      - /opt/chatgpt-proxies/auth:/auth:ro
    ports:
      - "127.0.0.1:10531:10531"   # private-only; see §5 for Traefik variant
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://127.0.0.1:10531/v1/models"]
      interval: 30s
      timeout: 5s
      retries: 3
```

**Login for openai-oauth:** the CLI intentionally does **not** ship a login flow. Options:
- (a) Reuse the `auth.json` created by the ChatMock login step (§3.4) — zero extra work.
- (b) On a machine with Node, run `npx @openai/codex login`, then `scp ~/.codex/auth.json root@2.24.201.210:/opt/chatgpt-proxies/auth/`.

---

## 5. Exposure (pick the tier from §0)

**Tier 1 — private only (recommended):**
- `ports:` are `127.0.0.1:8000:8000` and `127.0.0.1:10531:10531`.
- Access from your laptop via `ssh -L 8000:localhost:8000 -L 10531:localhost:10531 root@2.24.201.210`.
- No DNS / no Traefik route needed.

**Tier 2 — public hostname + basic-auth (solo use):**
- Decide subdomains (e.g. `chatmock.garzaos.cloud`, `openai-oauth.garzaos.cloud`) and point them at `2.24.201.210` (via Hostinger DNS / `$HOSTINGER_API_TOKEN`).
- Create a shared Docker network `web` and attach Traefik plus both services to it (dropping the host-network setup), **or** keep Traefik on host-net and attach the services to the default `bridge`. The latter works because Traefik reaches container IPs discovered via labels (confirmed by inspecting the existing `traefik-traefik-1` container).
- Add labels to each service:
```yaml
    labels:
      - traefik.enable=true
      - traefik.http.routers.chatmock.rule=Host(`chatmock.garzaos.cloud`)
      - traefik.http.routers.chatmock.entrypoints=websecure
      - traefik.http.routers.chatmock.tls.certresolver=letsencrypt
      - traefik.http.services.chatmock.loadbalancer.server.port=8000
      - traefik.http.routers.chatmock.middlewares=chatmock-auth
      - traefik.http.middlewares.chatmock-auth.basicauth.users=USER:$$2y$$...   # htpasswd bcrypt
```
- Same pattern for `openai-oauth` on `:10531`.
- Optional hardening middleware: `ipallowlist` for your home/office IP ranges.
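
One footgun in that label: Compose interpolates `$`, so every `$` in the bcrypt hash from `htpasswd -nB USER` must be doubled before it goes into the compose file. A tiny helper sketch (the middleware name mirrors the example above):

```python
def compose_escape(bcrypt_hash: str) -> str:
    """Double each '$' so docker compose does not treat it as variable interpolation."""
    return bcrypt_hash.replace("$", "$$")

def basicauth_label(user: str, bcrypt_hash: str, middleware: str = "chatmock-auth") -> str:
    """Build the Traefik basicauth users label for a compose file."""
    return (f"traefik.http.middlewares.{middleware}.basicauth.users="
            f"{user}:{compose_escape(bcrypt_hash)}")
```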

**Tier 3 — public + no auth:** same as tier 2, minus the `basicauth` / `ipallowlist` middlewares. Not recommended; requires explicit sign-off.

---

## 6. Concrete execution steps (once you approve a tier + scope)

1. `ssh root@2.24.201.210`
2. `mkdir -p /opt/chatgpt-proxies/{auth,chatmock,openai-oauth}`
3. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock`
4. `git clone https://github.com/EvanZhouDev/openai-oauth /opt/chatgpt-proxies/openai-oauth-src` → write our `Dockerfile` + `docker-compose.yml` into `/opt/chatgpt-proxies/openai-oauth/`, with the `build:` context pointing at the cloned source.
5. In `chatmock/`: `cp .env.example .env`, edit compose to `build: .` + bind-mount `/opt/chatgpt-proxies/auth:/data`, remove `ports:` (tier 2) or set `127.0.0.1:8000:8000` (tier 1). Add Traefik labels if tier 2.
6. `docker compose run --rm --service-ports chatmock-login login` → complete OAuth in browser → verify `/opt/chatgpt-proxies/auth/auth.json` exists.
7. (tier 2 only) `curl -u user:pass https://chatmock.garzaos.cloud/v1/models` to confirm cert issued and auth works.
8. `docker compose up -d chatmock` (in chatmock dir) and `docker compose up -d openai-oauth` (in openai-oauth dir).
9. Smoke test both:
- `curl http://127.0.0.1:8000/v1/models`
- `curl http://127.0.0.1:10531/v1/models`
- Chat completion: `curl http://127.0.0.1:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"gpt-5-codex","messages":[{"role":"user","content":"ping"}]}'`
10. Systemd isn't needed — `restart: unless-stopped` on both services is enough.
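
The step-9 smoke tests can be scripted with only the standard library; a sketch using the same endpoint and model slug as above:

```python
import json
import urllib.request

def chat_payload(model: str, content: str) -> dict:
    """Minimal OpenAI-style chat-completions body, matching the curl smoke test."""
    return {"model": model, "messages": [{"role": "user", "content": content}]}

def smoke_test(base_url: str, model: str = "gpt-5-codex") -> str:
    """POST a one-word ping to the proxy and return the assistant reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(chat_payload(model, "ping")).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Call `smoke_test("http://127.0.0.1:8000")` and `smoke_test("http://127.0.0.1:10531")` after both stacks are up.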

---

## 7. Day-2 ops

- **Updates:**
- ChatMock: `cd /opt/chatgpt-proxies/chatmock && git pull && docker compose build --pull && docker compose up -d`
- openai-oauth: same in its dir.
- **Token refresh:** both libraries auto-refresh using the refresh token in `auth.json`. If the ChatGPT session is forcibly signed out, re-run the ChatMock login (§3.3) — it'll rewrite `/opt/chatgpt-proxies/auth/auth.json` in place.
- **Backup:** `tar czf auth-backup.tgz /opt/chatgpt-proxies/auth` — `auth.json` is password-equivalent; store it like a secret.
- **Logs:** `docker compose logs -f chatmock` / `docker compose logs -f openai-oauth`. ChatMock has `VERBOSE=true` for deep request/stream logs.
- **Uninstall:** `docker compose down -v` in each dir and `rm -rf /opt/chatgpt-proxies`.

---

## 8. Things I need from you before executing

1. **Scope:** A (both), B (ChatMock only), or C (openai-oauth only)?
2. **Exposure tier:** 1 / 2 / 3 from §5?
3. **If tier 2:** which domain(s)? (I can auto-create the DNS records via `$HOSTINGER_API_TOKEN` once you name them, or you can point any subdomain you already own at `2.24.201.210`.)
4. **Confirm** you've read §0 and are okay with the "personal-use-only" policy trade-off of running these on a reachable server.

Once you answer those, I'll execute §6 end-to-end on the VPS and report back with working endpoints + smoke-test output.

**New file:** `services/chatmock-shim/README.md` (+90 lines)

# chatmock shim — ChatGPT OAuth proxy (`https://llm.garzaos.cloud/v1`)

OpenAI-compatible HTTP proxy that surfaces the user's interactive **ChatGPT** subscription as a
standard `/v1/chat/completions` + `/v1/responses` API. Based on the open-source
[`chatmock`](https://github.com/RayBytes/chatmock) project.

- **Endpoint:** https://llm.garzaos.cloud/v1
- **Host VPS:** Hostinger VPS `1589219` at IP `${SHIM_VPS_IP}` (separate from primary VPS)
- **Path on host:** `/opt/chatgpt-proxies/chatmock/`
- **Process:** Docker container `chatmock` fronted by Traefik

## Why this exists

Every autonomous-agent stack in this org (Nimbalyst / OpenHands / Agent Zero / Sim Studio / etc.)
accepts a custom OpenAI-compatible base URL. Pointing them at this shim routes every model call
through the user's ChatGPT Pro subscription → **$0 per request** regardless of caller.

## Custom patches applied

Two patches on top of upstream `chatmock`. Both are required for LiteLLM-based callers
(OpenHands, Agent Zero, etc.) to get working tool-call responses.

### 1. `model_registry.py` — alias non-`gpt-5*` slugs

LiteLLM auto-routes any `gpt-5*` model slug through the **Responses API** (`/v1/responses`), which
upstream ChatGPT returns as reasoning-only output (empty `output:[]`) when tools are defined.
Aliasing `chatgpt-4o` / `gpt-4o` to the upstream `gpt-5.4` lets callers pick a slug that
LiteLLM routes to `/v1/chat/completions` instead — which returns a proper message with tool calls.

```python
# /opt/chatgpt-proxies/chatmock/model_registry.py (lines 44-47)
ModelSpec(
    public_id="gpt-5.4",
    aliases=("gpt5.4", "gpt-5.4-latest", "chatgpt-4o", "gpt-4o", "chatgpt"),
    allowed_efforts=frozenset(("none", "low", "medium", "high", "xhigh")),
)
```
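
A sketch of the lookup this patch enables (the table below is a hypothetical mirror of the alias entries, not the registry's actual internals):

```python
# Hypothetical flattening of the patched alias table; upstream internals may differ.
ALIASES = {
    "gpt5.4": "gpt-5.4",
    "gpt-5.4-latest": "gpt-5.4",
    "chatgpt-4o": "gpt-5.4",
    "gpt-4o": "gpt-5.4",
    "chatgpt": "gpt-5.4",
}

def resolve_model(slug: str) -> str:
    """Map any accepted alias to the upstream public id (identity for unknown slugs)."""
    return ALIASES.get(slug, slug)
```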

### 2. `responses_api.py` — strip rejected params

Upstream ChatGPT Responses API rejects several OpenAI-style params that LiteLLM always sends.
Strip them server-side before forwarding.

```python
# /opt/chatgpt-proxies/chatmock/responses_api.py (lines 91-95)
normalized.pop("max_output_tokens", None)
# Strip params that OpenAI Responses API rejects for gpt-5 reasoning models
for _k in ("temperature", "top_p", "frequency_penalty", "presence_penalty",
           "logit_bias", "logprobs", "top_logprobs"):
    normalized.pop(_k, None)
```
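
The same strip logic can be exercised in isolation; a non-mutating sketch mirroring the patch:

```python
# Same param list as the patch; applied to a copy instead of in place.
REJECTED_PARAMS = ("max_output_tokens", "temperature", "top_p", "frequency_penalty",
                   "presence_penalty", "logit_bias", "logprobs", "top_logprobs")

def strip_rejected(payload: dict) -> dict:
    """Return a copy of the request body with upstream-rejected params removed."""
    return {k: v for k, v in payload.items() if k not in REJECTED_PARAMS}
```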

## Known routing behavior

| Caller | Model slug | Path used | Works? |
|---|---|---|---|
| LiteLLM (via OpenHands) | `openai/gpt-5.4` | `/v1/responses` | No — empty output when tools defined |
| LiteLLM (via OpenHands) | `openai/chatgpt-4o` | `/v1/chat/completions` | **Yes** — this is the slug to use |
| Codex CLI (Nimbalyst) | `gpt-5.4` | `/v1/responses` | Yes — Codex handles reasoning-only output |
| Direct `curl` | any | either | Yes |
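
The routing column reduces to a simple prefix rule; a sketch approximating the observed LiteLLM behavior (an approximation of what the table shows, not LiteLLM's actual implementation):

```python
def litellm_route(slug: str) -> str:
    """Approximate the observed routing: gpt-5* slugs go to the Responses API."""
    model = slug.split("/", 1)[-1]  # drop an "openai/" provider prefix if present
    return "/v1/responses" if model.startswith("gpt-5") else "/v1/chat/completions"
```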

## Verification

```bash
curl -sS https://llm.garzaos.cloud/v1/chat/completions \
  -H "Authorization: Bearer $CUSTOM_LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"chatgpt-4o","messages":[{"role":"user","content":"reply PONG"}]}' \
  | jq -r '.choices[0].message.content'
# Expected: PONG
```

## Install plan

See `INSTALL-PLAN.md` for the original provisioning walkthrough (OAuth setup, Traefik route,
container lifecycle).

## Required secrets

Never commit to this repo:

| Variable | Description |
|---|---|
| `CUSTOM_LLM_API_KEY` | Bearer token accepted by the shim (issued by chatmock on deploy) |
| `CHATGPT_OAUTH_REFRESH_TOKEN` | OAuth refresh token for upstream ChatGPT |
| `SHIM_VPS_IP` | IP of the Hostinger VPS hosting this shim |

## Related

- `stacks/nimbalyst/` — Codex/Claude-Code web-desktop pointed at this shim
- `services/openhands-byok/` — OpenHands Cloud BYOK config pointed at this shim

**New file:** `services/openhands-byok/README.md` (+72 lines)

# OpenHands Cloud — BYOK + MCP configuration

How OpenHands Cloud at https://app.all-hands.dev is configured to run against the chatmock shim
(ChatGPT-backed, $0/request) with three additional MCP tools.

## LLM (BYOK)

Set via **Settings → LLM → Advanced**:

| Field | Value |
|---|---|
| **Custom Model** | `openai/chatgpt-4o` |
| **Base URL** | `https://llm.garzaos.cloud/v1` |
| **API Key** | `${CUSTOM_LLM_API_KEY}` |

### Why `chatgpt-4o` and not `gpt-5.4`

LiteLLM (used by OpenHands) auto-routes any `gpt-5*` model slug through the Responses API
(`/v1/responses`). The upstream ChatGPT backend returns empty `output:[]` on that path when
tools are defined, causing OpenHands to loop with *"Your last response did not include a function
call or a message."*

The shim's `model_registry.py` was patched to accept `chatgpt-4o` as an alias → LiteLLM routes
non-`gpt-5*` slugs through `/v1/chat/completions`, which returns proper tool-call messages.

See `services/chatmock-shim/README.md` for shim details.

## MCP tools

Configured via **Settings → MCP**. All three are official upstream MCP servers.

| Server | Transport | Config |
|---|---|---|
| **E2B** (code sandbox) | `STDIO` | Command: `uvx` · Args: `e2b-mcp-server` · Env: `E2B_API_KEY=${E2B_API_KEY}` |
| **Firecrawl** (web scrape/crawl) | `SHTTP` | URL: `https://mcp.firecrawl.dev/${FIRECRAWL_API_KEY}/v2/mcp` |
| **Tavily** (AI search) | `SHTTP` | URL: `https://mcp.tavily.com/mcp/?tavilyApiKey=${TAVILY_API_KEY}` |

Pre-existing MCP servers (left in place):
- `garza-mcp-unified` (SHTTP) — GARZA OS unified MCP gateway
- `rube.app/mcp` (SHTTP) — rube.app toolbox

## Required secrets (do not commit)

| Variable | Source |
|---|---|
| `CUSTOM_LLM_API_KEY` | chatmock shim (Bearer token) |
| `E2B_API_KEY` | https://e2b.dev/dashboard?tab=keys |
| `FIRECRAWL_API_KEY` | https://www.firecrawl.dev/app/api-keys |
| `TAVILY_API_KEY` | https://app.tavily.com/home |

All four keys are recoverable from `/opt/surfsense/.env` on the primary VPS.

## Files

| Path | Purpose |
|---|---|
| `TEST-PLAN.md` | End-to-end BYOK verification plan (PONG round-trip, 4 assertions) |
| `TEST-REPORT.md` | Verification results — all 4 assertions passed |

## Verification (LLM path only)

Start a new conversation at https://app.all-hands.dev and send:

> Reply with the single word PONG and nothing else. Do not use any tools.

Expected: reply contains `PONG`, model badge shows `openai/chatgpt-4o`, shim logs on the shim VPS
show `POST /v1/chat/completions HTTP/1.1 200` at the reply timestamp.

## Related

- `services/chatmock-shim/` — the LLM proxy OpenHands points at
- `stacks/nimbalyst/` — same BYOK pattern for Codex CLI on a self-hosted web-desktop

**New file:** `services/openhands-byok/TEST-PLAN.md` (+24 lines)

# OpenHands Cloud BYOK — End-to-end Test Plan (v3)

## What changed since v2
- Shim `model_registry.py`: added `chatgpt-4o` as an alias of upstream `gpt-5.4`. Container rebuilt + restarted. Direct curl with `model=chatgpt-4o` + function tools returns `content:"PONG"` (no empty output).
- OpenHands Cloud → Settings → LLM (Advanced) → **Custom Model** changed from `openai/gpt-5.4` → `openai/chatgpt-4o`. Settings saved.
- Rationale: LiteLLM auto-routes `gpt-5*` slugs through `/v1/responses` (Responses API). A non-`gpt-5*` slug like `chatgpt-4o` routes through `/v1/chat/completions`, which the shim has always handled correctly.

## Primary flow
1. Start a new conversation on app.all-hands.dev (existing ones may be pinned to old settings per "restart to see changes" toast).
2. Send prompt: `Reply with the single word PONG and nothing else. Do not use any tools.`
3. Wait ≤ 90 s.

## Key assertions
- **A1** — No LiteLLM error card (`BadRequestError`, `Unsupported parameter`, `APIConnectionError`, `AuthenticationError`) appears in the conversation.
- **A2** — Agent produces a final assistant reply whose visible text contains `PONG` within 90 s. NO looping `"Your last response did not include a function call or a message"` errors.
- **A3** — No OpenHands-credits / trial / "configure LLM" banner/toast. (Would indicate BYOK was silently bypassed.)
- **A4** — `chatmock` container on VPS `2.24.201.210` logs show `POST /v1/chat/completions` (NOT `/v1/responses`) with `model: chatgpt-4o` at the timestamp of the reply.
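
A4 can be checked mechanically against `docker compose logs chatmock` output; a sketch (assumes the shim emits standard access-log lines containing the method and path):

```python
def a4_holds(log_lines: list[str]) -> bool:
    """A4: at least one chat-completions POST and no Responses-API POST in the logs."""
    chat = any("POST /v1/chat/completions" in line for line in log_lines)
    responses = any("POST /v1/responses" in line for line in log_lines)
    return chat and not responses
```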

## Out-of-scope
- Tool-use capability (the prompt explicitly asks for no tools). A follow-up test with an actual tool task can come later.
- Multi-turn reasoning. This is the minimum sanity check that BYOK routing works end-to-end.

## Pass/fail
Pass = A1 + A2 + A3 + A4 all true. Anything else = fail; report cause.