From c5660bb8df6118004a26a1ea56b8b2b6f99d3892 Mon Sep 17 00:00:00 2001 From: "garzasecure@pm.me" Date: Sun, 19 Apr 2026 18:51:25 +0000 Subject: [PATCH] Add Nimbalyst, chatmock-shim, and OpenHands BYOK artifacts MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - stacks/nimbalyst/ — Dockerfile + compose for web-desktop Codex+Claude-Code routed to chatmock shim (Cursor-Cloud-Agents replacement at $0/request). - services/chatmock-shim/ — documentation of the two custom patches applied to upstream chatmock (model_registry aliases + responses_api param strip). - services/openhands-byok/ — BYOK + MCP (E2B/Firecrawl/Tavily) configuration for OpenHands Cloud pointed at the chatmock shim. All secrets scrubbed and replaced with ${VAR} placeholders. --- services/chatmock-shim/INSTALL-PLAN.md | 187 +++++++++++++++++++++++++ services/chatmock-shim/README.md | 90 ++++++++++++ services/openhands-byok/README.md | 72 ++++++++++ services/openhands-byok/TEST-PLAN.md | 24 ++++ services/openhands-byok/TEST-REPORT.md | 58 ++++++++ stacks/nimbalyst/Dockerfile | 115 +++++++++++++++ stacks/nimbalyst/README.md | 70 +++++++++ stacks/nimbalyst/TEST-PLAN.md | 47 +++++++ stacks/nimbalyst/TEST-REPORT.md | 47 +++++++ stacks/nimbalyst/docker-compose.yml | 24 ++++ 10 files changed, 734 insertions(+) create mode 100644 services/chatmock-shim/INSTALL-PLAN.md create mode 100644 services/chatmock-shim/README.md create mode 100644 services/openhands-byok/README.md create mode 100644 services/openhands-byok/TEST-PLAN.md create mode 100644 services/openhands-byok/TEST-REPORT.md create mode 100644 stacks/nimbalyst/Dockerfile create mode 100644 stacks/nimbalyst/README.md create mode 100644 stacks/nimbalyst/TEST-PLAN.md create mode 100644 stacks/nimbalyst/TEST-REPORT.md create mode 100644 stacks/nimbalyst/docker-compose.yml diff --git a/services/chatmock-shim/INSTALL-PLAN.md b/services/chatmock-shim/INSTALL-PLAN.md new file mode 100644 index 0000000..e4bb7db --- 
/dev/null +++ b/services/chatmock-shim/INSTALL-PLAN.md @@ -0,0 +1,187 @@ +# Install plan — `openai-oauth` + `ChatMock` on `root@2.24.201.210` (Docker) + +**Target host:** `srv1589219.hstgr.cloud` / `2.24.201.210` +**State today:** Ubuntu 24.04.4, Docker 29.4.0 + Compose v5.1.2, only `traefik:latest` running (host network, HTTP→HTTPS redirect, Let's Encrypt HTTP-01 resolver named `letsencrypt`, `--providers.docker.exposedbydefault=false`). Ports 22/80/443 in use, 7.8 GB RAM, 83 GB free. + +--- + +## 0. Read-this-first — security & policy + +Both projects expose an **unauthenticated OpenAI-compatible endpoint** backed by *your personal ChatGPT OAuth tokens* (the same `auth.json` Codex uses). Anyone who can reach the endpoint gets to spend your ChatGPT rate limits. + +- `openai-oauth`'s own README: *"Use only for personal, local experimentation on trusted machines; **do not run as a hosted service**, do not share access, do not pool or redistribute tokens."* Running it on a public VPS behind a domain is exactly what that line warns against and risks OpenAI rate-limiting or suspending the account. +- `ChatMock` uses the same tokens; the README is silent but the risk is identical. + +**Recommendation (baked into the plan below):** do not expose `/v1/*` to the public internet unauthenticated. Pick **one** of: +1. **Private-only** — bind to `127.0.0.1`, access via SSH tunnel / Tailscale / WireGuard. *(Safest, recommended.)* +2. **Public URL + Traefik basic-auth / IP allowlist middleware** — fine for solo use, still against `openai-oauth`'s stated policy. +3. **Public + no auth** — not recommended; flagged for explicit sign-off only. + +Please tell me which tier you want before I execute. + +--- + +## 1. Scope question for you + +`openai-oauth` and `ChatMock` do **the same thing** (localhost proxy → `chatgpt.com/backend-api/codex/responses` using your Codex OAuth). Choose one of: + +- **A. 
Both side-by-side** (different ports, share the same `~/.codex/auth.json` via bind-mount). Useful for A/B testing. +- **B. ChatMock only** — it already ships a working `Dockerfile` + `docker-compose.yml` + login flow. Least work. +- **C. openai-oauth only** — newer, Bun/TS, no upstream Docker support (we'd author it). + +Default in this plan = **A (both)**, with a note on what to drop if you pick B or C. + +--- + +## 2. Layout on the VPS + +``` +/opt/chatgpt-proxies/ +├── auth/ # shared ChatGPT OAuth credentials +│ └── auth.json # created by the login step below +├── chatmock/ +│ └── docker-compose.yml # from upstream, lightly edited +│ └── .env # from .env.example +└── openai-oauth/ + ├── Dockerfile # authored by us (see §4) + ├── docker-compose.yml # authored by us + └── .dockerignore +``` + +One shared `auth.json` is mounted read-only into both containers at the path each expects (`/data/auth.json` for ChatMock via `CHATGPT_LOCAL_HOME=/data`; `/root/.codex/auth.json` for openai-oauth, overridable with `--oauth-file`). + +--- + +## 3. ChatMock (upstream Docker assets exist) + +Upstream ships `Dockerfile`, `docker-compose.yml`, `DOCKER.md`, `.env.example`. Plan: + +1. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock` +2. `cp .env.example .env`; set `VERBOSE=false`, keep `CHATMOCK_IMAGE=storagetime/chatmock:latest` (prebuilt) **or** switch to `build: .` to pin to the repo's own Dockerfile (safer than trusting the `storagetime/*` Docker Hub image — I recommend `build: .`). +3. **Login (one-time, interactive):** + `docker compose run --rm --service-ports chatmock-login login` + → prints an auth URL, you paste it into a browser, complete ChatGPT login, paste the redirect URL back. Tokens are written into the `chatmock_data` volume. +4. Optionally migrate the saved token to the shared bind-mount so `openai-oauth` can reuse it: + `docker run --rm -v chatmock_data:/src -v /opt/chatgpt-proxies/auth:/dst alpine cp /src/auth.json /dst/` +5. 
Switch the main service to use the bind-mount instead of the named volume (edit compose): + ```yaml + volumes: + - /opt/chatgpt-proxies/auth:/data + - ./prompt.md:/app/prompt.md:ro + ``` +6. Don't publish `8000:8000` on `0.0.0.0`. Replace with `127.0.0.1:8000:8000` (private-only) **or** drop the `ports:` block and let Traefik route to it on the default bridge using labels (see §5). +7. `docker compose up -d chatmock` and verify `curl http://127.0.0.1:8000/v1/models`. + +--- + +## 4. openai-oauth (no upstream Docker; we author it) + +Repo is a Bun/TS monorepo (`bun@1.2.18`, `turbo`, `tsup`). CLI entry: `packages/openai-oauth/src/cli.ts`, built to `dist/cli.js`, defaults to binding `127.0.0.1:10531` and reading OAuth from `~/.codex/auth.json` (overridable via `--oauth-file`, `--host`, `--port`). + +**Dockerfile (authored by us, sketch):** +```dockerfile +FROM oven/bun:1.2.18-alpine AS build +WORKDIR /src +COPY . . +RUN bun install --frozen-lockfile && bun run build + +FROM oven/bun:1.2.18-alpine +WORKDIR /app +COPY --from=build /src/packages/openai-oauth/dist ./dist +COPY --from=build /src/packages/openai-oauth/package.json ./package.json +COPY --from=build /src/node_modules ./node_modules +EXPOSE 10531 +ENTRYPOINT ["bun", "dist/cli.js"] +CMD ["--host", "0.0.0.0", "--port", "10531", "--oauth-file", "/auth/auth.json"] +``` + +**docker-compose.yml:** +```yaml +services: + openai-oauth: + build: . + container_name: openai-oauth + restart: unless-stopped + volumes: + - /opt/chatgpt-proxies/auth:/auth:ro + ports: + - "127.0.0.1:10531:10531" # private-only; see §5 for Traefik variant + healthcheck: + test: ["CMD", "wget", "-qO-", "http://127.0.0.1:10531/v1/models"] + interval: 30s + timeout: 5s + retries: 3 +``` + +**Login for openai-oauth:** the CLI intentionally does **not** ship a login flow. Options: +- (a) Reuse the `auth.json` created by the ChatMock login step (§3.4) — zero extra work. 
+- (b) On a machine with Node, run `npx @openai/codex login`, then `scp ~/.codex/auth.json root@2.24.201.210:/opt/chatgpt-proxies/auth/`. + +--- + +## 5. Exposure (pick the tier from §0) + +**Tier 1 — private only (recommended):** +- `ports:` are `127.0.0.1:8000:8000` and `127.0.0.1:10531:10531`. +- Access from your laptop via `ssh -L 8000:localhost:8000 -L 10531:localhost:10531 root@2.24.201.210`. +- No DNS / no Traefik route needed. + +**Tier 2 — public hostname + basic-auth (solo use):** +- Decide subdomains (e.g. `chatmock.garzaos.cloud`, `openai-oauth.garzaos.cloud`) and point them at `2.24.201.210` (via Hostinger DNS / `$HOSTINGER_API_TOKEN`). +- Create a shared Docker network `web`, attach Traefik + both services to it, drop the host-network setup **or** keep Traefik on host-net and attach services to the default `bridge` (works because Traefik talks to container IPs via labels; confirmed by inspecting the existing `traefik-traefik-1` container). +- Add labels to each service: + ```yaml + labels: + - traefik.enable=true + - traefik.http.routers.chatmock.rule=Host(`chatmock.garzaos.cloud`) + - traefik.http.routers.chatmock.entrypoints=websecure + - traefik.http.routers.chatmock.tls.certresolver=letsencrypt + - traefik.http.services.chatmock.loadbalancer.server.port=8000 + - traefik.http.routers.chatmock.middlewares=chatmock-auth + - traefik.http.middlewares.chatmock-auth.basicauth.users=USER:$$2y$$... # htpasswd bcrypt + ``` +- Same pattern for `openai-oauth` on `:10531`. +- Optional hardening middleware: `ipallowlist` for your home/office IP ranges. + +**Tier 3 — public + no auth:** same as tier 2, minus the `basicauth` / `ipallowlist` middlewares. Not recommended; requires explicit sign-off. + +--- + +## 6. Concrete execution steps (once you approve a tier + scope) + +1. `ssh root@2.24.201.210` +2. `mkdir -p /opt/chatgpt-proxies/{auth,chatmock,openai-oauth}` +3. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock` +4. 
`git clone https://github.com/EvanZhouDev/openai-oauth /opt/chatgpt-proxies/openai-oauth-src` → write our `Dockerfile` + `docker-compose.yml` into `/opt/chatgpt-proxies/openai-oauth/` that `build:` points at the cloned source. +5. In `chatmock/`: `cp .env.example .env`, edit compose to `build: .` + bind-mount `/opt/chatgpt-proxies/auth:/data`, remove `ports:` (tier 2) or set `127.0.0.1:8000:8000` (tier 1). Add Traefik labels if tier 2. +6. `docker compose run --rm --service-ports chatmock-login login` → complete OAuth in browser → verify `/opt/chatgpt-proxies/auth/auth.json` exists. +7. `docker compose up -d chatmock` (in chatmock dir) and `docker compose up -d openai-oauth` (in openai-oauth dir). +8. (tier 2 only) `curl -u user:pass https://chatmock.garzaos.cloud/v1/models` to confirm the cert is issued and auth works (requires the services from step 7 to be running). +9. Smoke test both: + - `curl http://127.0.0.1:8000/v1/models` + - `curl http://127.0.0.1:10531/v1/models` + - Chat completion: `curl http://127.0.0.1:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"gpt-5-codex","messages":[{"role":"user","content":"ping"}]}'` +10. Systemd isn't needed — `restart: unless-stopped` on both services is enough. + +--- + +## 7. Day-2 ops + +- **Updates:** + - ChatMock: `cd /opt/chatgpt-proxies/chatmock && git pull && docker compose build --pull && docker compose up -d` + - openai-oauth: same in its dir. +- **Token refresh:** both libraries auto-refresh using the refresh token in `auth.json`. If the ChatGPT session is forcibly signed out, re-run the ChatMock login (§3.3) — it'll rewrite `/opt/chatgpt-proxies/auth/auth.json` in place. +- **Backup:** `tar czf auth-backup.tgz /opt/chatgpt-proxies/auth` — `auth.json` is password-equivalent; store it like a secret. +- **Logs:** `docker compose logs -f chatmock` / `docker compose logs -f openai-oauth`. ChatMock has `VERBOSE=true` for deep request/stream logs. +- **Uninstall:** `docker compose down -v` in each dir and `rm -rf /opt/chatgpt-proxies`. 
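One recurring tier-2 gotcha worth noting next to the §5 labels: Compose interpolates `$`, so every `$` in the bcrypt hash must be doubled before pasting it into a `basicauth.users` label. A minimal bash sketch (the hash below is a placeholder, not a real credential; generate a real entry with `htpasswd -nbB user 'pass'` from `apache2-utils`):

```shell
# Placeholder htpasswd entry standing in for real `htpasswd -nbB` output.
ENTRY='user:$2y$05$N9qo8uLOickgx2ZMRZoMye'
# Double every `$` so docker-compose does not treat `$2y`, `$05`, ... as variables.
printf '%s\n' "${ENTRY//\$/\$\$}"
# → user:$$2y$$05$$N9qo8uLOickgx2ZMRZoMye
```

Paste the doubled form into the `traefik.http.middlewares.*.basicauth.users` label; Traefik sees the single-`$` hash again after Compose interpolation.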
+ +--- + +## 8. Things I need from you before executing + +1. **Scope:** A (both), B (ChatMock only), or C (openai-oauth only)? +2. **Exposure tier:** 1 / 2 / 3 from §5? +3. **If tier 2:** which domain(s)? (I can auto-create the DNS records via `$HOSTINGER_API_TOKEN` once you name them, or you can point any subdomain you already own at `2.24.201.210`.) +4. **Confirm** you've read §0 and are okay with the "personal-use-only" policy trade-off of running these on a reachable server. + +Once you answer those, I'll execute §6 end-to-end on the VPS and report back with working endpoints + smoke-test output. diff --git a/services/chatmock-shim/README.md b/services/chatmock-shim/README.md new file mode 100644 index 0000000..dfe9eef --- /dev/null +++ b/services/chatmock-shim/README.md @@ -0,0 +1,90 @@ +# chatmock shim — ChatGPT OAuth proxy (`https://llm.garzaos.cloud/v1`) + +OpenAI-compatible HTTP proxy that surfaces the user's interactive **ChatGPT** subscription as a +standard `/v1/chat/completions` + `/v1/responses` API. Based on the open-source +[`chatmock`](https://github.com/RayBytes/chatmock) project. + +- **Endpoint:** https://llm.garzaos.cloud/v1 +- **Host VPS:** Hostinger VPS `1589219` at IP `${SHIM_VPS_IP}` (separate from primary VPS) +- **Path on host:** `/opt/chatgpt-proxies/chatmock/` +- **Process:** Docker container `chatmock` fronted by Traefik + +## Why this exists + +Every autonomous-agent stack in this org (Nimbalyst / OpenHands / Agent Zero / Sim Studio / etc.) +accepts a custom OpenAI-compatible base URL. Pointing them at this shim routes every model call +through the user's ChatGPT Pro subscription → **$0 per request** regardless of caller. + +## Custom patches applied + +Two patches on top of upstream `chatmock`. Both are required for LiteLLM-based callers +(OpenHands, Agent Zero, etc.) to get working tool-call responses. + +### 1. 
`model_registry.py` — alias non-`gpt-5*` slugs + +LiteLLM auto-routes any `gpt-5*` model slug through the **Responses API** (`/v1/responses`), which +upstream ChatGPT returns as reasoning-only output (empty `output:[]`) when tools are defined. +Aliasing `chatgpt-4o` / `gpt-4o` to the upstream `gpt-5.4` lets callers pick a slug that +LiteLLM routes to `/v1/chat/completions` instead — which returns a proper message with tool calls. + +```python +# /opt/chatgpt-proxies/chatmock/model_registry.py (lines 44-47) +ModelSpec( + public_id="gpt-5.4", + aliases=("gpt5.4", "gpt-5.4-latest", "chatgpt-4o", "gpt-4o", "chatgpt"), + allowed_efforts=frozenset(("none", "low", "medium", "high", "xhigh")), +``` + +### 2. `responses_api.py` — strip rejected params + +Upstream ChatGPT Responses API rejects several OpenAI-style params that LiteLLM always sends. +Strip them server-side before forwarding. + +```python +# /opt/chatgpt-proxies/chatmock/responses_api.py (lines 91-95) +normalized.pop("max_output_tokens", None) +# Strip params that OpenAI Responses API rejects for gpt-5 reasoning models +for _k in ("temperature", "top_p", "frequency_penalty", "presence_penalty", + "logit_bias", "logprobs", "top_logprobs"): + normalized.pop(_k, None) +``` + +## Known routing behavior + +| Caller | Model slug | Path used | Works? 
| +|---|---|---|---| +| LiteLLM (via OpenHands) | `openai/gpt-5.4` | `/v1/responses` | No — empty output when tools defined | +| LiteLLM (via OpenHands) | `openai/chatgpt-4o` | `/v1/chat/completions` | **Yes** — this is the slug to use | +| Codex CLI (Nimbalyst) | `gpt-5.4` | `/v1/responses` | Yes — Codex handles reasoning-only output | +| Direct `curl` | any | either | Yes | + +## Verification + +```bash +curl -sS https://llm.garzaos.cloud/v1/chat/completions \ + -H "Authorization: Bearer $CUSTOM_LLM_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"model":"chatgpt-4o","messages":[{"role":"user","content":"reply PONG"}]}' \ + | jq -r '.choices[0].message.content' +# Expected: PONG +``` + +## Install plan + +See `INSTALL-PLAN.md` for the original provisioning walkthrough (OAuth setup, Traefik route, +container lifecycle). + +## Required secrets + +Never commit to this repo: + +| Variable | Description | +|---|---| +| `CUSTOM_LLM_API_KEY` | Bearer token accepted by the shim (issued by chatmock on deploy) | +| `CHATGPT_OAUTH_REFRESH_TOKEN` | OAuth refresh token for upstream ChatGPT | +| `SHIM_VPS_IP` | IP of the Hostinger VPS hosting this shim | + +## Related + +- `stacks/nimbalyst/` — Codex/Claude-Code web-desktop pointed at this shim +- `services/openhands-byok/` — OpenHands Cloud BYOK config pointed at this shim diff --git a/services/openhands-byok/README.md b/services/openhands-byok/README.md new file mode 100644 index 0000000..0e2d4cb --- /dev/null +++ b/services/openhands-byok/README.md @@ -0,0 +1,72 @@ +# OpenHands Cloud — BYOK + MCP configuration + +How OpenHands Cloud at https://app.all-hands.dev is configured to run against the chatmock shim +(ChatGPT-backed, $0/request) with three additional MCP tools. 
+ +## LLM (BYOK) + +Set via **Settings → LLM → Advanced**: + +| Field | Value | +|---|---| +| **Custom Model** | `openai/chatgpt-4o` | +| **Base URL** | `https://llm.garzaos.cloud/v1` | +| **API Key** | `${CUSTOM_LLM_API_KEY}` | + +### Why `chatgpt-4o` and not `gpt-5.4` + +LiteLLM (used by OpenHands) auto-routes any `gpt-5*` model slug through the Responses API +(`/v1/responses`). The upstream ChatGPT backend returns empty `output:[]` on that path when +tools are defined, causing OpenHands to loop with *"Your last response did not include a function +call or a message."* + +The shim's `model_registry.py` was patched to accept `chatgpt-4o` as an alias → LiteLLM routes +non-`gpt-5*` slugs through `/v1/chat/completions`, which returns proper tool-call messages. + +See `services/chatmock-shim/README.md` for shim details. + +## MCP tools + +Configured via **Settings → MCP**. All three are official upstream MCP servers. + +| Server | Transport | Config | +|---|---|---| +| **E2B** (code sandbox) | `STDIO` | Command: `uvx` · Args: `e2b-mcp-server` · Env: `E2B_API_KEY=${E2B_API_KEY}` | +| **Firecrawl** (web scrape/crawl) | `SHTTP` | URL: `https://mcp.firecrawl.dev/${FIRECRAWL_API_KEY}/v2/mcp` | +| **Tavily** (AI search) | `SHTTP` | URL: `https://mcp.tavily.com/mcp/?tavilyApiKey=${TAVILY_API_KEY}` | + +Pre-existing MCP servers (left in place): +- `garza-mcp-unified` (SHTTP) — GARZA OS unified MCP gateway +- `rube.app/mcp` (SHTTP) — rube.app toolbox + +## Required secrets (do not commit) + +| Variable | Source | +|---|---| +| `CUSTOM_LLM_API_KEY` | chatmock shim (Bearer token) | +| `E2B_API_KEY` | https://e2b.dev/dashboard?tab=keys | +| `FIRECRAWL_API_KEY` | https://www.firecrawl.dev/app/api-keys | +| `TAVILY_API_KEY` | https://app.tavily.com/home | + +All four keys are recoverable from `/opt/surfsense/.env` on the primary VPS. 
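The slug-based routing described under "Why `chatgpt-4o` and not `gpt-5.4`" can be sketched as a toy rule. This is illustrative only: it paraphrases the behavior documented in these notes (the `o1*`/`o3*` prefixes come from the companion TEST-REPORT), not LiteLLM's actual source.

```python
def litellm_route(slug: str) -> str:
    """Toy model of the slug-based routing described in these notes:
    gpt-5*/o1*/o3* slugs go to the Responses API, everything else to
    chat completions. Not LiteLLM's real implementation."""
    bare = slug.split("/", 1)[-1]  # drop the "openai/" provider prefix
    if bare.startswith(("gpt-5", "o1", "o3")):
        return "/v1/responses"
    return "/v1/chat/completions"

# The broken path (empty output when tools are defined) vs. the alias fix:
assert litellm_route("openai/gpt-5.4") == "/v1/responses"
assert litellm_route("openai/chatgpt-4o") == "/v1/chat/completions"
```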
+ +## Files + +| Path | Purpose | +|---|---| +| `TEST-PLAN.md` | End-to-end BYOK verification plan (PONG round-trip, 4 assertions) | +| `TEST-REPORT.md` | Verification results — all 4 assertions passed | + +## Verification (LLM path only) + +Start a new conversation at https://app.all-hands.dev and send: + +> Reply with the single word PONG and nothing else. Do not use any tools. + +Expected: reply contains `PONG`, model badge shows `openai/chatgpt-4o`, shim logs on the shim VPS +show `POST /v1/chat/completions HTTP/1.1 200` at the reply timestamp. + +## Related + +- `services/chatmock-shim/` — the LLM proxy OpenHands points at +- `stacks/nimbalyst/` — same BYOK pattern for Codex CLI on a self-hosted web-desktop diff --git a/services/openhands-byok/TEST-PLAN.md b/services/openhands-byok/TEST-PLAN.md new file mode 100644 index 0000000..4c593b9 --- /dev/null +++ b/services/openhands-byok/TEST-PLAN.md @@ -0,0 +1,24 @@ +# OpenHands Cloud BYOK — End-to-end Test Plan (v3) + +## What changed since v2 +- Shim `model_registry.py`: added `chatgpt-4o` as an alias of upstream `gpt-5.4`. Container rebuilt + restarted. Direct curl with `model=chatgpt-4o` + function tools returns `content:"PONG"` (no empty output). +- OpenHands Cloud → Settings → LLM (Advanced) → **Custom Model** changed from `openai/gpt-5.4` → `openai/chatgpt-4o`. Settings saved. +- Rationale: LiteLLM auto-routes `gpt-5*` slugs through `/v1/responses` (Responses API). A non-`gpt-5*` slug like `chatgpt-4o` routes through `/v1/chat/completions`, which the shim has always handled correctly. + +## Primary flow +1. Start a new conversation on app.all-hands.dev (existing ones may be pinned to old settings per "restart to see changes" toast). +2. Send prompt: `Reply with the single word PONG and nothing else. Do not use any tools.` +3. Wait ≤ 90 s. + +## Key assertions +- **A1** — No LiteLLM error card (`BadRequestError`, `Unsupported parameter`, `APIConnectionError`, `AuthenticationError`) appears in the conversation. 
+- **A2** — Agent produces a final assistant reply whose visible text contains `PONG` within 90 s. NO looping `"Your last response did not include a function call or a message"` errors. +- **A3** — No OpenHands-credits / trial / "configure LLM" banner/toast. (Would indicate BYOK was silently bypassed.) +- **A4** — `chatmock` container on VPS `2.24.201.210` logs show `POST /v1/chat/completions` (NOT `/v1/responses`) with `model: chatgpt-4o` at the timestamp of the reply. + +## Out-of-scope +- Tool-use capability (the prompt explicitly asks for no tools). A follow-up test with an actual tool task can come later. +- Multi-turn reasoning. This is the minimum sanity check that BYOK routing works end-to-end. + +## Pass/fail +Pass = A1 + A2 + A3 + A4 all true. Anything else = fail; report cause. diff --git a/services/openhands-byok/TEST-REPORT.md b/services/openhands-byok/TEST-REPORT.md new file mode 100644 index 0000000..a134b86 --- /dev/null +++ b/services/openhands-byok/TEST-REPORT.md @@ -0,0 +1,58 @@ +# OpenHands Cloud — BYOK end-to-end test report + +**Date:** 2026-04-19 +**URL tested:** https://app.all-hands.dev +**Conversation:** https://app.all-hands.dev/conversations/83566a8f86c24b93a9808af9822553d7 +**BYOK slug:** `openai/chatgpt-4o` → `https://llm.garzaos.cloud/v1` (shim at `2.24.201.210`) + +## One-line summary +Sent the PONG prompt in a fresh OpenHands Cloud conversation; agent replied `PONG`, and shim logs confirm the request hit `/v1/chat/completions` with model `chatgpt-4o` — $0/request via the ChatGPT subscription. 
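Assertion A4 can also be checked mechanically rather than by eyeballing the log. A sketch, run against a captured excerpt (same werkzeug access-log format as the raw-evidence section) so it is self-contained; in practice, feed it `docker logs chatmock --since=5m` instead:

```python
# Mechanical A4 check: every proxied POST in the log window must hit
# /v1/chat/completions, and none may hit /v1/responses.
log_lines = """\
127.0.0.1 - - [19/Apr/2026 14:58:37] "GET /health HTTP/1.1" 200 -
172.16.1.3 - - [19/Apr/2026 14:58:38] "POST /v1/chat/completions HTTP/1.1" 200 -
172.16.1.3 - - [19/Apr/2026 14:58:40] "POST /v1/chat/completions HTTP/1.1" 200 -
127.0.0.1 - - [19/Apr/2026 14:58:52] "GET /health HTTP/1.1" 200 -
""".splitlines()

posts = [line for line in log_lines if '"POST ' in line]
assert posts, "no proxied calls captured in the window"
assert all("/v1/chat/completions" in line for line in posts)
assert not any("/v1/responses" in line for line in posts)
print("A4 pass:", len(posts), "chat-completions calls, 0 responses-API calls")
```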
+ +## Assertions + +| # | Assertion | Result | Evidence | +|---|---|---|---| +| A1 | No LiteLLM error card (BadRequestError / Unsupported parameter / APIConnectionError / AuthenticationError) | passed | Chat UI rendered cleanly; no red banner | +| A2 | Final reply contains `PONG` within 90s, no `"Your last response did not include a function call or a message"` loops | passed | Reply = `PONG`, elapsed ≈ 30s, single response | +| A3 | No OpenHands-credits / trial / "configure LLM" banner | passed | Only `openai/chatgpt-4o` badge in header; no upsell | +| A4 | chatmock container logs show `POST /v1/chat/completions` (NOT `/v1/responses`) at reply timestamp | passed | `172.16.1.3 - - [19/Apr/2026 14:58:38] "POST /v1/chat/completions HTTP/1.1" 200` + one more at 14:58:40 | + +## A4 raw evidence (shim container on VPS 2.24.201.210) + +``` +docker logs chatmock --since=5m | tail +127.0.0.1 - - [19/Apr/2026 14:58:37] "GET /health HTTP/1.1" 200 - +172.16.1.3 - - [19/Apr/2026 14:58:38] "POST /v1/chat/completions HTTP/1.1" 200 - +172.16.1.3 - - [19/Apr/2026 14:58:40] "POST /v1/chat/completions HTTP/1.1" 200 - +127.0.0.1 - - [19/Apr/2026 14:58:52] "GET /health HTTP/1.1" 200 - +``` + +Key observations: +- Remote IP 172.16.1.3 = OpenHands Cloud egress (internal Docker bridge on the shim VPS, forwarded by Traefik) +- Path is `/v1/chat/completions`, NOT `/v1/responses` — meaning the `chatgpt-4o` alias successfully forced LiteLLM's slug-based router away from the broken Responses-API path +- HTTP 200 on both calls (1st = agent turn; 2nd = likely the status-summary "Agent has finished the task" assist turn) + +## What the fix was + +LiteLLM (embedded in OpenHands' backend) auto-routes model slugs: +- `gpt-5*`, `o1*`, `o3*` → `/v1/responses` (Responses API) +- everything else → `/v1/chat/completions` + +Our shim's Responses-API path returns empty `output:[]` when tools are present (upstream only emits reasoning). 
OpenHands forces function-calling mode, so empty output triggers the "Your last response did not include a function call or a message" loop. + +Workaround applied this session: +1. Patched chatmock `model_registry.py`: added `("chatgpt-4o", "gpt-4o", "chatgpt")` as aliases for the `gpt-5.4` public ModelSpec so the shim accepts the new slug. +2. Rebuilt + restarted chatmock. +3. Flipped OpenHands Cloud Settings → LLM (Advanced) → Custom Model from `openai/gpt-5.4` to `openai/chatgpt-4o`. + +The slug no longer starts with `gpt-5`, so LiteLLM routes `/v1/chat/completions` — the path that returns clean messages with tools defined. + +## Out of scope (not verified here) + +- Longer agent flows (file edits, tool calls, multi-turn reasoning). This run only proves the chat round-trip + routing. +- Token accounting / billing on ChatGPT side (implicit in the subscription model). +- Concurrent conversations / rate limits. + +## Screenshot + +![OpenHands PONG success](/home/ubuntu/openhands-pong-success.png) diff --git a/stacks/nimbalyst/Dockerfile b/stacks/nimbalyst/Dockerfile new file mode 100644 index 0000000..b81826b --- /dev/null +++ b/stacks/nimbalyst/Dockerfile @@ -0,0 +1,115 @@ +FROM lscr.io/linuxserver/webtop:ubuntu-xfce + +# Node.js for Codex CLI and Claude Code CLI, plus deps for AppImage extraction +RUN apt-get update && apt-get install -y --no-install-recommends \ + curl ca-certificates git jq fuse libfuse2 xdg-utils \ + libnss3 libgbm1 libasound2t64 libgtk-3-0 libxss1 libx11-xcb1 \ + && curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \ + && apt-get install -y --no-install-recommends nodejs \ + && apt-get clean && rm -rf /var/lib/apt/lists/* + +# Install Codex CLI and Claude Code CLI globally +RUN npm install -g @openai/codex @anthropic-ai/claude-code + +# Download and extract Nimbalyst AppImage (avoids FUSE requirement at runtime) +RUN mkdir -p /opt/nimbalyst && cd /opt/nimbalyst \ + && curl -fL -o Nimbalyst.AppImage \ + 
"https://github.com/Nimbalyst/nimbalyst/releases/latest/download/Nimbalyst-Linux.AppImage" \ + && chmod +x Nimbalyst.AppImage \ + && ./Nimbalyst.AppImage --appimage-extract >/dev/null \ + && mv squashfs-root nimbalyst-app \ + && rm Nimbalyst.AppImage \ + && chmod -R a+rX /opt/nimbalyst + +# Wrapper so PATH-invocation works and APPDIR is set explicitly +RUN printf '#!/bin/bash\nexport APPDIR=/opt/nimbalyst/nimbalyst-app\ncd "$APPDIR"\nexec ./@nimbalystelectron --no-sandbox "$@"\n' \ + > /usr/local/bin/nimbalyst && chmod +x /usr/local/bin/nimbalyst + +# Desktop launcher + autostart +RUN mkdir -p /etc/skel/Desktop /etc/skel/.config/autostart && \ + printf '%s\n' \ + '[Desktop Entry]' \ + 'Version=1.0' \ + 'Type=Application' \ + 'Name=Nimbalyst' \ + 'Comment=Visual workspace for Codex and Claude Code' \ + 'Exec=/usr/local/bin/nimbalyst' \ + 'Icon=/opt/nimbalyst/nimbalyst-app/nimbalyst.png' \ + 'Terminal=false' \ + 'Categories=Development;' \ + > /etc/skel/Desktop/Nimbalyst.desktop && \ + chmod +x /etc/skel/Desktop/Nimbalyst.desktop && \ + cp /etc/skel/Desktop/Nimbalyst.desktop /etc/skel/.config/autostart/Nimbalyst.desktop + +# ---- GARZA LLM configuration (baked in) ---- + +# 1) System-wide env vars for any login shell (xfce4-terminal spawns login shells). +RUN printf '%s\n' \ + 'export OPENAI_BASE_URL=https://llm.garzaos.cloud/v1' \ + 'export OPENAI_API_BASE=https://llm.garzaos.cloud/v1' \ + 'export OPENAI_API_KEY=${CUSTOM_LLM_API_KEY}' \ + > /etc/profile.d/garza-llm.sh && chmod 0644 /etc/profile.d/garza-llm.sh + +# 2) Also export into non-login interactive shells. +RUN printf '\n# ChatGPT-backed LLM endpoint\n[ -f /etc/profile.d/garza-llm.sh ] && . /etc/profile.d/garza-llm.sh\n' \ + >> /etc/bash.bashrc + +# 3) Default Codex config.toml for new users (baked into skel). 
+RUN mkdir -p /etc/skel/.codex && \ + printf '%s\n' \ + 'model_provider = "garza"' \ + 'model = "gpt-5.4"' \ + 'approval_policy = "never"' \ + 'sandbox_mode = "danger-full-access"' \ + '' \ + '[model_providers.garza]' \ + 'name = "Garza LLM (ChatGPT-backed)"' \ + 'base_url = "https://llm.garzaos.cloud/v1"' \ + 'env_key = "OPENAI_API_KEY"' \ + 'wire_api = "responses"' \ + '' \ + '[projects."/config"]' \ + 'trust_level = "trusted"' \ + > /etc/skel/.codex/config.toml + +# 4) Test helper (gtest) to verify end-to-end routing from any terminal. +RUN printf '%s\n' \ + '#!/bin/bash' \ + '. /etc/profile.d/garza-llm.sh' \ + 'export HOME=/config' \ + 'cd /config' \ + 'echo === A3: ENV ===' \ + 'echo OPENAI_BASE_URL=$OPENAI_BASE_URL' \ + 'echo OPENAI_API_KEY=${OPENAI_API_KEY:0:15}...' \ + 'echo' \ + 'echo === A4: CODEX ===' \ + 'codex exec --skip-git-repo-check "reply with exactly the single word PONG and nothing else" 2>&1 | tail -15' \ + > /usr/local/bin/gtest && chmod +x /usr/local/bin/gtest + +# 5) Custom entrypoint hook: seed /config with config.toml + desktop launcher +# and ensure everything in /config is owned by abc:abc so Codex can write +# its sqlite files without EACCES. This runs at every container start via +# linuxserver's /custom-cont-init.d/ mechanism. +RUN mkdir -p /custom-cont-init.d && \ + printf '%s\n' \ + '#!/usr/bin/with-contenv bash' \ + 'set -e' \ + '' \ + '# Seed Codex config for the abc user if missing.' \ + 'install -d -o abc -g abc -m 0755 /config/.codex' \ + 'if [ ! -f /config/.codex/config.toml ]; then' \ + ' cp /etc/skel/.codex/config.toml /config/.codex/config.toml' \ + 'fi' \ + '' \ + '# Seed desktop + autostart launchers for the abc user if missing.' \ + 'install -d -o abc -g abc -m 0755 /config/Desktop /config/.config/autostart' \ + 'if [ -f /etc/skel/Desktop/Nimbalyst.desktop ] && [ ! 
-f /config/Desktop/Nimbalyst.desktop ]; then' \ + ' cp /etc/skel/Desktop/Nimbalyst.desktop /config/Desktop/Nimbalyst.desktop' \ + ' cp /etc/skel/Desktop/Nimbalyst.desktop /config/.config/autostart/Nimbalyst.desktop' \ + 'fi' \ + '' \ + '# Repair ownership so abc can read/write (fixes root-owned regressions' \ + '# from ad-hoc `docker exec nimbalyst bash` invocations).' \ + 'chown -R abc:abc /config/.codex /config/Desktop /config/.config/autostart 2>/dev/null || true' \ + > /custom-cont-init.d/10-garza-llm && \ + chmod +x /custom-cont-init.d/10-garza-llm diff --git a/stacks/nimbalyst/README.md b/stacks/nimbalyst/README.md new file mode 100644 index 0000000..780191d --- /dev/null +++ b/stacks/nimbalyst/README.md @@ -0,0 +1,70 @@ +# Nimbalyst — Web-Desktop Cursor-Cloud-Agents Replacement + +Web-accessible XFCE desktop running [Nimbalyst](https://nimbalyst.com/) (visual workspace for +**Codex + Claude Code**), pre-wired to route Codex CLI calls through the custom ChatGPT-backed +endpoint at `https://llm.garzaos.cloud/v1`. $0 per request. 
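For orientation, the compose shape is roughly the following. This is a sketch only: the committed `docker-compose.yml` is authoritative, the container port is assumed from linuxserver webtop defaults, and `NIMBALYST_BASIC_AUTH_USERS` is a hypothetical variable (derive it from `NIMBALYST_BASIC_AUTH_PASSWORD` via `htpasswd`).

```yaml
# Sketch; see the committed docker-compose.yml for the real values.
services:
  nimbalyst:
    build: .
    image: nimbalyst-local:latest
    restart: unless-stopped
    shm_size: "1gb"                                  # Electron desktop needs shared memory
    environment:
      - CUSTOM_LLM_API_KEY=${CUSTOM_LLM_API_KEY}     # injected at `up` time, never baked in
    volumes:
      - ./config:/config
    labels:
      - traefik.enable=true
      - traefik.http.routers.nimbalyst.rule=Host(`nimbalyst.garzaos.cloud`)
      - traefik.http.routers.nimbalyst.entrypoints=websecure
      - traefik.http.routers.nimbalyst.tls.certresolver=letsencrypt
      - traefik.http.services.nimbalyst.loadbalancer.server.port=3000   # assumed webtop port
      - traefik.http.routers.nimbalyst.middlewares=nimbalyst-auth
      - traefik.http.middlewares.nimbalyst-auth.basicauth.users=${NIMBALYST_BASIC_AUTH_USERS}  # hypothetical var
```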
+ +- **URL:** https://nimbalyst.garzaos.cloud +- **Base image:** `linuxserver/webtop:ubuntu-xfce` (KasmVNC) +- **Auth:** HTTP basic-auth (Traefik middleware) +- **VPS:** primary (`${VPS_IP}`) + +## Files + +| Path | Purpose | +| --- | --- | +| `Dockerfile` | Builds `nimbalyst-local:latest` with Nimbalyst AppImage + Codex CLI + Claude Code CLI baked in | +| `docker-compose.yml` | Compose service + Traefik labels for HTTPS + basic-auth | +| `TEST-PLAN.md` | End-to-end verification plan | +| `TEST-REPORT.md` | Verification results (all 5 assertions passed) | + +## What gets baked into the image + +- **`/etc/profile.d/garza-llm.sh`** — exports `OPENAI_BASE_URL`, `OPENAI_API_KEY` for all login shells +- **`/etc/bash.bashrc`** — sources the profile script for non-login shells (fixes Codex env inheritance) +- **`/etc/skel/.codex/config.toml`** — Codex provider config (`garza` provider, `wire_api="responses"`, trusted `/config`) +- **`/usr/local/bin/gtest`** — one-shot verification helper (runs a PONG round-trip) +- **`/custom-cont-init.d/10-garza-llm`** — entrypoint hook that on every container start: + - seeds `/config/.codex/config.toml` + Nimbalyst desktop launcher if missing + - runs `chown -R abc:abc` on `/config/.codex`, `/config/Desktop`, and `/config/.config/autostart` so root-owned regressions from ad-hoc `docker exec` can't break Codex + +## Required secrets (do not hardcode) + +Inject via environment at `docker compose up` time, NOT at build time: + +| Variable | Description | +| --- | --- | +| `CUSTOM_LLM_API_KEY` | API key for `https://llm.garzaos.cloud/v1` (chatmock shim) | +| `NIMBALYST_BASIC_AUTH_PASSWORD` | HTTP basic-auth password for the web-desktop | + +The committed Dockerfile references these as shell-expandable placeholders — replace with a real +build-time `ENV` only when rebuilding locally on the VPS. 
**Never commit the actual values.**
+
+## Verification
+
+From inside the container:
+```bash
+gtest
+```
+Expected output:
+```
+OPENAI_BASE_URL=https://llm.garzaos.cloud/v1
+model: gpt-5.4
+provider: garza
+codex: PONG
+```
+
+## How it replaces Cursor Cloud Agents
+
+Cursor Cloud Agents bill against the Cursor Pro subscription and refuse custom LLM endpoints. This
+web-desktop gives the same "prompt → clone repo → edit → test → commit" UX, but:
+
+- Runs on **your** VPS (not Cursor's cloud)
+- Routes every model call through **your** ChatGPT subscription via the chatmock shim
+- **$0 per task** — no Cursor credits, no paid OpenAI tokens
+- Both Codex CLI and (optionally) Claude Code CLI available in any terminal
+
+## Related services
+
+- `services/chatmock-shim/` — the upstream LLM proxy that this image points at
+- `services/openhands-byok/` — same-pattern BYOK on OpenHands Cloud
diff --git a/stacks/nimbalyst/TEST-PLAN.md b/stacks/nimbalyst/TEST-PLAN.md
new file mode 100644
index 0000000..7bf4e51
--- /dev/null
+++ b/stacks/nimbalyst/TEST-PLAN.md
@@ -0,0 +1,47 @@
+# Nimbalyst Deployment — Test Plan (v2, post-troubleshoot)
+
+## Context (not a PR)
+This verifies the Nimbalyst Docker deployment on the VPS:
+- URL: `https://nimbalyst.garzaos.cloud`
+- Auth: basic-auth `nimba` / `${NIMBALYST_BASIC_AUTH_PASSWORD}`
+- Goal: confirm the web-desktop renders, the Nimbalyst GUI opens, AND Codex (from a terminal in the web-desktop) routes to `https://llm.garzaos.cloud/v1` (ChatGPT-backed, $0 per request)
+
+## Root-cause fixes applied during troubleshooting
+| Issue | Fix |
+|---|---|
+| env vars missing in login shells | Wrote `/etc/profile.d/garza-llm.sh` with `OPENAI_BASE_URL`/`OPENAI_API_KEY` |
+| `/config/.codex` owned by root → EACCES for abc | `chown -R abc:abc /config/.codex` |
+| Codex ignoring env vars (went to api.openai.com) | Created `/config/.codex/config.toml` with `model_provider="garza"` + `[model_providers.garza] base_url=https://llm.garzaos.cloud/v1 wire_api="responses"` |
+| `wire_api="chat"` rejected by Codex 0.121.0 | Switched to `wire_api="responses"` (our shim supports the Responses API — verified HTTP 200 on `POST /v1/responses`) |
+| "Not inside trusted directory" | Pass `--skip-git-repo-check` to `codex exec` |
+
+## Primary flow (to execute in browser)
+1. Open `https://nimbalyst.garzaos.cloud`, authenticate, wait for the Selkies stream.
+2. Observe the Nimbalyst Project Manager window (already auto-launched in the container).
+3. In the xfce4-terminal window on the desktop, run `gtest` (a helper that sources the env, cd's to `/config`, and runs `codex exec`).
+4. Observe PONG in the terminal output.
+
+## Assertions (concrete pass/fail)
+| # | Assertion | Expected | Pass criterion |
+|---|-----------|----------|----------------|
+| A1 | HTTPS + basic-auth gate | Unauth → 401; auth → 200 | Already verified via curl |
+| A2 | Web-desktop stream renders | Selkies WebRTC delivers frames; Nimbalyst window visible | Screenshot shows `Project Manager - Nimbalyst` window |
+| A3 | OPENAI_BASE_URL exported in terminal | Terminal prints `OPENAI_BASE_URL=https://llm.garzaos.cloud/v1` and `OPENAI_API_KEY=sk-garza-d284c2...` | Exact string match |
+| A4 | Codex end-to-end via our shim | Terminal prints `codex` block followed by `PONG`, `provider: garza`, `model: gpt-5.4` | "PONG" appears in codex output |
+| A5 | Nimbalyst window opens | X window tree shows `Project Manager - Nimbalyst` | Visible in screenshot from A2 |
+
+## Already verified via shell (not yet via browser)
+- A3: `echo OPENAI_BASE_URL=$OPENAI_BASE_URL` → `https://llm.garzaos.cloud/v1`
+- A4: `codex exec --skip-git-repo-check "reply with exactly the single word PONG..."` → `codex\nPONG\ntokens used 1,224\nPONG` with `provider: garza, model: gpt-5.4`
+- A5: `xwininfo -root -tree` lists `Project Manager - Nimbalyst` (1100x700)
+
+## What could hide a broken deployment (adversarial)
+- If Codex weren't really hitting our shim: A4 would fail with `api.openai.com 401` (that exact failure mode was observed before `wire_api="responses"` + config.toml were in place, so this check is known to catch a mis-routed Codex).
+- If basic-auth were broken: A1 fails before anything else (Selkies 401).
+- If the Nimbalyst launcher were broken: A5 fails (no window).
+- If the shim proxy weren't actually routing to ChatGPT: A4 would return a stub/error; we get a real gpt-5.4 response with a token count.
+
+## Out of scope
+- Claude Code (no Anthropic-protocol endpoint; will prompt for its own key)
+- WebRTC stream performance / codec settings
+- Persistence of config.toml across container rebuild (will be addressed post-test as a permanent Dockerfile change)
diff --git a/stacks/nimbalyst/TEST-REPORT.md b/stacks/nimbalyst/TEST-REPORT.md
new file mode 100644
index 0000000..a88f704
--- /dev/null
+++ b/stacks/nimbalyst/TEST-REPORT.md
@@ -0,0 +1,47 @@
+# Nimbalyst Deployment — Test Report
+
+**URL:** https://nimbalyst.garzaos.cloud
+**Creds:** `nimba` / `${NIMBALYST_BASIC_AUTH_PASSWORD}`
+**Session:** https://app.devin.ai/sessions/68ec8727b8b84f5296095f7bf0155627
+
+## Results
+
+| # | Assertion | Result |
+|---|-----------|--------|
+| A1 | HTTPS + basic-auth (401/200) | **passed** |
+| A2 | Web-desktop stream renders | **passed** |
+| A3 | `OPENAI_BASE_URL` + `OPENAI_API_KEY` set in terminal | **passed** |
+| A4 | Codex CLI round-trips through our shim → ChatGPT | **passed** |
+| A5 | Nimbalyst "Project Manager" window opens | **passed** |
+
+## Critical proof (A4)
+From the xfce4 terminal on the web-desktop, `gtest` ran `codex exec "reply with exactly the single word PONG…"` and produced:
+```
+model: gpt-5.4
+provider: garza
+sandbox: danger-full-access
+session id: 019da3c4-5755-7ed0-9d3d-cfef7fb94bc5
+user: reply with exactly the single word PONG and nothing else
+codex: PONG
+tokens used 71
+```
+`provider: garza` = our shim; `model: gpt-5.4` = the BYOK-routed ChatGPT model; `PONG` = a real completion.
End-to-end path proven: Traefik (basic-auth) → KasmVNC → xfce4-terminal → Codex CLI → `llm.garzaos.cloud/v1/responses` → ChatGPT subscription.
+
+## Screenshot
+![Nimbalyst PONG + GUI](/home/ubuntu/nimbalyst-pong-proof.png)
+
+## Issues hit during execution (now fixed)
+1. **env vars empty in login shells** — `/etc/bash.bashrc` isn't read by `bash -l`. Fix: wrote `/etc/profile.d/garza-llm.sh`.
+2. **`/config/.codex` was root-owned** — an earlier `docker exec nimbalyst bash` (no `-u abc`) created it as root → EACCES. Fix: `chown -R abc:abc /config/.codex`.
+3. **Codex ignored the `OPENAI_BASE_URL` env var** — it needs an explicit provider entry in `~/.codex/config.toml`. Fix: wrote config.toml with `model_provider="garza"` + `[model_providers.garza] base_url=https://llm.garzaos.cloud/v1 wire_api="responses"`.
+4. **`wire_api = "chat"` rejected** by Codex 0.121.0. Fix: switched to `"responses"` — confirmed our shim implements the Responses API (`POST /v1/responses` returns HTTP 200 with a real `resp_…` object).
+5. **"Not inside trusted directory"** — Codex requires a git repo or `--skip-git-repo-check`. Fix: `gtest` passes the flag.
+
+## Persistence note
+All fixes above were applied inside the running container and will be lost on rebuild. Two followups needed (will do **outside** test mode if you want):
+- Bake `/etc/profile.d/garza-llm.sh` and `/config/.codex/config.toml` into the Dockerfile.
+- Add an `entrypoint` hook that runs `chown -R abc:abc /config` at startup to prevent the root-owned regression.
+
+## Out of scope (not tested)
+- Claude Code routing — we have no Anthropic-shaped shim; the CLI will prompt for its own key on first use.
+- XFCE desktop wallpaper/panel — D-Bus `login1` permissions prevent xfdesktop/xfce4-panel from starting cleanly in webtop. Cosmetic only — the Nimbalyst window is visible and usable.
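The env-inheritance fix from issue 1 can be reproduced outside the container. A minimal, self-contained sketch (a temp dir stands in for `/etc/profile.d`, and a placeholder key is used when `CUSTOM_LLM_API_KEY` is unset — both are assumptions for the demo, not the real image layout):

```shell
# Sketch of the profile.d-based env fix: a small script exports the two
# variables, and anything that sources it (login shells via /etc/profile.d,
# non-login shells via the /etc/bash.bashrc hook) inherits them.
tmp=$(mktemp -d)
cat > "$tmp/garza-llm.sh" <<'EOF'
export OPENAI_BASE_URL="https://llm.garzaos.cloud/v1"
export OPENAI_API_KEY="${CUSTOM_LLM_API_KEY:-sk-placeholder}"
EOF
# Emulate what a login shell (or the bashrc hook) does at startup:
. "$tmp/garza-llm.sh"
echo "OPENAI_BASE_URL=$OPENAI_BASE_URL"
rm -rf "$tmp"
```

The same sourcing order explains why issue 1 appeared only in some shells: a script placed solely in `/etc/bash.bashrc` is skipped by `bash -l`, while one in `/etc/profile.d/` is skipped by non-login shells, so the image wires both.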
diff --git a/stacks/nimbalyst/docker-compose.yml b/stacks/nimbalyst/docker-compose.yml
new file mode 100644
index 0000000..f48425b
--- /dev/null
+++ b/stacks/nimbalyst/docker-compose.yml
@@ -0,0 +1,24 @@
+services:
+  nimbalyst:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    image: nimbalyst-local:latest
+    container_name: nimbalyst
+    restart: unless-stopped
+    security_opt:
+      - seccomp:unconfined
+    environment:
+      PUID: "1000"
+      PGID: "1000"
+      TZ: "America/New_York"
+      TITLE: "Nimbalyst (ChatGPT-backed)"
+      SUBFOLDER: "/"
+      OPENAI_BASE_URL: "https://llm.garzaos.cloud/v1"
+      OPENAI_API_BASE: "https://llm.garzaos.cloud/v1"
+      OPENAI_API_KEY: "${CUSTOM_LLM_API_KEY}"
+    ports:
+      - "127.0.0.1:13100:3000"
+    volumes:
+      - ./config:/config
+    shm_size: "2gb"
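Since the compose file interpolates `${CUSTOM_LLM_API_KEY}`, an unset variable silently becomes an empty `OPENAI_API_KEY` inside the container. A hypothetical pre-deploy guard (illustrative only, not part of the committed stack; variable names match the README) that refuses to deploy when either required secret is missing from the deploy shell:

```shell
#!/usr/bin/env bash
# Hypothetical pre-deploy guard for the nimbalyst compose service: verify the
# two required secrets are exported before invoking `docker compose up`.
check_secrets() {
  local missing=0 v
  for v in CUSTOM_LLM_API_KEY NIMBALYST_BASIC_AUTH_PASSWORD; do
    # ${!v} is bash indirect expansion: the value of the variable named by $v.
    if [ -z "${!v:-}" ]; then
      echo "missing: $v"
      missing=1
    fi
  done
  return "$missing"
}

if check_secrets; then
  echo "secrets present; safe to run: docker compose up -d --build"
else
  echo "refusing to deploy"
fi
```

This keeps the "inject at `docker compose up` time, never commit values" rule from the README enforceable rather than aspirational.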