**New file:** `services/chatmock-shim/INSTALL-PLAN.md` (+187 lines)

# Install plan — `openai-oauth` + `ChatMock` on `root@2.24.201.210` (Docker)

**Target host:** `srv1589219.hstgr.cloud` / `2.24.201.210`
**State today:** Ubuntu 24.04.4, Docker 29.4.0 + Compose v5.1.2, only `traefik:latest` running (host network, HTTP→HTTPS redirect, Let's Encrypt HTTP-01 resolver named `letsencrypt`, `--providers.docker.exposedbydefault=false`). Ports 22/80/443 in use; 7.8 GB RAM, 83 GB disk free.

---

## 0. Read-this-first — security & policy

Both projects expose an **unauthenticated OpenAI-compatible endpoint** backed by *your personal ChatGPT OAuth tokens* (the same `auth.json` Codex uses). Anyone who can reach the endpoint gets to spend your ChatGPT rate limits.

- `openai-oauth`'s own README: *"Use only for personal, local experimentation on trusted machines; **do not run as a hosted service**, do not share access, do not pool or redistribute tokens."* Running it on a public VPS behind a domain is exactly what that line warns against and risks OpenAI rate-limiting or suspending the account.
- `ChatMock` uses the same tokens; the README is silent but the risk is identical.

**Recommendation (baked into the plan below):** do not expose `/v1/*` to the public internet unauthenticated. Pick **one** of:
1. **Private-only** — bind to `127.0.0.1`, access via SSH tunnel / Tailscale / WireGuard. *(Safest, recommended.)*
2. **Public URL + Traefik basic-auth / IP allowlist middleware** — fine for solo use, still against `openai-oauth`'s stated policy.
3. **Public + no auth** — not recommended; flagged for explicit sign-off only.

Please tell me which tier you want before I execute.

---

## 1. Scope question for you

`openai-oauth` and `ChatMock` do **the same thing** (localhost proxy → `chatgpt.com/backend-api/codex/responses` using your Codex OAuth). Choose one of:

- **A. Both side-by-side** (different ports, share the same `~/.codex/auth.json` via bind-mount). Useful for A/B testing.
- **B. ChatMock only** — it already ships a working `Dockerfile` + `docker-compose.yml` + login flow. Least work.
- **C. openai-oauth only** — newer, Bun/TS, no upstream Docker support (we'd author it).

Default in this plan = **A (both)**, with a note on what to drop if you pick B or C.

---

## 2. Layout on the VPS

```
/opt/chatgpt-proxies/
├── auth/                    # shared ChatGPT OAuth credentials
│   └── auth.json            # created by the login step below
├── chatmock/
│   ├── docker-compose.yml   # from upstream, lightly edited
│   └── .env                 # from .env.example
└── openai-oauth/
    ├── Dockerfile           # authored by us (see §4)
    ├── docker-compose.yml   # authored by us
    └── .dockerignore
```

One shared `auth.json` is mounted read-only into both containers at the path each expects (`/data/auth.json` for ChatMock via `CHATGPT_LOCAL_HOME=/data`; `/root/.codex/auth.json` for openai-oauth, overridable with `--oauth-file`).
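
Before wiring the shared mount into both containers, it helps to sanity-check the file. A minimal sketch, assuming the Codex-style layout keeps its credentials under a top-level `tokens` object (the field names here are an assumption, not a documented schema):

```python
import json
from pathlib import Path

# Assumed field names; the real Codex auth.json schema is not formally documented.
REQUIRED_TOKEN_FIELDS = ("access_token", "refresh_token")

def validate_auth(data: dict) -> list[str]:
    """Return problems found in parsed auth.json contents (empty list = looks usable)."""
    tokens = data.get("tokens") or {}
    return [f"missing tokens.{f}" for f in REQUIRED_TOKEN_FIELDS if not tokens.get(f)]

def check_auth_file(path: str) -> list[str]:
    """Load an auth.json from disk and validate it."""
    p = Path(path)
    if not p.exists():
        return [f"{path} does not exist"]
    try:
        return validate_auth(json.loads(p.read_text()))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
```

Running `check_auth_file("/opt/chatgpt-proxies/auth/auth.json")` after the login step should return an empty list.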

---

## 3. ChatMock (upstream Docker assets exist)

Upstream ships `Dockerfile`, `docker-compose.yml`, `DOCKER.md`, `.env.example`. Plan:

1. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock`
2. `cp .env.example .env`; set `VERBOSE=false`. Either keep `CHATMOCK_IMAGE=storagetime/chatmock:latest` (prebuilt) or switch to `build: .` to build from the repo's own Dockerfile; I recommend `build: .` rather than trusting the third-party `storagetime/*` Docker Hub image.
3. **Login (one-time, interactive):**
`docker compose run --rm --service-ports chatmock-login login`
→ prints an auth URL; open it in a browser, complete the ChatGPT login, then paste the redirect URL back into the terminal. Tokens are written into the `chatmock_data` volume.
4. Optionally migrate the saved token to the shared bind-mount so `openai-oauth` can reuse it:
`docker run --rm -v chatmock_data:/src -v /opt/chatgpt-proxies/auth:/dst alpine cp /src/auth.json /dst/`
5. Switch the main service to use the bind-mount instead of the named volume (edit compose):
```yaml
    volumes:
      - /opt/chatgpt-proxies/auth:/data
      - ./prompt.md:/app/prompt.md:ro
```
6. Don't publish `8000:8000` on `0.0.0.0`. Replace with `127.0.0.1:8000:8000` (private-only) **or** drop the `ports:` block and let Traefik route to it on the default bridge using labels (see §5).
7. `docker compose up -d chatmock` and verify `curl http://127.0.0.1:8000/v1/models`.
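
For scripting that last check (e.g. from a cron probe), the `/v1/models` body is the standard OpenAI list envelope; a small parser sketch:

```python
import json

def model_ids(models_payload: str) -> list[str]:
    """Extract model ids from an OpenAI-style GET /v1/models response body."""
    data = json.loads(models_payload)
    return [m["id"] for m in data.get("data", [])]
```

Pipe the curl output into this (or fetch it with `urllib`) and assert the list is non-empty.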

---

## 4. openai-oauth (no upstream Docker; we author it)

Repo is a Bun/TS monorepo (`bun@1.2.18`, `turbo`, `tsup`). CLI entry: `packages/openai-oauth/src/cli.ts`, built to `dist/cli.js`, defaults to binding `127.0.0.1:10531` and reading OAuth from `~/.codex/auth.json` (overridable via `--oauth-file`, `--host`, `--port`).

**Dockerfile (authored by us, sketch):**
```dockerfile
FROM oven/bun:1.2.18-alpine AS build
WORKDIR /src
COPY . .
RUN bun install --frozen-lockfile && bun run build

FROM oven/bun:1.2.18-alpine
WORKDIR /app
COPY --from=build /src/packages/openai-oauth/dist ./dist
COPY --from=build /src/packages/openai-oauth/package.json ./package.json
COPY --from=build /src/node_modules ./node_modules
EXPOSE 10531
ENTRYPOINT ["bun", "dist/cli.js"]
CMD ["--host", "0.0.0.0", "--port", "10531", "--oauth-file", "/auth/auth.json"]
```

**docker-compose.yml:**
```yaml
services:
  openai-oauth:
    build: .
    container_name: openai-oauth
    restart: unless-stopped
    volumes:
      - /opt/chatgpt-proxies/auth:/auth:ro
    ports:
      - "127.0.0.1:10531:10531"   # private-only; see §5 for Traefik variant
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://127.0.0.1:10531/v1/models"]
      interval: 30s
      timeout: 5s
      retries: 3
```

**Login for openai-oauth:** the CLI intentionally does **not** ship a login flow. Options:
- (a) Reuse the `auth.json` created by the ChatMock login step (§3.4) — zero extra work.
- (b) On a machine with Node, run `npx @openai/codex login`, then `scp ~/.codex/auth.json root@2.24.201.210:/opt/chatgpt-proxies/auth/`.

---

## 5. Exposure (pick the tier from §0)

**Tier 1 — private only (recommended):**
- `ports:` are `127.0.0.1:8000:8000` and `127.0.0.1:10531:10531`.
- Access from your laptop via `ssh -L 8000:localhost:8000 -L 10531:localhost:10531 root@2.24.201.210`.
- No DNS / no Traefik route needed.

**Tier 2 — public hostname + basic-auth (solo use):**
- Decide subdomains (e.g. `chatmock.garzaos.cloud`, `openai-oauth.garzaos.cloud`) and point them at `2.24.201.210` (via Hostinger DNS / `$HOSTINGER_API_TOKEN`).
- Create a shared Docker network `web` and attach Traefik plus both services to it (dropping the host-network setup), **or** keep Traefik on host-net and attach the services to the default `bridge`. The latter works because Traefik reaches container IPs discovered via labels (confirmed by inspecting the existing `traefik-traefik-1` container).
- Add labels to each service:
```yaml
    labels:
      - traefik.enable=true
      - traefik.http.routers.chatmock.rule=Host(`chatmock.garzaos.cloud`)
      - traefik.http.routers.chatmock.entrypoints=websecure
      - traefik.http.routers.chatmock.tls.certresolver=letsencrypt
      - traefik.http.services.chatmock.loadbalancer.server.port=8000
      - traefik.http.routers.chatmock.middlewares=chatmock-auth
      - traefik.http.middlewares.chatmock-auth.basicauth.users=USER:$$2y$$...   # htpasswd bcrypt
```
- Same pattern for `openai-oauth` on `:10531`.
- Optional hardening middleware: `ipallowlist` for your home/office IP ranges.
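
One footgun in that label: Compose interpolates `$`, so every `$` in the bcrypt hash from `htpasswd -nB USER` must be doubled before it goes into the compose file. A tiny helper sketch (the middleware name mirrors the example above):

```python
def compose_escape(bcrypt_hash: str) -> str:
    """Double each '$' so docker compose does not treat it as variable interpolation."""
    return bcrypt_hash.replace("$", "$$")

def basicauth_label(user: str, bcrypt_hash: str, middleware: str = "chatmock-auth") -> str:
    """Build the Traefik basicauth users label for a compose file."""
    return (f"traefik.http.middlewares.{middleware}.basicauth.users="
            f"{user}:{compose_escape(bcrypt_hash)}")
```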

**Tier 3 — public + no auth:** same as tier 2, minus the `basicauth` / `ipallowlist` middlewares. Not recommended; requires explicit sign-off.

---

## 6. Concrete execution steps (once you approve a tier + scope)

1. `ssh root@2.24.201.210`
2. `mkdir -p /opt/chatgpt-proxies/{auth,chatmock,openai-oauth}`
3. `git clone https://github.com/RayBytes/ChatMock /opt/chatgpt-proxies/chatmock`
4. `git clone https://github.com/EvanZhouDev/openai-oauth /opt/chatgpt-proxies/openai-oauth-src` → write our `Dockerfile` + `docker-compose.yml` into `/opt/chatgpt-proxies/openai-oauth/`, with the `build:` context pointing at the cloned source.
5. In `chatmock/`: `cp .env.example .env`, edit compose to `build: .` + bind-mount `/opt/chatgpt-proxies/auth:/data`, remove `ports:` (tier 2) or set `127.0.0.1:8000:8000` (tier 1). Add Traefik labels if tier 2.
6. `docker compose run --rm --service-ports chatmock-login login` → complete OAuth in browser → verify `/opt/chatgpt-proxies/auth/auth.json` exists.
7. (tier 2 only) `curl -u user:pass https://chatmock.garzaos.cloud/v1/models` to confirm cert issued and auth works.
8. `docker compose up -d chatmock` (in chatmock dir) and `docker compose up -d openai-oauth` (in openai-oauth dir).
9. Smoke test both:
- `curl http://127.0.0.1:8000/v1/models`
- `curl http://127.0.0.1:10531/v1/models`
- Chat completion: `curl http://127.0.0.1:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"gpt-5-codex","messages":[{"role":"user","content":"ping"}]}'`
10. Systemd isn't needed — `restart: unless-stopped` on both services is enough.
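
The step-9 smoke tests can be scripted with only the standard library; a sketch using the same endpoint and model slug as above:

```python
import json
import urllib.request

def chat_payload(model: str, content: str) -> dict:
    """Minimal OpenAI-style chat-completions body, matching the curl smoke test."""
    return {"model": model, "messages": [{"role": "user", "content": content}]}

def smoke_test(base_url: str, model: str = "gpt-5-codex") -> str:
    """POST a one-word ping to the proxy and return the assistant reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(chat_payload(model, "ping")).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Call `smoke_test("http://127.0.0.1:8000")` and `smoke_test("http://127.0.0.1:10531")` after both stacks are up.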

---

## 7. Day-2 ops

- **Updates:**
- ChatMock: `cd /opt/chatgpt-proxies/chatmock && git pull && docker compose build --pull && docker compose up -d`
- openai-oauth: same in its dir.
- **Token refresh:** both libraries auto-refresh using the refresh token in `auth.json`. If the ChatGPT session is forcibly signed out, re-run the ChatMock login (§3.3) — it'll rewrite `/opt/chatgpt-proxies/auth/auth.json` in place.
- **Backup:** `tar czf auth-backup.tgz /opt/chatgpt-proxies/auth` — `auth.json` is password-equivalent; store it like a secret.
- **Logs:** `docker compose logs -f chatmock` / `docker compose logs -f openai-oauth`. ChatMock has `VERBOSE=true` for deep request/stream logs.
- **Uninstall:** `docker compose down -v` in each dir and `rm -rf /opt/chatgpt-proxies`.

---

## 8. Things I need from you before executing

1. **Scope:** A (both), B (ChatMock only), or C (openai-oauth only)?
2. **Exposure tier:** 1 / 2 / 3 from §5?
3. **If tier 2:** which domain(s)? (I can auto-create the DNS records via `$HOSTINGER_API_TOKEN` once you name them, or you can point any subdomain you already own at `2.24.201.210`.)
4. **Confirm** you've read §0 and are okay with the "personal-use-only" policy trade-off of running these on a reachable server.

Once you answer those, I'll execute §6 end-to-end on the VPS and report back with working endpoints + smoke-test output.

**New file:** `services/chatmock-shim/README.md` (+90 lines)

# chatmock shim — ChatGPT OAuth proxy (`https://llm.garzaos.cloud/v1`)

OpenAI-compatible HTTP proxy that surfaces the user's interactive **ChatGPT** subscription as a
standard `/v1/chat/completions` + `/v1/responses` API. Based on the open-source
[`chatmock`](https://github.com/RayBytes/chatmock) project.

- **Endpoint:** https://llm.garzaos.cloud/v1
- **Host VPS:** Hostinger VPS `1589219` at IP `${SHIM_VPS_IP}` (separate from primary VPS)
- **Path on host:** `/opt/chatgpt-proxies/chatmock/`
- **Process:** Docker container `chatmock` fronted by Traefik

## Why this exists

Every autonomous-agent stack in this org (Nimbalyst / OpenHands / Agent Zero / Sim Studio / etc.)
accepts a custom OpenAI-compatible base URL. Pointing them at this shim routes every model call
through the user's ChatGPT Pro subscription → **$0 per request** regardless of caller.

## Custom patches applied

Two patches on top of upstream `chatmock`. Both are required for LiteLLM-based callers
(OpenHands, Agent Zero, etc.) to get working tool-call responses.

### 1. `model_registry.py` — alias non-`gpt-5*` slugs

LiteLLM auto-routes any `gpt-5*` model slug through the **Responses API** (`/v1/responses`), which
upstream ChatGPT returns as reasoning-only output (empty `output:[]`) when tools are defined.
Aliasing `chatgpt-4o` / `gpt-4o` to the upstream `gpt-5.4` lets callers pick a slug that
LiteLLM routes to `/v1/chat/completions` instead — which returns a proper message with tool calls.

```python
# /opt/chatgpt-proxies/chatmock/model_registry.py (lines 44-47)
ModelSpec(
    public_id="gpt-5.4",
    aliases=("gpt5.4", "gpt-5.4-latest", "chatgpt-4o", "gpt-4o", "chatgpt"),
    allowed_efforts=frozenset(("none", "low", "medium", "high", "xhigh")),
)
```
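
A sketch of the lookup this patch enables (the table below is a hypothetical mirror of the alias entries, not the registry's actual internals):

```python
# Hypothetical flattening of the patched alias table; upstream internals may differ.
ALIASES = {
    "gpt5.4": "gpt-5.4",
    "gpt-5.4-latest": "gpt-5.4",
    "chatgpt-4o": "gpt-5.4",
    "gpt-4o": "gpt-5.4",
    "chatgpt": "gpt-5.4",
}

def resolve_model(slug: str) -> str:
    """Map any accepted alias to the upstream public id (identity for unknown slugs)."""
    return ALIASES.get(slug, slug)
```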

### 2. `responses_api.py` — strip rejected params

Upstream ChatGPT Responses API rejects several OpenAI-style params that LiteLLM always sends.
Strip them server-side before forwarding.

```python
# /opt/chatgpt-proxies/chatmock/responses_api.py (lines 91-95)
normalized.pop("max_output_tokens", None)
# Strip params that OpenAI Responses API rejects for gpt-5 reasoning models
for _k in ("temperature", "top_p", "frequency_penalty", "presence_penalty",
           "logit_bias", "logprobs", "top_logprobs"):
    normalized.pop(_k, None)
```
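
The same strip logic can be exercised in isolation; a non-mutating sketch mirroring the patch:

```python
# Same param list as the patch; applied to a copy instead of in place.
REJECTED_PARAMS = ("max_output_tokens", "temperature", "top_p", "frequency_penalty",
                   "presence_penalty", "logit_bias", "logprobs", "top_logprobs")

def strip_rejected(payload: dict) -> dict:
    """Return a copy of the request body with upstream-rejected params removed."""
    return {k: v for k, v in payload.items() if k not in REJECTED_PARAMS}
```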

## Known routing behavior

| Caller | Model slug | Path used | Works? |
|---|---|---|---|
| LiteLLM (via OpenHands) | `openai/gpt-5.4` | `/v1/responses` | No — empty output when tools defined |
| LiteLLM (via OpenHands) | `openai/chatgpt-4o` | `/v1/chat/completions` | **Yes** — this is the slug to use |
| Codex CLI (Nimbalyst) | `gpt-5.4` | `/v1/responses` | Yes — Codex handles reasoning-only output |
| Direct `curl` | any | either | Yes |
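
The routing column reduces to a simple prefix rule; a sketch approximating the observed LiteLLM behavior (an approximation of what the table shows, not LiteLLM's actual implementation):

```python
def litellm_route(slug: str) -> str:
    """Approximate the observed routing: gpt-5* slugs go to the Responses API."""
    model = slug.split("/", 1)[-1]  # drop an "openai/" provider prefix if present
    return "/v1/responses" if model.startswith("gpt-5") else "/v1/chat/completions"
```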

## Verification

```bash
curl -sS https://llm.garzaos.cloud/v1/chat/completions \
  -H "Authorization: Bearer $CUSTOM_LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"chatgpt-4o","messages":[{"role":"user","content":"reply PONG"}]}' \
  | jq -r '.choices[0].message.content'
# Expected: PONG
```

## Install plan

See `INSTALL-PLAN.md` for the original provisioning walkthrough (OAuth setup, Traefik route,
container lifecycle).

## Required secrets

Never commit to this repo:

| Variable | Description |
|---|---|
| `CUSTOM_LLM_API_KEY` | Bearer token accepted by the shim (issued by chatmock on deploy) |
| `CHATGPT_OAUTH_REFRESH_TOKEN` | OAuth refresh token for upstream ChatGPT |
| `SHIM_VPS_IP` | IP of the Hostinger VPS hosting this shim |

## Related

- `stacks/nimbalyst/` — Codex/Claude-Code web-desktop pointed at this shim
- `services/openhands-byok/` — OpenHands Cloud BYOK config pointed at this shim

**New file:** `services/openhands-byok/README.md` (+72 lines)

# OpenHands Cloud — BYOK + MCP configuration

How OpenHands Cloud at https://app.all-hands.dev is configured to run against the chatmock shim
(ChatGPT-backed, $0/request) with three additional MCP tools.

## LLM (BYOK)

Set via **Settings → LLM → Advanced**:

| Field | Value |
|---|---|
| **Custom Model** | `openai/chatgpt-4o` |
| **Base URL** | `https://llm.garzaos.cloud/v1` |
| **API Key** | `${CUSTOM_LLM_API_KEY}` |

### Why `chatgpt-4o` and not `gpt-5.4`

LiteLLM (used by OpenHands) auto-routes any `gpt-5*` model slug through the Responses API
(`/v1/responses`). The upstream ChatGPT backend returns empty `output:[]` on that path when
tools are defined, causing OpenHands to loop with *"Your last response did not include a function
call or a message."*

The shim's `model_registry.py` was patched to accept `chatgpt-4o` as an alias → LiteLLM routes
non-`gpt-5*` slugs through `/v1/chat/completions`, which returns proper tool-call messages.

See `services/chatmock-shim/README.md` for shim details.

## MCP tools

Configured via **Settings → MCP**. All three are official upstream MCP servers.

| Server | Transport | Config |
|---|---|---|
| **E2B** (code sandbox) | `STDIO` | Command: `uvx` · Args: `e2b-mcp-server` · Env: `E2B_API_KEY=${E2B_API_KEY}` |
| **Firecrawl** (web scrape/crawl) | `SHTTP` | URL: `https://mcp.firecrawl.dev/${FIRECRAWL_API_KEY}/v2/mcp` |
| **Tavily** (AI search) | `SHTTP` | URL: `https://mcp.tavily.com/mcp/?tavilyApiKey=${TAVILY_API_KEY}` |

Pre-existing MCP servers (left in place):
- `garza-mcp-unified` (SHTTP) — GARZA OS unified MCP gateway
- `rube.app/mcp` (SHTTP) — rube.app toolbox

## Required secrets (do not commit)

| Variable | Source |
|---|---|
| `CUSTOM_LLM_API_KEY` | chatmock shim (Bearer token) |
| `E2B_API_KEY` | https://e2b.dev/dashboard?tab=keys |
| `FIRECRAWL_API_KEY` | https://www.firecrawl.dev/app/api-keys |
| `TAVILY_API_KEY` | https://app.tavily.com/home |

All four keys are recoverable from `/opt/surfsense/.env` on the primary VPS.

## Files

| Path | Purpose |
|---|---|
| `TEST-PLAN.md` | End-to-end BYOK verification plan (PONG round-trip, 4 assertions) |
| `TEST-REPORT.md` | Verification results — all 4 assertions passed |

## Verification (LLM path only)

Start a new conversation at https://app.all-hands.dev and send:

> Reply with the single word PONG and nothing else. Do not use any tools.

Expected: reply contains `PONG`, model badge shows `openai/chatgpt-4o`, shim logs on the shim VPS
show `POST /v1/chat/completions HTTP/1.1 200` at the reply timestamp.

## Related

- `services/chatmock-shim/` — the LLM proxy OpenHands points at
- `stacks/nimbalyst/` — same BYOK pattern for Codex CLI on a self-hosted web-desktop

**New file:** `services/openhands-byok/TEST-PLAN.md` (+24 lines)

# OpenHands Cloud BYOK — End-to-end Test Plan (v3)

## What changed since v2
- Shim `model_registry.py`: added `chatgpt-4o` as an alias of upstream `gpt-5.4`. Container rebuilt + restarted. Direct curl with `model=chatgpt-4o` + function tools returns `content:"PONG"` (no empty output).
- OpenHands Cloud → Settings → LLM (Advanced) → **Custom Model** changed from `openai/gpt-5.4` → `openai/chatgpt-4o`. Settings saved.
- Rationale: LiteLLM auto-routes `gpt-5*` slugs through `/v1/responses` (Responses API). A non-`gpt-5*` slug like `chatgpt-4o` routes through `/v1/chat/completions`, which the shim has always handled correctly.

## Primary flow
1. Start a new conversation on app.all-hands.dev (existing ones may be pinned to old settings per "restart to see changes" toast).
2. Send prompt: `Reply with the single word PONG and nothing else. Do not use any tools.`
3. Wait ≤ 90 s.

## Key assertions
- **A1** — No LiteLLM error card (`BadRequestError`, `Unsupported parameter`, `APIConnectionError`, `AuthenticationError`) appears in the conversation.
- **A2** — Agent produces a final assistant reply whose visible text contains `PONG` within 90 s. NO looping `"Your last response did not include a function call or a message"` errors.
- **A3** — No OpenHands-credits / trial / "configure LLM" banner/toast. (Would indicate BYOK was silently bypassed.)
- **A4** — `chatmock` container on VPS `2.24.201.210` logs show `POST /v1/chat/completions` (NOT `/v1/responses`) with `model: chatgpt-4o` at the timestamp of the reply.
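
A4 can be checked mechanically against `docker compose logs chatmock` output; a sketch (assumes the shim emits standard access-log lines containing the method and path):

```python
def a4_holds(log_lines: list[str]) -> bool:
    """A4: at least one chat-completions POST and no Responses-API POST in the logs."""
    chat = any("POST /v1/chat/completions" in line for line in log_lines)
    responses = any("POST /v1/responses" in line for line in log_lines)
    return chat and not responses
```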

## Out-of-scope
- Tool-use capability (the prompt explicitly asks for no tools). A follow-up test with an actual tool task can come later.
- Multi-turn reasoning. This is the minimum sanity check that BYOK routing works end-to-end.

## Pass/fail
Pass = A1 + A2 + A3 + A4 all true. Anything else = fail; report cause.