diff --git a/README.md b/README.md
index ce60a9b..143faa6 100644
--- a/README.md
+++ b/README.md
@@ -51,9 +51,22 @@ bun run claude-harness/index.ts --file prompt.md
 ```bash
 bun run codex-harness/index.ts "Build a personal task manager with a REST API, interactive dashboard with charts, task categories, priority levels, due dates, and search functionality"
 ```
-
 Both harnesses write their output to `workspace/claude/` and `workspace/codex/` respectively. The built application lives in `workspace/{sdk}/app/`.
 
+### Run the Mixed Harness (Claude generates, GPT-5.4 evaluates)
+
+```bash
+bun run mixed-harness/index.ts "Build a REST API with authentication"
+```
+
+### Run the Gemini Harness (Claude generates, Gemini 3.1 Pro evaluates)
+
+```bash
+GEMINI_API_KEY=your-key bun run gemini-harness/index.ts --file prompt.md
+```
+
+Mixed and Gemini harnesses write to `workspace/mixed/` and `workspace/gemini/` respectively. Set `HARNESS_LOG_DIR` to customize where conversation logs are saved (defaults to `./logs`).
+
 ## Configuration
 
 Defaults are in `shared/config.ts`:
@@ -63,7 +76,7 @@ Defaults are in `shared/config.ts`:
 | `maxSprints` | 10 | Maximum number of sprints |
 | `maxRetriesPerSprint` | 3 | Max evaluation retries before failing a sprint |
 | `passThreshold` | 7 | Minimum score (out of 10) for each criterion |
-| `CLAUDE_MODEL` | `claude-sonnet-4-6` | Model for Claude harness |
+| `CLAUDE_MODEL` | `claude-opus-4-6` | Model for Claude harness |
 | `CODEX_MODEL` | `gpt-5.4` | Model for Codex harness |
 
 ## How It Works
@@ -180,9 +193,18 @@ adversarial-dev/
 │   ├── planner.ts       # Planner agent
 │   ├── generator.ts     # Generator agent
 │   └── evaluator.ts     # Evaluator agent
+├── mixed-harness/      # Claude generator + Codex GPT-5.4 evaluator
+│   ├── index.ts, harness.ts, planner.ts, generator.ts, evaluator.ts
+├── gemini-harness/     # Claude generator + Gemini 3.1 Pro evaluator (sandboxed tools)
+│   ├── index.ts, harness.ts, planner.ts, generator.ts, evaluator.ts
+├── tests/              # Test suites (29 tests)
+│   ├── mixed-harness.test.ts
+│   └── conversation-logger.test.ts
 └── workspace/           # Runtime output (gitignored)
-    ├── claude/          # Claude harness working directory
-    └── codex/           # Codex harness working directory
+    ├── claude/
+    ├── codex/
+    ├── mixed/
+    └── gemini/
 ```
 
 Both harnesses share the same prompts, types, and orchestration flow. The only differences are the SDK-specific agent implementations -- `query()` async generators for Claude, `Codex` threads for Codex.
diff --git a/RESULTS.md b/RESULTS.md
new file mode 100644
index 0000000..4858a24
--- /dev/null
+++ b/RESULTS.md
@@ -0,0 +1,96 @@
+# Battle Report: First Multi-Model Harness Runs
+
+**Date:** 2026-04-02
+**Target:** brane-code SSE streaming fix (Codex proxy buffering bug)
+**Prompt:** `brane-streaming-fix.md` — wire SSE streaming into brane-code's Codex proxy
+
+## Scoreboard
+
+| Harness | Generator | Evaluator | Result | Time | Sprints | Attempts |
+|---------|-----------|-----------|--------|------|---------|----------|
+| claude-harness | Claude Opus 4.6 | Claude Opus 4.6 | **5/5 PASSED** | 53.4 min | 5 | 6 total (S3 needed 2) |
+| codex-harness | GPT-5.4 | GPT-5.4 | **0/1 FAILED** | 59.6 min | 0 of 1 | 4 (all failed) |
+| mixed-harness | Claude Opus 4.6 | GPT-5.4 | **In Progress** | 60+ min | Sprint 4 (11/13 passing) | Still running |
+| gemini-harness | Claude Opus 4.6 | Gemini 3.1 Pro | **5/5 PASSED** | 50.7 min | 5 | 6 total (S4 needed 2) |
+
+## Key Findings
+
+### Claude vs Itself (5/5 PASSED, 53.4 min)
+
+Self-evaluation with the same model. Opus generated and evaluated its own code. All 5 sprints passed, with Sprint 3 needing a retry. Final sprint covered performance criteria (constant memory, no event loop blocking, time-to-first-token) — all passed with 7-8/10 scores.
+
+**Concern:** Self-evaluation may be sycophantic. The same model may not catch its own blind spots.
+
+### Codex Alone (0/1 FAILED, 59.6 min)
+
+GPT-5.4 as both generator and evaluator. **Never wrote any code.** After 4 attempts, Sprint 1 scored 1/10 on all 15 criteria. The evaluator found:
+
+> "No incremental stream path exists in the submitted app."
+> "The app cannot be started, so first-token behavior is untestable."
+> "Starting the expected app entrypoint failed with MODULE_NOT_FOUND."
+
+**Root cause:** Codex CLI has context window limits (~272K), sandbox constraints (~200 files, ~10MB), and an auto-compression bug that causes it to lose context on long-running tasks. It couldn't sustain multi-sprint autonomous coding.
+
+### Mixed: Opus Generates, GPT-5.4 Evaluates (Sprint 4, 11/13)
+
+The adversarial matchup. Claude Opus generates code, GPT-5.4 rips it apart. This is where the GAN-inspired approach shines — zero sycophancy.
+
+Sprint 4 (attempt 2) scored 11/13 criteria passing. GPT-5.4 caught real issues:
+
+- **`http_401_403_token_refresh` (5/10):** "Only partially implemented. Expected one real OAuth refresh and one retry with a refreshed token."
+- **`repl_token_display_nonzero` (2/10):** "This fails in the shipped app. After a completed streamed response, the REPL/CLI display shows zero tokens."
+
+These are legitimate bugs that the Claude self-evaluation missed entirely.
+
+### Gemini Evaluates Opus (5/5 PASSED, 50.7 min)
+
+Claude Opus generates, Gemini 3.1 Pro evaluates using tool calling (`readFile`, `runCommand`, `listFiles`). Gemini was thorough: it ran the test suites (`bun test` reported 132 passing and 26 passing in separate suites) and read source files before scoring.
+
+Sprint 4 needed 2 attempts. Sprint 5 scored perfect 10/10 across all 14 criteria. Gemini's evaluations were detailed and evidence-based, citing specific file paths and line numbers.
+
+**Notable:** Gemini was tougher than Claude self-eval during Sprint 4 (it failed the first attempt, where self-eval passed first time) but more generous on final scores (10/10 vs 7-8/10). Its evaluation style leaned toward binary pass/fail judgments.
+
+## What the Adversarial Approach Caught
+
+Bugs that self-evaluation missed, found by cross-model evaluation (items 1-2) and by separate review tools and agents (items 3-7):
+
+1. **OAuth token refresh incomplete** — GPT-5.4 evaluator flagged partial implementation
+2. **REPL token display showing zero** — GPT-5.4 caught display layer bug
+3. **Division-by-zero in renegotiation logic** — CodeRabbit CLI review
+4. **Dead branch in renegotiation** — CodeRabbit CLI review
+5. **Symlink sandbox escape** — Code review agent
+6. **`node -e` arbitrary code execution** — Security review agent
+7. **`find -exec` subprocess spawn** — Security review agent
+
+## Structural Improvements Made
+
+Before running the harnesses, we hardened the original codebase:
+
+1. **Iterative contract negotiation**: up to 3 rounds of generator/evaluator back-and-forth instead of a single proposal and review
+2. **Fail-closed contract parsing**: throws on malformed JSON instead of falling back to a default contract (the Claude and Codex harnesses retry negotiation once before aborting)
+3. **Mid-sprint renegotiation**: once `retry >= 1`, triggers when the average score is below 4 or every criterion scores below its threshold
+4. **Gemini evaluator sandbox**: command allowlisting, path confinement with `realpath()`, read-only git, and `find -exec` blocking
+
+## Harness Architecture
+
+```
+                    Planner (Claude Opus)
+                         |
+                    spec.md (product spec)
+                         |
+              Contract Negotiation (3 rounds)
+                    /          \
+            Generator           Evaluator
+         (Claude Opus)       (varies by harness)
+              |                    |
+         builds code          scores 1-10
+              |                    |
+              +--- retry loop -----+
+                  (max 3 retries per sprint)
+```
+
+## Test Coverage
+
+29 tests across 2 suites:
+- `mixed-harness.test.ts` — 22 tests (parseContract, renegotiation triggers, parseEvalResult, negotiation rounds)
+- `conversation-logger.test.ts` — 7 tests (entry logging, markdown format, JSONL validity, disk save)
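
Reviewer note on the mid-sprint renegotiation rule listed under "Structural Improvements Made": the trigger can be read as a pure predicate over the last evaluation. The sketch below is a minimal illustration, not the harness code; `shouldRenegotiate`, `Feedback`, and `Criterion` are hypothetical names, and the defaults (fallback threshold 7, low-average cutoff 4) are taken from the values that appear in this diff. The empty-feedback guard matters because dividing by `feedback.length` would otherwise produce `NaN`.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Feedback {
  criterion: string;
  score: number;
}

interface Criterion {
  name: string;
  threshold: number;
}

// Returns true when the contract should be renegotiated:
// at least one retry has happened, and either every criterion is
// below its threshold or the average score is below `lowAverage`.
function shouldRenegotiate(
  retry: number,
  feedback: Feedback[],
  criteria: Criterion[],
  defaultThreshold = 7,
  lowAverage = 4,
): boolean {
  // Guard: no retries yet, or nothing to average (avoids 0 / 0).
  if (retry < 1 || feedback.length === 0) return false;

  const thresholds = new Map<string, number>(
    criteria.map((c): [string, number] => [c.name, c.threshold]),
  );
  const total = feedback.reduce((sum, f) => sum + f.score, 0);
  const average = total / feedback.length;
  const allFailing = feedback.every(
    (f) => f.score < (thresholds.get(f.criterion) ?? defaultThreshold),
  );

  return allFailing || average < lowAverage;
}
```

Keeping the rule in one function like this would also make it straightforward to unit test independently of the sprint loop.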
diff --git a/bun.lock b/bun.lock
index 8e4935d..f1af5c7 100644
--- a/bun.lock
+++ b/bun.lock
@@ -6,6 +6,7 @@
       "name": "adversarial-dev",
       "dependencies": {
         "@anthropic-ai/claude-agent-sdk": "^0.2.85",
+        "@google/genai": "^1.48.0",
         "@openai/codex-sdk": "^0.117.0",
       },
       "devDependencies": {
@@ -19,6 +20,8 @@
   "packages": {
     "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.2.85", "", { "optionalDependencies": { "@img/sharp-darwin-arm64": "^0.34.2", "@img/sharp-darwin-x64": "^0.34.2", "@img/sharp-linux-arm": "^0.34.2", "@img/sharp-linux-arm64": "^0.34.2", "@img/sharp-linux-x64": "^0.34.2", "@img/sharp-linuxmusl-arm64": "^0.34.2", "@img/sharp-linuxmusl-x64": "^0.34.2", "@img/sharp-win32-arm64": "^0.34.2", "@img/sharp-win32-x64": "^0.34.2" }, "peerDependencies": { "zod": "^4.0.0" } }, "sha512-/ohKLtP1zy6aWXLW/9KTYBveJPEtAfdO96qiP1Cl5S7LgVq/qRDUl7AUw5YGrBaK6YWHEE/rfMQZGwP/i5zIvQ=="],
 
+    "@google/genai": ["@google/genai@1.48.0", "", { "dependencies": { "google-auth-library": "^10.3.0", "p-retry": "^4.6.2", "protobufjs": "^7.5.4", "ws": "^8.18.0" }, "peerDependencies": { "@modelcontextprotocol/sdk": "^1.25.2" }, "optionalPeers": ["@modelcontextprotocol/sdk"] }, "sha512-plonYK4ML2PrxsRD9SeqmFt76eREWkQdPCglOA6aYDzL1AAbE+7PUnT54SvpWGfws13L0AZEqGSpL7+1IPnTxQ=="],
+
     "@img/sharp-darwin-arm64": ["@img/sharp-darwin-arm64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-arm64": "1.2.4" }, "os": "darwin", "cpu": "arm64" }, "sha512-imtQ3WMJXbMY4fxb/Ndp6HBTNVtWCUI0WdobyheGf5+ad6xX8VIDO8u2xE4qc/fr08CKG/7dDseFtn6M6g/r3w=="],
 
     "@img/sharp-darwin-x64": ["@img/sharp-darwin-x64@0.34.5", "", { "optionalDependencies": { "@img/sharp-libvips-darwin-x64": "1.2.4" }, "os": "darwin", "cpu": "x64" }, "sha512-YNEFAF/4KQ/PeW0N+r+aVVsoIY0/qxxikF2SWdp+NRkmMB7y9LBZAVqQ4yhGCm/H3H270OSykqmQMKLBhBJDEw=="],
@@ -67,16 +70,94 @@
 
     "@openai/codex-win32-x64": ["@openai/codex@0.117.0-win32-x64", "", { "os": "win32", "cpu": "x64" }, "sha512-ByedNwSlHJ4aE2++fBaUcaqbQsmx2dZS6mhrnv2SqbTY0saRFE2BT1R64fClt8TwXwMsQQn1uvkxjzU4aEhRcg=="],
 
+    "@protobufjs/aspromise": ["@protobufjs/aspromise@1.1.2", "", {}, "sha512-j+gKExEuLmKwvz3OgROXtrJ2UG2x8Ch2YZUxahh+s1F2HZ+wAceUNLkvy6zKCPVRkU++ZWQrdxsUeQXmcg4uoQ=="],
+
+    "@protobufjs/base64": ["@protobufjs/base64@1.1.2", "", {}, "sha512-AZkcAA5vnN/v4PDqKyMR5lx7hZttPDgClv83E//FMNhR2TMcLUhfRUBHCmSl0oi9zMgDDqRUJkSxO3wm85+XLg=="],
+
+    "@protobufjs/codegen": ["@protobufjs/codegen@2.0.4", "", {}, "sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg=="],
+
+    "@protobufjs/eventemitter": ["@protobufjs/eventemitter@1.1.0", "", {}, "sha512-j9ednRT81vYJ9OfVuXG6ERSTdEL1xVsNgqpkxMsbIabzSo3goCjDIveeGv5d03om39ML71RdmrGNjG5SReBP/Q=="],
+
+    "@protobufjs/fetch": ["@protobufjs/fetch@1.1.0", "", { "dependencies": { "@protobufjs/aspromise": "^1.1.1", "@protobufjs/inquire": "^1.1.0" } }, "sha512-lljVXpqXebpsijW71PZaCYeIcE5on1w5DlQy5WH6GLbFryLUrBD4932W/E2BSpfRJWseIL4v/KPgBFxDOIdKpQ=="],
+
+    "@protobufjs/float": ["@protobufjs/float@1.0.2", "", {}, "sha512-Ddb+kVXlXst9d+R9PfTIxh1EdNkgoRe5tOX6t01f1lYWOvJnSPDBlG241QLzcyPdoNTsblLUdujGSE4RzrTZGQ=="],
+
+    "@protobufjs/inquire": ["@protobufjs/inquire@1.1.0", "", {}, "sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q=="],
+
+    "@protobufjs/path": ["@protobufjs/path@1.1.2", "", {}, "sha512-6JOcJ5Tm08dOHAbdR3GrvP+yUUfkjG5ePsHYczMFLq3ZmMkAD98cDgcT2iA1lJ9NVwFd4tH/iSSoe44YWkltEA=="],
+
+    "@protobufjs/pool": ["@protobufjs/pool@1.1.0", "", {}, "sha512-0kELaGSIDBKvcgS4zkjz1PeddatrjYcmMWOlAuAPwAeccUrPHdUqo/J6LiymHHEiJT5NrF1UVwxY14f+fy4WQw=="],
+
+    "@protobufjs/utf8": ["@protobufjs/utf8@1.1.0", "", {}, "sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw=="],
+
     "@types/bun": ["@types/bun@1.3.11", "", { "dependencies": { "bun-types": "1.3.11" } }, "sha512-5vPne5QvtpjGpsGYXiFyycfpDF2ECyPcTSsFBMa0fraoxiQyMJ3SmuQIGhzPg2WJuWxVBoxWJ2kClYTcw/4fAg=="],
 
     "@types/node": ["@types/node@25.5.0", "", { "dependencies": { "undici-types": "~7.18.0" } }, "sha512-jp2P3tQMSxWugkCUKLRPVUpGaL5MVFwF8RDuSRztfwgN1wmqJeMSbKlnEtQqU8UrhTmzEmZdu2I6v2dpp7XIxw=="],
 
+    "@types/retry": ["@types/retry@0.12.0", "", {}, "sha512-wWKOClTTiizcZhXnPY4wikVAwmdYHp8q6DmC+EJUzAMsycb7HB32Kh9RN4+0gExjmPmZSAQjgURXIGATPegAvA=="],
+
+    "agent-base": ["agent-base@7.1.4", "", {}, "sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ=="],
+
+    "base64-js": ["base64-js@1.5.1", "", {}, "sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA=="],
+
+    "bignumber.js": ["bignumber.js@9.3.1", "", {}, "sha512-Ko0uX15oIUS7wJ3Rb30Fs6SkVbLmPBAKdlm7q9+ak9bbIeFf0MwuBsQV6z7+X768/cHsfg+WlysDWJcmthjsjQ=="],
+
+    "buffer-equal-constant-time": ["buffer-equal-constant-time@1.0.1", "", {}, "sha512-zRpUiDwd/xk6ADqPMATG8vc9VPrkck7T07OIx0gnjmJAnHnTVXNQG3vfvWNuiZIkwu9KrKdA1iJKfsfTVxE6NA=="],
+
     "bun-types": ["bun-types@1.3.11", "", { "dependencies": { "@types/node": "*" } }, "sha512-1KGPpoxQWl9f6wcZh57LvrPIInQMn2TQ7jsgxqpRzg+l0QPOFvJVH7HmvHo/AiPgwXy+/Thf6Ov3EdVn1vOabg=="],
 
+    "data-uri-to-buffer": ["data-uri-to-buffer@4.0.1", "", {}, "sha512-0R9ikRb668HB7QDxT1vkpuUBtqc53YyAwMwGeUFKRojY/NWKvdZ+9UYtRfGmhqNbRkTSVpMbmyhXipFFv2cb/A=="],
+
+    "debug": ["debug@4.4.3", "", { "dependencies": { "ms": "^2.1.3" } }, "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA=="],
+
+    "ecdsa-sig-formatter": ["ecdsa-sig-formatter@1.0.11", "", { "dependencies": { "safe-buffer": "^5.0.1" } }, "sha512-nagl3RYrbNv6kQkeJIpt6NJZy8twLB/2vtz6yN9Z4vRKHN4/QZJIEbqohALSgwKdnksuY3k5Addp5lg8sVoVcQ=="],
+
+    "extend": ["extend@3.0.2", "", {}, "sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g=="],
+
+    "fetch-blob": ["fetch-blob@3.2.0", "", { "dependencies": { "node-domexception": "^1.0.0", "web-streams-polyfill": "^3.0.3" } }, "sha512-7yAQpD2UMJzLi1Dqv7qFYnPbaPx7ZfFK6PiIxQ4PfkGPyNyl2Ugx+a/umUonmKqjhM4DnfbMvdX6otXq83soQQ=="],
+
+    "formdata-polyfill": ["formdata-polyfill@4.0.10", "", { "dependencies": { "fetch-blob": "^3.1.2" } }, "sha512-buewHzMvYL29jdeQTVILecSaZKnt/RJWjoZCF5OW60Z67/GmSLBkOFM7qh1PI3zFNtJbaZL5eQu1vLfazOwj4g=="],
+
+    "gaxios": ["gaxios@7.1.4", "", { "dependencies": { "extend": "^3.0.2", "https-proxy-agent": "^7.0.1", "node-fetch": "^3.3.2" } }, "sha512-bTIgTsM2bWn3XklZISBTQX7ZSddGW+IO3bMdGaemHZ3tbqExMENHLx6kKZ/KlejgrMtj8q7wBItt51yegqalrA=="],
+
+    "gcp-metadata": ["gcp-metadata@8.1.2", "", { "dependencies": { "gaxios": "^7.0.0", "google-logging-utils": "^1.0.0", "json-bigint": "^1.0.0" } }, "sha512-zV/5HKTfCeKWnxG0Dmrw51hEWFGfcF2xiXqcA3+J90WDuP0SvoiSO5ORvcBsifmx/FoIjgQN3oNOGaQ5PhLFkg=="],
+
+    "google-auth-library": ["google-auth-library@10.6.2", "", { "dependencies": { "base64-js": "^1.3.0", "ecdsa-sig-formatter": "^1.0.11", "gaxios": "^7.1.4", "gcp-metadata": "8.1.2", "google-logging-utils": "1.1.3", "jws": "^4.0.0" } }, "sha512-e27Z6EThmVNNvtYASwQxose/G57rkRuaRbQyxM2bvYLLX/GqWZ5chWq2EBoUchJbCc57eC9ArzO5wMsEmWftCw=="],
+
+    "google-logging-utils": ["google-logging-utils@1.1.3", "", {}, "sha512-eAmLkjDjAFCVXg7A1unxHsLf961m6y17QFqXqAXGj/gVkKFrEICfStRfwUlGNfeCEjNRa32JEWOUTlYXPyyKvA=="],
+
+    "https-proxy-agent": ["https-proxy-agent@7.0.6", "", { "dependencies": { "agent-base": "^7.1.2", "debug": "4" } }, "sha512-vK9P5/iUfdl95AI+JVyUuIcVtd4ofvtrOr3HNtM2yxC9bnMbEdp3x01OhQNnjb8IJYi38VlTE3mBXwcfvywuSw=="],
+
+    "json-bigint": ["json-bigint@1.0.0", "", { "dependencies": { "bignumber.js": "^9.0.0" } }, "sha512-SiPv/8VpZuWbvLSMtTDU8hEfrZWg/mH/nV/b4o0CYbSxu1UIQPLdwKOCIyLQX+VIPO5vrLX3i8qtqFyhdPSUSQ=="],
+
+    "jwa": ["jwa@2.0.1", "", { "dependencies": { "buffer-equal-constant-time": "^1.0.1", "ecdsa-sig-formatter": "1.0.11", "safe-buffer": "^5.0.1" } }, "sha512-hRF04fqJIP8Abbkq5NKGN0Bbr3JxlQ+qhZufXVr0DvujKy93ZCbXZMHDL4EOtodSbCWxOqR8MS1tXA5hwqCXDg=="],
+
+    "jws": ["jws@4.0.1", "", { "dependencies": { "jwa": "^2.0.1", "safe-buffer": "^5.0.1" } }, "sha512-EKI/M/yqPncGUUh44xz0PxSidXFr/+r0pA70+gIYhjv+et7yxM+s29Y+VGDkovRofQem0fs7Uvf4+YmAdyRduA=="],
+
+    "long": ["long@5.3.2", "", {}, "sha512-mNAgZ1GmyNhD7AuqnTG3/VQ26o760+ZYBPKjPvugO8+nLbYfX6TVpJPseBvopbdY+qpZ/lKUnmEc1LeZYS3QAA=="],
+
+    "ms": ["ms@2.1.3", "", {}, "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="],
+
+    "node-domexception": ["node-domexception@1.0.0", "", {}, "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ=="],
+
+    "node-fetch": ["node-fetch@3.3.2", "", { "dependencies": { "data-uri-to-buffer": "^4.0.0", "fetch-blob": "^3.1.4", "formdata-polyfill": "^4.0.10" } }, "sha512-dRB78srN/l6gqWulah9SrxeYnxeddIG30+GOqK/9OlLVyLg3HPnr6SqOWTWOXKRwC2eGYCkZ59NNuSgvSrpgOA=="],
+
+    "p-retry": ["p-retry@4.6.2", "", { "dependencies": { "@types/retry": "0.12.0", "retry": "^0.13.1" } }, "sha512-312Id396EbJdvRONlngUx0NydfrIQ5lsYu0znKVUzVvArzEIt08V1qhtyESbGVd1FGX7UKtiFp5uwKZdM8wIuQ=="],
+
+    "protobufjs": ["protobufjs@7.5.4", "", { "dependencies": { "@protobufjs/aspromise": "^1.1.2", "@protobufjs/base64": "^1.1.2", "@protobufjs/codegen": "^2.0.4", "@protobufjs/eventemitter": "^1.1.0", "@protobufjs/fetch": "^1.1.0", "@protobufjs/float": "^1.0.2", "@protobufjs/inquire": "^1.1.0", "@protobufjs/path": "^1.1.2", "@protobufjs/pool": "^1.1.0", "@protobufjs/utf8": "^1.1.0", "@types/node": ">=13.7.0", "long": "^5.0.0" } }, "sha512-CvexbZtbov6jW2eXAvLukXjXUW1TzFaivC46BpWc/3BpcCysb5Vffu+B3XHMm8lVEuy2Mm4XGex8hBSg1yapPg=="],
+
+    "retry": ["retry@0.13.1", "", {}, "sha512-XQBQ3I8W1Cge0Seh+6gjj03LbmRFWuoszgK9ooCpwYIrhhoO80pfq4cUkU5DkknwfOfFteRwlZ56PYOGYyFWdg=="],
+
+    "safe-buffer": ["safe-buffer@5.2.1", "", {}, "sha512-rp3So07KcdmmKbGvgaNxQSJr7bGVSVk5S9Eq1F+ppbRo70+YeaDxkw5Dd8NPN+GD6bjnYm2VuPuCXmpuYvmCXQ=="],
+
     "typescript": ["typescript@6.0.2", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-bGdAIrZ0wiGDo5l8c++HWtbaNCWTS4UTv7RaTH/ThVIgjkveJt83m74bBHMJkuCbslY8ixgLBVZJIOiQlQTjfQ=="],
 
     "undici-types": ["undici-types@7.18.2", "", {}, "sha512-AsuCzffGHJybSaRrmr5eHr81mwJU3kjw6M+uprWvCXiNeN9SOGwQ3Jn8jb8m3Z6izVgknn1R0FTCEAP2QrLY/w=="],
 
+    "web-streams-polyfill": ["web-streams-polyfill@3.3.3", "", {}, "sha512-d2JWLCivmZYTSIoge9MsgFCZrt571BikcWGYkjC1khllbTeDlGqZ2D8vD8E/lJa8WGWbb7Plm8/XJYV7IJHZZw=="],
+
+    "ws": ["ws@8.20.0", "", { "peerDependencies": { "bufferutil": "^4.0.1", "utf-8-validate": ">=5.0.2" }, "optionalPeers": ["bufferutil", "utf-8-validate"] }, "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA=="],
+
     "zod": ["zod@4.3.6", "", {}, "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg=="],
   }
 }
diff --git a/claude-harness/evaluator.ts b/claude-harness/evaluator.ts
index 77730b2..ad36293 100644
--- a/claude-harness/evaluator.ts
+++ b/claude-harness/evaluator.ts
@@ -93,7 +93,7 @@ function parseEvalResult(
   for (const candidate of candidates) {
     try {
       const parsed = JSON.parse(candidate) as EvalResult;
-      if (parsed.feedback && Array.isArray(parsed.feedback)) {
+      if (parsed.feedback && Array.isArray(parsed.feedback) && parsed.feedback.length > 0) {
         // Recalculate passed based on threshold
         parsed.passed = parsed.feedback.every((f) => f.score >= passThreshold);
         return parsed;
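
Reviewer note on the `parsed.feedback.length > 0` guard above: `Array.prototype.every` returns `true` for an empty array, so without the guard an evaluator response with `feedback: []` would be recalculated as `passed = true`. The sketch below illustrates the behaviour in isolation; `Feedback` and `isPassing` are hypothetical names used for illustration, not the harness's actual types or helpers.

```typescript
// Hypothetical shape for illustration; the real EvalResult type lives in the harness.
interface Feedback {
  criterion: string;
  score: number;
}

// Returns true only when there is at least one score and all of them
// meet the threshold. An empty list is treated as a failure.
function isPassing(feedback: Feedback[], passThreshold: number): boolean {
  if (feedback.length === 0) return false;
  return feedback.every((f) => f.score >= passThreshold);
}

const empty: Feedback[] = [];
console.log(empty.every((f) => f.score >= 7)); // true (vacuous truth)
console.log(isPassing(empty, 7)); // false
```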
diff --git a/claude-harness/harness.ts b/claude-harness/harness.ts
index d947913..dac4093 100644
--- a/claude-harness/harness.ts
+++ b/claude-harness/harness.ts
@@ -10,7 +10,6 @@ import {
   writeSpec,
   readSpec,
   writeContract,
-  readContract,
   writeFeedback,
   writeProgress,
 } from "../shared/files.ts";
@@ -87,7 +86,22 @@ export async function runHarness(config: HarnessConfig): Promise<HarnessResult>
     await writeProgress(config.workDir, progress);
 
     log("HARNESS", "Negotiating sprint contract...");
-    const contract = await negotiateContract(config.workDir, spec, sprint);
+    let contract: SprintContract;
+    let negotiationAttempts = 0;
+    const maxNegotiationAttempts = 2;
+    while (true) {
+      try {
+        contract = await negotiateContract(config.workDir, spec, sprint);
+        break;
+      } catch (e) {
+        negotiationAttempts++;
+        if (negotiationAttempts >= maxNegotiationAttempts) {
+          logError("HARNESS", `Contract negotiation failed after ${negotiationAttempts} attempts: ${e}`);
+          throw e;
+        }
+        log("HARNESS", `Contract negotiation produced invalid output, retrying (${negotiationAttempts}/${maxNegotiationAttempts})...`);
+      }
+    }
     await writeContract(config.workDir, contract);
     log("HARNESS", `Contract agreed: ${contract.criteria.length} criteria for ${contract.features.length} features`);
 
@@ -121,6 +135,28 @@ export async function runHarness(config: HarnessConfig): Promise<HarnessResult>
 
       if (retry < config.maxRetriesPerSprint) {
         log("HARNESS", `Sprint ${sprint} failed attempt ${attempts}, retrying...`);
+
+        // Check if we should renegotiate criteria
+        if (retry >= 1 && lastEval && lastEval.feedback.length > 0) {
+          const avgScore = lastEval.feedback.reduce((sum, f) => sum + f.score, 0) / lastEval.feedback.length;
+          const allFailing = lastEval.feedback.every(f => f.score < (contract.criteria.find(c => c.name === f.criterion)?.threshold ?? 7));
+
+          // Renegotiate if average score is very low or all criteria are failing
+          if (allFailing || avgScore < 4) {
+            if (allFailing) {
+              log("HARNESS", `All criteria failing (avg score: ${avgScore.toFixed(1)}), renegotiating contract...`);
+            } else {
+              log("HARNESS", `Low average score (${avgScore.toFixed(1)}), renegotiating contract...`);
+            }
+            try {
+              contract = await negotiateContract(config.workDir, spec, sprint);
+              await writeContract(config.workDir, contract);
+              log("HARNESS", `Renegotiated contract: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+            } catch (e) {
+              logError("HARNESS", `Renegotiation failed, continuing with current contract: ${e}`);
+            }
+          }
+        }
       } else {
         logError("HARNESS", `Sprint ${sprint} FAILED after ${attempts} attempts`);
       }
@@ -161,60 +197,97 @@ async function negotiateContract(
   spec: string,
   sprintNumber: number,
 ): Promise<SprintContract> {
-  // Generator proposes contract
-  const proposalPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
-
-  const proposalOptions: Options = {
-    cwd: workDir,
-    systemPrompt: CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
-    permissionMode: "bypassPermissions",
-    allowDangerouslySkipPermissions: true,
-    tools: ["Read"],
-    model: CLAUDE_MODEL,
-    maxTurns: 10,
-    persistSession: false,
-  };
-
+  const maxRounds = 3;
+  let round = 0;
   let proposalText = "";
-  for await (const msg of query({ prompt: proposalPrompt, options: proposalOptions })) {
-    if (msg.type === "assistant") {
-      const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
-      for (const block of message.message.content) {
-        if (block.type === "text" && block.text) {
-          proposalText += block.text;
+  let reviewText = "";
+  let approved = false;
+
+  while (round < maxRounds && !approved) {
+    round++;
+    log("HARNESS", `Contract negotiation round ${round}/${maxRounds}`);
+
+    // Generator proposes or counter-proposes
+    let generatorPrompt: string;
+    if (round === 1) {
+      // First round: initial proposal
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
+    } else {
+      // Subsequent rounds: counter-propose based on evaluator feedback
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\n## Evaluator Feedback\n\nThe evaluator reviewed the contract and provided this feedback:\n\n${reviewText}\n\nPlease revise the contract based on this feedback. If you accept the evaluator's version without changes, output exactly "APPROVED". Otherwise, output a revised contract.`;
+    }
+
+    const proposalOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    proposalText = "";
+    for await (const msg of query({ prompt: generatorPrompt, options: proposalOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            proposalText += block.text;
+          }
         }
       }
     }
-  }
 
-  // Evaluator reviews contract
-  const reviewPrompt = `## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
-
-  const reviewOptions: Options = {
-    cwd: workDir,
-    systemPrompt: CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
-    permissionMode: "bypassPermissions",
-    allowDangerouslySkipPermissions: true,
-    tools: ["Read"],
-    model: CLAUDE_MODEL,
-    maxTurns: 10,
-    persistSession: false,
-  };
+    // Check if generator approved (only in subsequent rounds)
+    if (round > 1 && proposalText.trim() === "APPROVED") {
+      approved = true;
+      log("HARNESS", "Generator accepted evaluator revisions, contract finalized");
+      break;
+    }
 
-  let reviewText = "";
-  for await (const msg of query({ prompt: reviewPrompt, options: reviewOptions })) {
-    if (msg.type === "assistant") {
-      const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
-      for (const block of message.message.content) {
-        if (block.type === "text" && block.text) {
-          reviewText += block.text;
+    // Evaluator reviews contract
+    const reviewPrompt = `## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
+
+    const reviewOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    reviewText = "";
+    for await (const msg of query({ prompt: reviewPrompt, options: reviewOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            reviewText += block.text;
+          }
         }
       }
     }
+
+    // Check if evaluator approved
+    if (reviewText.trim().toUpperCase().startsWith("APPROVED")) {
+      approved = true;
+      log("HARNESS", `Contract approved by evaluator in round ${round}`);
+      break;
+    }
+
+    // If not approved and we have reached max rounds, take evaluator version as final
+    if (round >= maxRounds) {
+      log("HARNESS", `Max negotiation rounds (${maxRounds}) reached, using evaluator version`);
+    }
   }
 
-  // Parse the final contract (either the proposal if approved, or the revised version)
-  const contractSource = reviewText.trim() === "APPROVED" ? proposalText : reviewText;
+  // Parse the final contract (either proposal if approved, or evaluator version)
+  const contractSource = reviewText.trim().toUpperCase().startsWith("APPROVED") ? proposalText : reviewText;
   return parseContract(contractSource, sprintNumber);
 }
 
@@ -241,28 +314,5 @@ function parseContract(text: string, sprintNumber: number): SprintContract {
     }
   }
 
-  {
-    logError("HARNESS", "Failed to parse contract JSON, creating default");
-    return {
-      sprintNumber,
-      features: [`Sprint ${sprintNumber} features`],
-      criteria: [
-        {
-          name: "basic_functionality",
-          description: "Core features for this sprint are implemented and working",
-          threshold: 7,
-        },
-        {
-          name: "code_quality",
-          description: "Code is clean, well-structured, and follows best practices",
-          threshold: 7,
-        },
-        {
-          name: "error_handling",
-          description: "Errors are handled gracefully with appropriate user feedback",
-          threshold: 7,
-        },
-      ],
-    };
-  }
+  throw new Error(`Contract negotiation produced unparseable output. Raw text: ${text.slice(0, 200)}`);
 }
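
Reviewer note on the fail-closed change above: replacing the default contract with a `throw` only works because the caller now retries negotiation a bounded number of times. A minimal sketch of that pairing follows. `Contract`, `parseContractStrict`, and `withRetries` are hypothetical names for illustration, the contract shape is simplified, and `maxAttempts` is assumed to be at least 1; this is not the harness's actual parser.

```typescript
// Hypothetical, simplified contract shape (the real SprintContract has more fields).
interface Contract {
  criteria: { name: string; threshold: number }[];
}

// Fail closed: throw on anything that is not a usable contract,
// rather than silently substituting a default.
function parseContractStrict(text: string): Contract {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    throw new Error(`Unparseable contract: ${text.slice(0, 200)}`);
  }
  const criteria = (parsed as { criteria?: unknown } | null)?.criteria;
  if (!Array.isArray(criteria) || criteria.length === 0) {
    throw new Error("Contract has no criteria");
  }
  return parsed as Contract;
}

// Calls `produce` up to `maxAttempts` times and rethrows the last error
// if every attempt fails.
async function withRetries<T>(
  produce: () => Promise<T>,
  maxAttempts: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await produce();
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}
```

With these two pieces, a caller could write something like `await withRetries(async () => parseContractStrict(await negotiate()), 2)`, where `negotiate` stands in for whatever produces the raw contract text.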
diff --git a/codex-harness/evaluator.ts b/codex-harness/evaluator.ts
index 4ae40fb..a935f8d 100644
--- a/codex-harness/evaluator.ts
+++ b/codex-harness/evaluator.ts
@@ -79,7 +79,7 @@ function parseEvalResult(
   for (const candidate of candidates) {
     try {
       const parsed = JSON.parse(candidate) as EvalResult;
-      if (parsed.feedback && Array.isArray(parsed.feedback)) {
+      if (parsed.feedback && Array.isArray(parsed.feedback) && parsed.feedback.length > 0) {
         parsed.passed = parsed.feedback.every((f) => f.score >= passThreshold);
         return parsed;
       }
diff --git a/codex-harness/harness.ts b/codex-harness/harness.ts
index 105c64d..53ef923 100644
--- a/codex-harness/harness.ts
+++ b/codex-harness/harness.ts
@@ -86,7 +86,22 @@ export async function runHarness(config: HarnessConfig): Promise<HarnessResult>
     await writeProgress(config.workDir, progress);
 
     log("HARNESS", "Negotiating sprint contract...");
-    const contract = await negotiateContract(config.workDir, spec, sprint);
+    let contract: SprintContract;
+    let negotiationAttempts = 0;
+    const maxNegotiationAttempts = 2;
+    while (true) {
+      try {
+        contract = await negotiateContract(config.workDir, spec, sprint);
+        break;
+      } catch (e) {
+        negotiationAttempts++;
+        if (negotiationAttempts >= maxNegotiationAttempts) {
+          logError("HARNESS", `Contract negotiation failed after ${negotiationAttempts} attempts: ${e}`);
+          throw e;
+        }
+        log("HARNESS", `Contract negotiation produced invalid output, retrying (${negotiationAttempts}/${maxNegotiationAttempts})...`);
+      }
+    }
     await writeContract(config.workDir, contract);
     log("HARNESS", `Contract agreed: ${contract.criteria.length} criteria for ${contract.features.length} features`);
 
@@ -120,6 +135,28 @@ export async function runHarness(config: HarnessConfig): Promise<HarnessResult>
 
       if (retry < config.maxRetriesPerSprint) {
         log("HARNESS", `Sprint ${sprint} failed attempt ${attempts}, retrying...`);
+
+        // Check if we should renegotiate criteria
+        if (retry >= 1 && lastEval && lastEval.feedback.length > 0) {
+          const avgScore = lastEval.feedback.reduce((sum, f) => sum + f.score, 0) / lastEval.feedback.length;
+          const allFailing = lastEval.feedback.every(f => f.score < (contract.criteria.find(c => c.name === f.criterion)?.threshold ?? 7));
+
+          // Renegotiate if average score is very low or all criteria are failing
+          if (allFailing || avgScore < 4) {
+            if (allFailing) {
+              log("HARNESS", `All criteria failing (avg score: ${avgScore.toFixed(1)}), renegotiating contract...`);
+            } else {
+              log("HARNESS", `Low average score (${avgScore.toFixed(1)}), renegotiating contract...`);
+            }
+            try {
+              contract = await negotiateContract(config.workDir, spec, sprint);
+              await writeContract(config.workDir, contract);
+              log("HARNESS", `Renegotiated contract: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+            } catch (e) {
+              logError("HARNESS", `Renegotiation failed, continuing with current contract: ${e}`);
+            }
+          }
+        }
       } else {
         logError("HARNESS", `Sprint ${sprint} FAILED after ${attempts} attempts`);
       }
@@ -159,37 +196,74 @@ async function negotiateContract(
   spec: string,
   sprintNumber: number,
 ): Promise<SprintContract> {
-  const codex = new Codex();
+  const maxRounds = 3;
+  let round = 0;
+  let proposalText = "";
+  let reviewText = "";
+  let approved = false;
+
+  while (round < maxRounds && !approved) {
+    round++;
+    log("HARNESS", `Contract negotiation round ${round}/${maxRounds}`);
+
+    const codex = new Codex();
+
+    // Generator proposes or counter-proposes
+    let generatorPrompt: string;
+    if (round === 1) {
+      // First round: initial proposal
+      generatorPrompt = `${CONTRACT_NEGOTIATION_GENERATOR_PROMPT}\n\n---\n\n## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
+    } else {
+      // Subsequent rounds: counter-propose based on evaluator feedback
+      generatorPrompt = `${CONTRACT_NEGOTIATION_GENERATOR_PROMPT}\n\n---\n\n## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\n## Evaluator Feedback\n\nThe evaluator reviewed the contract and provided this feedback:\n\n${reviewText}\n\nPlease revise the contract based on this feedback. If the evaluator approved, output "APPROVED". Otherwise, output a revised contract.`;
+    }
 
-  // Generator proposes contract
-  const proposalPrompt = `${CONTRACT_NEGOTIATION_GENERATOR_PROMPT}\n\n---\n\n## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
+    const proposalThread = codex.startThread({
+      workingDirectory: workDir,
+      sandboxMode: "danger-full-access",
+      networkAccessEnabled: CODEX_NETWORK_ACCESS,
+      approvalPolicy: "never",
+      model: CODEX_MODEL,
+    });
 
-  const proposalThread = codex.startThread({
-    workingDirectory: workDir,
-    sandboxMode: "danger-full-access",
-    networkAccessEnabled: CODEX_NETWORK_ACCESS,
-    approvalPolicy: "never",
-    model: CODEX_MODEL,
-  });
+    const proposalTurn = await proposalThread.run(generatorPrompt);
+    proposalText = proposalTurn.finalResponse ?? "";
 
-  const proposalTurn = await proposalThread.run(proposalPrompt);
-  const proposalText = proposalTurn.finalResponse ?? "";
+    // Check if generator approved (only in subsequent rounds)
+    if (round > 1 && proposalText.trim() === "APPROVED") {
+      approved = true;
+      log("HARNESS", "Generator accepted evaluator revisions, contract finalized");
+      break;
+    }
 
-  // Evaluator reviews contract
-  const reviewPrompt = `${CONTRACT_NEGOTIATION_EVALUATOR_PROMPT}\n\n---\n\n## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
+    // Evaluator reviews contract
+    const reviewPrompt = `${CONTRACT_NEGOTIATION_EVALUATOR_PROMPT}\n\n---\n\n## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
 
-  const reviewThread = codex.startThread({
-    workingDirectory: workDir,
-    sandboxMode: "danger-full-access",
-    networkAccessEnabled: CODEX_NETWORK_ACCESS,
-    approvalPolicy: "never",
-    model: CODEX_MODEL,
-  });
+    const reviewThread = codex.startThread({
+      workingDirectory: workDir,
+      sandboxMode: "danger-full-access",
+      networkAccessEnabled: CODEX_NETWORK_ACCESS,
+      approvalPolicy: "never",
+      model: CODEX_MODEL,
+    });
+
+    const reviewTurn = await reviewThread.run(reviewPrompt);
+    reviewText = reviewTurn.finalResponse ?? "";
+
+    // Check if evaluator approved
+    if (reviewText.trim().toUpperCase().startsWith("APPROVED")) {
+      approved = true;
+      log("HARNESS", `Contract approved by evaluator in round ${round}`);
+      break;
+    }
 
-  const reviewTurn = await reviewThread.run(reviewPrompt);
-  const reviewText = reviewTurn.finalResponse ?? "";
+    // If not approved and we have reached max rounds, take evaluator version as final
+    if (round >= maxRounds) {
+      log("HARNESS", `Max negotiation rounds (${maxRounds}) reached, using evaluator version`);
+    }
+  }
 
-  const contractSource = reviewText.trim() === "APPROVED" ? proposalText : reviewText;
+  const contractSource = reviewText.trim().toUpperCase().startsWith("APPROVED") ? proposalText : reviewText;
   return parseContract(contractSource, sprintNumber);
 }
 
@@ -216,28 +290,5 @@ function parseContract(text: string, sprintNumber: number): SprintContract {
     }
   }
 
-  {
-    logError("HARNESS", "Failed to parse contract JSON, creating default");
-    return {
-      sprintNumber,
-      features: [`Sprint ${sprintNumber} features`],
-      criteria: [
-        {
-          name: "basic_functionality",
-          description: "Core features for this sprint are implemented and working",
-          threshold: 7,
-        },
-        {
-          name: "code_quality",
-          description: "Code is clean, well-structured, and follows best practices",
-          threshold: 7,
-        },
-        {
-          name: "error_handling",
-          description: "Errors are handled gracefully with appropriate user feedback",
-          threshold: 7,
-        },
-      ],
-    };
-  }
+  throw new Error(`Contract negotiation produced unparseable output. Raw text: ${text.slice(0, 200)}`);
 }
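Review note: the renegotiation trigger added in the retry branch above is easier to reason about (and to unit test) as a pure function. The following is a minimal sketch under stated assumptions, not the harness's actual code: `Criterion`, `Feedback`, and `shouldRenegotiate` are hypothetical names introduced for illustration, and the constants simply mirror the `?? 7` fallback and the `avgScore < 4` cutoff in the hunk.

```typescript
// Hypothetical minimal shapes; the real types live in shared/types.ts.
interface Criterion { name: string; threshold: number }
interface Feedback { criterion: string; score: number }

const DEFAULT_THRESHOLD = 7; // mirrors the `?? 7` fallback in the diff
const LOW_AVERAGE = 4;       // mirrors the `avgScore < 4` cutoff in the diff

// Returns true when a failed attempt should trigger contract renegotiation:
// only from the second attempt on (retry >= 1), and only if every criterion
// is below its threshold or the average score is very low.
function shouldRenegotiate(retry: number, feedback: Feedback[], criteria: Criterion[]): boolean {
  if (retry < 1 || feedback.length === 0) return false;
  const avg = feedback.reduce((sum, f) => sum + f.score, 0) / feedback.length;
  const allFailing = feedback.every((f) => {
    const threshold = criteria.find((c) => c.name === f.criterion)?.threshold ?? DEFAULT_THRESHOLD;
    return f.score < threshold;
  });
  return allFailing || avg < LOW_AVERAGE;
}
```

Keeping the decision separate from the `negotiateContract` call would let both harness copies share one tested predicate, assuming the maintainers want that.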
diff --git a/examples/gemini-run-excerpt.md b/examples/gemini-run-excerpt.md
new file mode 100644
index 0000000..a1eadda
--- /dev/null
+++ b/examples/gemini-run-excerpt.md
@@ -0,0 +1,150 @@
+# Adversarial Dev Harness — Conversation Log
+Session: 2026-04-02T14-19-12
+Duration: 50.7 minutes
+Entries: 675
+
+---
+
+### ⚙️ HARNESS (system) — 💬 System
+*14:19:12*
+
+Gemini Harness started — Generator: Claude (claude-opus-4-6), Evaluator: Gemini (gemini-3.1-pro-preview)
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — **→ Prompt**
+*14:19:12*
+
+<details>
+<summary>Show full content (3748 chars)</summary>
+
+IMPORTANT: Your working directory is /home/player3vsgpt/Desktop/Projects/adversarial-dev-hardening/workspace/gemini. All files you create (including spec.md) MUST be written inside this directory. Do NOT write files anywhere else.
+
+## Task: Wire SSE Streaming into brane-code's Codex Proxy
+
+brane-code is a fork of Claude Code that routes API calls to OpenAI's Codex/GPT-5.4 backend. The REPL works and first message round-trip works, but second+ messages hang because the response is buffered instead of streamed.
+
+### The Bug
+
+`src/providers/openai/index.ts` function `sendToCodex()` at line 79 does:
+```typescript
+const body = await resp.text()  // BUFFERS ENTIRE SSE RESPONSE
+```
+
+This needs to become a streaming SSE parser that reads `resp.body` as a ReadableStream and yields Anthropic-shaped stream events (`message_start`, `content_block_delta`, `message_stop`) as SSE chunks arrive from the Codex API.
+
+### What the Codex API Returns (SSE format)
+
+POST to `https://chatgpt.com/backend-api/codex/responses` with `stream: true` returns SSE lines:
+- `data: {"type": "response.output_text.delta", "delta": "chunk of text"}`
+- `data: {"type": "response.completed", "response": {"output_text": "full text"}}`
+- `data: [DONE]`
+
+### What queryModel Expects (Anthropic stream events)
+
+`src/services/api/claude.ts` at line 1991 does `for await (const part of stream)` where each `part` is a `BetaRawMessageStreamEvent`:
+- `{type: 'message_start', message: {id, model, role, usage}}`
+- `{type: 'content_block_start', index: 0, content_block: {type: 'text', text: ''}}`
+- `{type: 'content_block_delta', index: 0, delta: {type: 'text_delta', text: 'chunk'}}`
+- `{type: 'content_block_stop', index: 0}`
+- `{type: 'message_delta', delta: {stop_reason: 'end_turn'}, usage: {output_tokens: N}}`
+- `{type: 'message_stop'}`
+
+### Implementation Plan
+
+1. Create `src/providers/openai/stream.ts` (~120 lines) — SSE parser that yields Anthropic-shaped events
+2. Wire into `src/services/api/claude.ts` at the OpenAI intercept point (~line 1820) — replace the `sendToCodex()` call with `streamFromCodex()` that yields into the existing stream processing loop
+3. Remove the dead code: the `sendToCodex()` call in claude.ts that passes phantom `apiKey`/`accountId` params
+4. Remove the redundant JS Proxy intercept in `src/services/api/client.ts` if the queryModel-level intercept handles everything
+5. Test: verify multi-turn conversation works, tokens stream in real-time, second message doesn't hang
+
+### Reference: Nightfox streaming (READ ONLY — do not modify)
+
+Nightfox at `/home/player3vsgpt/Desktop/Projects/nightfox/src/providers/openai-provider.ts` lines 330-390 already handles OpenAI streaming:
+- Uses `@openai/agents` SDK `run()` with `stream: true`
+- Iterates with `for await (const event of streamed)`
+- Event type `raw_model_stream_event` with `event.data.type === 'output_text_delta'` and `event.data.delta` for text chunks
+- Lines 687-714: `handleStreamEvent()` extracts text deltas
+
+### Key Files in This Repo
+
+- `src/providers/openai/index.ts` — current sendToCodex() that needs streaming
+- `src/providers/openai/auth.ts` — OAuth token management (getValidToken)
+- `src/services/api/claude.ts` — queryModel() stream consumer (~line 1820-2060)
+- `src/services/api/client.ts` — JS Proxy intercept (~line 101-151)
+- `src/main.tsx` — BRANE_OPENAI_MODE fast-path (~line 2250)
+
+### Tech Stack
+
+- TypeScript, Bun runtime
+- No new dependencies — use native fetch + ReadableStream
+- Keep Anthropic SDK types as internal wire format
+
+### What NOT To Do
+
+- Do NOT modify Nightfox files — read-only reference
+- Do NOT add `@openai/agents` as a dependency
+- Do NOT change the Anthropic SDK type system
+- Do NOT touch auth logic — OAuth refresh already works
+
+</details>
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — 🔧 Tool Call
+*14:19:19*
+> Tool: `Read`
+
+<details>
+<summary>Read input</summary>
+
+```
+{"file_path":"/home/player3vsgpt/Desktop/Projects/adversarial-dev-hardening/workspace/gemini/src/providers/openai/index.ts"}
+```
+</details>
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — 🔧 Tool Call
+*14:19:20*
+> Tool: `Read`
+
+<details>
+<summary>Read input</summary>
+
+```
+{"file_path":"/home/player3vsgpt/Desktop/Projects/adversarial-dev-hardening/workspace/gemini/src/services/api/claude.ts","offset":1800,"limit":280}
+```
+</details>
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — 🔧 Tool Call
+*14:19:21*
+> Tool: `Read`
+
+<details>
+<summary>Read input</summary>
+
+```
+{"file_path":"/home/player3vsgpt/Desktop/Projects/adversarial-dev-hardening/workspace/gemini/src/services/api/client.ts","offset":90,"limit":80}
+```
+</details>
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — 🔧 Tool Call
+*14:19:22*
+> Tool: `Read`
+
+<details>
+<summary>Read input</summary>
+
+```
+{"file_path":"/home/player3vsgpt/Desktop/Projects/nightfox/src/providers/openai-provider.ts","offset":320,"limit":80}
+```
+</details>
+
+---
+
+### 📋 PLANNER (claude-opus-4-6) — 🔧 Tool Call
diff --git a/gemini-harness/evaluator.ts b/gemini-harness/evaluator.ts
new file mode 100644
index 0000000..8674652
--- /dev/null
+++ b/gemini-harness/evaluator.ts
@@ -0,0 +1,258 @@
+import { GoogleGenAI, Type } from "@google/genai";
+import type { Tool } from "@google/genai";
+import { readFile, readdir, realpath } from "fs/promises";
+import { execFileSync } from "child_process";
+import { resolve } from "path";
+import { EVALUATOR_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { GEMINI_MODEL, GEMINI_API_KEY } from "../shared/config.ts";
+import { log, logError } from "../shared/logger.ts";
+import type { SprintContract, EvalResult } from "../shared/types.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runEvaluator(
+  workDir: string,
+  contract: SprintContract,
+  passThreshold: number,
+  clog: ConversationLogger,
+): Promise<EvalResult> {
+  const sprint = contract.sprintNumber;
+  log("EVALUATOR", `[Gemini/${GEMINI_MODEL}] Evaluating sprint ${sprint} against ${contract.criteria.length} criteria`);
+
+  const taskPrompt = `## Sprint Contract to Evaluate Against
+
+${JSON.stringify(contract, null, 2)}
+
+## Pass Threshold
+
+Each criterion must score at least ${passThreshold}/10 to pass.
+
+## Instructions
+
+Examine the application in the \`app/\` directory. Read the code, run it if possible, and score each criterion. Output ONLY the JSON evaluation object.`;
+
+  clog.prompt("EVALUATOR", GEMINI_MODEL, taskPrompt, { sprint });
+
+  const startMs = Date.now();
+
+  const ai = new GoogleGenAI({ apiKey: GEMINI_API_KEY });
+
+  const tools: Tool[] = [
+    {
+      functionDeclarations: [
+        {
+          name: "readFile",
+          description: "Read a file from the workspace",
+          parameters: {
+            type: Type.OBJECT,
+            properties: {
+              path: { type: Type.STRING, description: "Path to the file to read" },
+            },
+            required: ["path"],
+          },
+        },
+        {
+          name: "runCommand",
+          description: "Run a shell command in the workspace",
+          parameters: {
+            type: Type.OBJECT,
+            properties: {
+              command: { type: Type.STRING, description: "Shell command to execute" },
+            },
+            required: ["command"],
+          },
+        },
+        {
+          name: "listFiles",
+          description: "List files in a directory",
+          parameters: {
+            type: Type.OBJECT,
+            properties: {
+              path: { type: Type.STRING, description: "Directory path to list" },
+            },
+            required: ["path"],
+          },
+        },
+      ],
+    },
+  ];
+
+  const chat = ai.chats.create({
+    model: GEMINI_MODEL,
+    config: {
+      systemInstruction: EVALUATOR_SYSTEM_PROMPT,
+      tools,
+    },
+  });
+
+  let response = await chat.sendMessage({ message: taskPrompt });
+
+  // Handle tool calls in a loop
+  while (response.functionCalls && response.functionCalls.length > 0) {
+    const toolResults: Array<{ functionResponse: { id: string; name: string; response: { result: string } } }> = [];
+
+    for (const call of response.functionCalls) {
+      let result: string;
+      const callName = call.name ?? "";
+      const callArgs = (call.args ?? {}) as Record<string, string>;
+
+      log("EVALUATOR", `  Tool: ${callName}`);
+      clog.toolCall("EVALUATOR", GEMINI_MODEL, callName, JSON.stringify(callArgs).slice(0, 500));
+
+      if (callName === "readFile") {
+        const filePath = resolve(workDir, callArgs.path ?? "");
+        try {
+          const real = await realpath(filePath);
+          if (!`${real}/`.startsWith(`${await realpath(workDir)}/`)) { // trailing "/" rejects sibling dirs such as <workDir>-old
+            result = "Error: path outside workspace";
+          } else {
+            result = await readFile(real, "utf-8");
+          }
+        } catch (err) {
+          result = `Error reading file: ${err instanceof Error ? err.message : String(err)}`;
+        }
+      } else if (callName === "runCommand") {
+        // Read-only commands safe for evaluation. No code execution (node/bun/npm).
+        const ALLOWED_CMDS = ["ls", "cat", "grep", "head", "tail", "wc", "diff", "find", "tsc"];
+        const GIT_READ_ONLY = ["log", "status", "diff", "show", "ls-files", "rev-parse"];
+        const command = callArgs.command ?? "";
+        const parts = command.trim().split(/\s+/);
+        const bin = parts[0] ?? "";
+        const args = parts.slice(1);
+
+        // Special handling for git: only allow read-only subcommands
+        if (bin === "git") {
+          const subCmd = args[0] ?? "";
+          if (!GIT_READ_ONLY.includes(subCmd)) {
+            result = `Error: git subcommand '${subCmd}' not allowed. Allowed: ${GIT_READ_ONLY.join(", ")}`;
+          } else {
+            try {
+              result = execFileSync("git", args, { cwd: workDir, timeout: 30000 }).toString();
+            } catch (err) {
+              result = `Command error: ${err instanceof Error ? err.message : String(err)}`;
+            }
+          }
+        } else if (!ALLOWED_CMDS.includes(bin)) {
+          result = `Error: command '${bin}' not allowed. Allowed: git, ${ALLOWED_CMDS.join(", ")}`;
+        } else {
+          // Block dangerous find actions (-exec, -ok, -delete, -fprint*), `..` path segments, and absolute paths outside workspace
+          const BLOCKED_FLAGS = ["-exec", "-execdir", "-ok", "-okdir", "-delete", "-fls", "-fprint", "-fprint0", "-fprintf"];
+          const resolvedWorkDir = resolve(workDir);
+          const hasDangerousFlag = args.some(a => BLOCKED_FLAGS.includes(a));
+          const hasEscape = args.some(a => a.split("/").includes("..") || (a.startsWith("/") && !`${a}/`.startsWith(`${resolvedWorkDir}/`)));
+          if (hasDangerousFlag) {
+            result = "Error: dangerous flag detected (e.g. -exec). Not allowed in sandbox.";
+          } else if (hasEscape) {
+            result = "Error: command arguments reference paths outside workspace";
+          } else {
+            try {
+              result = execFileSync(bin, args, { cwd: workDir, timeout: 30000 }).toString();
+            } catch (err) {
+              result = `Command error: ${err instanceof Error ? err.message : String(err)}`;
+            }
+          }
+        }
+      } else if (callName === "listFiles") {
+        const dirPath = resolve(workDir, callArgs.path ?? ".");
+        try {
+          const real = await realpath(dirPath);
+          if (!`${real}/`.startsWith(`${await realpath(workDir)}/`)) { // trailing "/" rejects sibling dirs such as <workDir>-old
+            result = "Error: path outside workspace";
+          } else {
+            const entries = await readdir(real, { recursive: true });
+            result = entries.slice(0, 50).join("\n");
+          }
+        } catch (err) {
+          result = `Error listing files: ${err instanceof Error ? err.message : String(err)}`;
+        }
+      } else {
+        result = `Unknown tool: ${callName}`;
+      }
+
+      clog.toolResult("EVALUATOR", GEMINI_MODEL, callName, result.slice(0, 500));
+
+      toolResults.push({
+        functionResponse: {
+          id: call.id ?? callName,
+          name: callName,
+          response: { result },
+        },
+      });
+    }
+
+    // Send tool results back
+    response = await chat.sendMessage({
+      message: toolResults.map((r) => ({ functionResponse: r.functionResponse })),
+    });
+  }
+
+  // Extract final text response
+  const evaluationText = response.text ?? "";
+  const durationMs = Date.now() - startMs;
+
+  log("EVALUATOR", `Evaluation complete for sprint ${sprint}`);
+
+  const evalResult = parseEvalResult(evaluationText, contract, passThreshold);
+
+  // Build scores map for logging
+  const scoresMap: Record<string, number> = {};
+  for (const f of evalResult.feedback) {
+    scoresMap[f.criterion] = f.score;
+  }
+
+  clog.response("EVALUATOR", GEMINI_MODEL, evaluationText, {
+    sprint,
+    duration_ms: durationMs,
+    scores: scoresMap,
+  });
+
+  const passedCount = evalResult.feedback.filter((f) => f.score >= passThreshold).length;
+  const totalCount = evalResult.feedback.length;
+  const verdict = evalResult.passed ? "PASSED" : "FAILED";
+  log("EVALUATOR", `Sprint ${sprint}: ${verdict} (${passedCount}/${totalCount} criteria passed)`);
+
+  for (const item of evalResult.feedback) {
+    const status = item.score >= passThreshold ? "\x1b[32mPASS\x1b[0m" : "\x1b[31mFAIL\x1b[0m";
+    log("EVALUATOR", `  [${status}] ${item.criterion}: ${item.score}/10 - ${item.details.slice(0, 100)}`);
+  }
+
+  return evalResult;
+}
+
+function parseEvalResult(
+  response: string,
+  contract: SprintContract,
+  passThreshold: number,
+): EvalResult {
+  const candidates: string[] = [];
+  const codeBlocks = [...response.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+  for (const match of codeBlocks.reverse()) {
+    if (match[1]) candidates.push(match[1].trim());
+  }
+  const braceMatch = response.match(/\{[\s\S]*"passed"[\s\S]*"feedback"[\s\S]*\}/);
+  if (braceMatch) candidates.push(braceMatch[0]);
+  candidates.push(response.trim());
+
+  for (const candidate of candidates) {
+    try {
+      const parsed = JSON.parse(candidate) as EvalResult;
+      if (parsed.feedback && Array.isArray(parsed.feedback) && parsed.feedback.length > 0) {
+        parsed.passed = parsed.feedback.every((f) => f.score >= passThreshold);
+        return parsed;
+      }
+    } catch {
+      // Try next candidate
+    }
+  }
+
+  logError("EVALUATOR", "Failed to parse evaluation JSON from any extraction strategy");
+  return {
+    passed: false,
+    scores: {},
+    feedback: contract.criteria.map((c) => ({
+      criterion: c.name,
+      score: 0,
+      details: "Evaluator failed to produce parseable output",
+    })),
+    overallSummary: "Evaluation parsing failed. Raw response: " + response.slice(0, 500),
+  };
+}
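Review note: the `readFile` and `listFiles` tools above decide whether a path is inside the workspace with a string-prefix test. A prefix test is easy to get subtly wrong, because `/work/app-evil` starts with `/work/app`. One alternative, sketched below as an assumption rather than as the harness's actual approach, is to compare paths with `path.relative`. `isInsideWorkspace` is a hypothetical helper name. The check is purely lexical, so it would still need to run on `realpath` output, as the evaluator already does, to account for symlinks.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Hypothetical helper: true when `candidate` (absolute, or relative to `root`)
// resolves to `root` itself or to a location beneath it.
function isInsideWorkspace(root: string, candidate: string): boolean {
  const rel = relative(resolve(root), resolve(root, candidate));
  if (rel === "") return true;                                  // the root itself
  if (rel === ".." || rel.startsWith(".." + sep)) return false; // climbs out of root
  return !isAbsolute(rel);                                      // e.g. another drive on Windows
}
```

For example, `isInsideWorkspace("/work/app", "src/index.ts")` accepts the path, while `isInsideWorkspace("/work/app", "/work/app-evil/x")` rejects it even though the two strings share a prefix.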
diff --git a/gemini-harness/generator.ts b/gemini-harness/generator.ts
new file mode 100644
index 0000000..5b1a5ba
--- /dev/null
+++ b/gemini-harness/generator.ts
@@ -0,0 +1,74 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import { GENERATOR_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { CLAUDE_MODEL, CLAUDE_MAX_TURNS } from "../shared/config.ts";
+import { log } from "../shared/logger.ts";
+import type { SprintContract, EvalResult } from "../shared/types.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runGenerator(
+  workDir: string,
+  spec: string,
+  contract: SprintContract,
+  clog: ConversationLogger,
+  previousFeedback?: EvalResult,
+): Promise<{ response: string; sessionId?: string }> {
+  const sprint = contract.sprintNumber;
+  const attempt = previousFeedback ? "retry" : "initial";
+  log("GENERATOR", `[Claude/${CLAUDE_MODEL}] Sprint ${sprint} (${attempt}) - Building: ${contract.features.join(", ")}`);
+
+  let prompt = `IMPORTANT: Your working directory is ${workDir}. All code MUST be created inside ${workDir}/app/. Do NOT create files outside of ${workDir}.\n\n## Product Spec\n\n${spec}\n\n## Sprint Contract\n\n${JSON.stringify(contract, null, 2)}`;
+
+  if (previousFeedback) {
+    prompt += `\n\n## Evaluation Feedback (MUST ADDRESS)\n\n${JSON.stringify(previousFeedback, null, 2)}`;
+    prompt += `\n\nThe previous attempt failed evaluation. Address every issue in the feedback above.`;
+  } else {
+    prompt += `\n\nImplement the features listed in this sprint contract. Work in the \`app/\` directory.`;
+  }
+
+  clog.prompt("GENERATOR", CLAUDE_MODEL, prompt, { sprint, attempt: previousFeedback ? 2 : 1 });
+
+  const options: Options = {
+    cwd: workDir,
+    systemPrompt: GENERATOR_SYSTEM_PROMPT,
+    permissionMode: "bypassPermissions",
+    allowDangerouslySkipPermissions: true,
+    tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
+    model: CLAUDE_MODEL,
+    maxTurns: CLAUDE_MAX_TURNS,
+    persistSession: true,
+  };
+
+  let fullResponse = "";
+  let sessionId: string | undefined;
+  const startMs = Date.now();
+
+  for await (const msg of query({ prompt, options })) {
+    if (msg.type === "assistant") {
+      const message = msg as { message: { content: Array<{ type: string; text?: string; name?: string; input?: any }> } };
+      for (const block of message.message.content) {
+        if (block.type === "text" && block.text) {
+          fullResponse += block.text;
+        } else if (block.type === "tool_use" && block.name) {
+          log("GENERATOR", `  Tool: ${block.name}`);
+          clog.toolCall("GENERATOR", CLAUDE_MODEL, block.name, JSON.stringify(block.input ?? {}).slice(0, 500));
+        }
+      }
+    } else if (msg.type === "result") {
+      const result = msg as { session_id?: string };
+      sessionId = result.session_id;
+      log("GENERATOR", `Sprint ${sprint} build complete (session: ${sessionId?.slice(0, 8)}...)`);
+    }
+  }
+
+  const durationMs = Date.now() - startMs;
+  clog.response("GENERATOR", CLAUDE_MODEL, fullResponse || "(tools only, no text output)", {
+    sprint,
+    duration_ms: durationMs,
+  });
+
+  if (!fullResponse) {
+    log("GENERATOR", `Sprint ${sprint} completed (agent used tools only, no text output)`);
+  }
+
+  return { response: fullResponse, sessionId };
+}
diff --git a/gemini-harness/harness.ts b/gemini-harness/harness.ts
new file mode 100644
index 0000000..9bafb4b
--- /dev/null
+++ b/gemini-harness/harness.ts
@@ -0,0 +1,327 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import {
+  CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
+  CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
+} from "../shared/prompts.ts";
+import { CLAUDE_MODEL, GEMINI_MODEL } from "../shared/config.ts";
+import { log, logError, logDivider } from "../shared/logger.ts";
+import { ConversationLogger } from "../shared/conversation-logger.ts";
+import {
+  initWorkspace,
+  writeSpec,
+  readSpec,
+  writeContract,
+  writeFeedback,
+  writeProgress,
+} from "../shared/files.ts";
+import type {
+  HarnessConfig,
+  SprintContract,
+  EvalResult,
+  HarnessProgress,
+  SprintResult,
+  HarnessResult,
+} from "../shared/types.ts";
+import { runPlanner } from "./planner.ts";
+import { runGenerator } from "./generator.ts";
+import { runEvaluator } from "./evaluator.ts";
+
+export async function runHarness(config: HarnessConfig & { logDir?: string }): Promise<HarnessResult> {
+  const startTime = Date.now();
+  const results: SprintResult[] = [];
+  const logDir = config.logDir || "./logs";
+  const clog = new ConversationLogger(logDir);
+  clog.system(`Gemini Harness started — Generator: Claude (${CLAUDE_MODEL}), Evaluator: Gemini (${GEMINI_MODEL})`);
+  log("HARNESS", "ADVERSARIAL DEV - Gemini Harness (Claude Generator + Gemini Evaluator)");
+  log("HARNESS", `Work directory: ${config.workDir}`);
+  log("HARNESS", `Max sprints: ${config.maxSprints} | Max retries: ${config.maxRetriesPerSprint} | Threshold: ${config.passThreshold}/10`);
+
+  await initWorkspace(config.workDir);
+
+  // Phase 1: Planning (Claude)
+  logDivider();
+  log("HARNESS", "PHASE 1: PLANNING (Claude Opus)");
+  logDivider();
+
+  const progress: HarnessProgress = {
+    status: "planning",
+    currentSprint: 0,
+    totalSprints: 0,
+    completedSprints: 0,
+    retryCount: 0,
+  };
+  await writeProgress(config.workDir, progress);
+
+  const plannerResponse = await runPlanner(config.userPrompt, config.workDir, clog);
+
+  let spec: string;
+  try {
+    spec = await readSpec(config.workDir);
+  } catch {
+    log("HARNESS", "Planner returned spec as text, writing to spec.md");
+    await writeSpec(config.workDir, plannerResponse);
+    spec = plannerResponse;
+  }
+
+  // Parse sprint count from spec
+  const sprintNumbers = Array.from(spec.matchAll(/sprint\s+(\d+)/gi))
+    .map((m) => parseInt(m[1]!, 10))
+    .filter((n) => n > 0 && n <= config.maxSprints);
+  const totalSprints = sprintNumbers.length > 0
+    ? Math.min(Math.max(...sprintNumbers), config.maxSprints)
+    : 3;
+
+  progress.totalSprints = totalSprints;
+  log("HARNESS", `Planner produced ${totalSprints} sprints`);
+
+  // Phase 2-4: Sprint Loop
+  for (let sprint = 1; sprint <= totalSprints; sprint++) {
+    logDivider();
+    log("HARNESS", `SPRINT ${sprint}/${totalSprints}`);
+    logDivider();
+
+    // Phase 2: Contract Negotiation (Claude proposes, Claude reviews — same model for contract alignment)
+    progress.status = "negotiating";
+    progress.currentSprint = sprint;
+    progress.retryCount = 0;
+    await writeProgress(config.workDir, progress);
+
+    log("HARNESS", "Negotiating sprint contract...");
+    let contract: SprintContract;
+    let negotiationAttempts = 0;
+    const maxNegotiationAttempts = 2;
+    while (true) {
+      try {
+        contract = await negotiateContract(config.workDir, spec, sprint, clog);
+        break;
+      } catch (e) {
+        negotiationAttempts++;
+        if (negotiationAttempts >= maxNegotiationAttempts) {
+          logError("HARNESS", `Contract negotiation failed after ${negotiationAttempts} attempts: ${e}`);
+          throw e;
+        }
+        log("HARNESS", `Contract negotiation attempt ${negotiationAttempts}/${maxNegotiationAttempts} failed (${e}), retrying...`);
+      }
+    }
+    await writeContract(config.workDir, contract);
+    log("HARNESS", `Contract agreed: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+
+    // Phase 3-4: Build (Claude) -> Evaluate (Gemini) Loop
+    let passed = false;
+    let lastEval: EvalResult | undefined;
+    let attempts = 0;
+
+    for (let retry = 0; retry <= config.maxRetriesPerSprint; retry++) {
+      attempts = retry + 1;
+
+      // Build (Claude Opus)
+      log("HARNESS", `--- BUILD ATTEMPT ${attempts} (Claude Opus) ---`);
+      progress.status = "building";
+      progress.retryCount = retry;
+      await writeProgress(config.workDir, progress);
+
+      await runGenerator(config.workDir, spec, contract, clog, lastEval);
+
+      // Evaluate (Gemini 3.1 Pro)
+      log("HARNESS", `--- EVALUATION (Gemini Evaluator) ---`);
+      progress.status = "evaluating";
+      await writeProgress(config.workDir, progress);
+
+      lastEval = await runEvaluator(config.workDir, contract, config.passThreshold, clog);
+      await writeFeedback(config.workDir, sprint, retry, lastEval);
+
+      if (lastEval.passed) {
+        passed = true;
+        log("HARNESS", `Sprint ${sprint} PASSED on attempt ${attempts}`);
+        break;
+      }
+
+      if (retry < config.maxRetriesPerSprint) {
+        log("HARNESS", `Sprint ${sprint} failed attempt ${attempts}, retrying...`);
+
+        // Check if we should renegotiate criteria
+        if (retry >= 1 && lastEval && lastEval.feedback.length > 0) {
+          const avgScore = lastEval.feedback.reduce((sum, f) => sum + f.score, 0) / lastEval.feedback.length;
+          const allFailing = lastEval.feedback.every(f => f.score < (contract.criteria.find(c => c.name === f.criterion)?.threshold ?? 7));
+
+          // Renegotiate if average score is very low or all criteria are failing
+          if (allFailing || avgScore < 4) {
+            if (allFailing) {
+              log("HARNESS", `All criteria failing (avg score: ${avgScore.toFixed(1)}), renegotiating contract...`);
+            } else {
+              log("HARNESS", `Low average score (${avgScore.toFixed(1)}), renegotiating contract...`);
+            }
+            try {
+              contract = await negotiateContract(config.workDir, spec, sprint, clog);
+              await writeContract(config.workDir, contract);
+              log("HARNESS", `Renegotiated contract: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+            } catch (e) {
+              logError("HARNESS", `Renegotiation failed, continuing with current contract: ${e}`);
+            }
+          }
+        }
+      } else {
+        logError("HARNESS", `Sprint ${sprint} FAILED after ${attempts} attempts`);
+      }
+    }
+
+    results.push({
+      sprintNumber: sprint,
+      passed,
+      attempts,
+      evalResult: lastEval,
+    });
+
+    if (passed) {
+      progress.completedSprints++;
+    } else {
+      progress.status = "failed";
+      await writeProgress(config.workDir, progress);
+      logError("HARNESS", `Harness stopped: sprint ${sprint} could not pass evaluation`);
+      break;
+    }
+  }
+
+  // Final status
+  const allPassed = results.every((r) => r.passed);
+  progress.status = allPassed ? "complete" : "failed";
+  await writeProgress(config.workDir, progress);
+
+  const totalDuration = Date.now() - startTime;
+  logDivider();
+  log("HARNESS", `Harness ${allPassed ? "COMPLETED" : "FAILED"} in ${(totalDuration / 1000 / 60).toFixed(1)} minutes`);
+  log("HARNESS", `Sprints: ${results.filter((r) => r.passed).length}/${results.length} passed`);
+
+  // Save conversation log
+  clog.system(`Harness ${allPassed ? "COMPLETED" : "FAILED"} — ${results.filter(r => r.passed).length}/${results.length} sprints passed in ${(totalDuration / 1000 / 60).toFixed(1)} min`);
+  const { mdPath, jsonlPath } = await clog.save();
+  log("HARNESS", `Conversation log saved: ${mdPath}`);
+  log("HARNESS", `JSONL log saved: ${jsonlPath}`);
+
+  return { success: allPassed, sprints: results, totalDurationMs: totalDuration };
+}
+
+async function negotiateContract(
+  workDir: string,
+  spec: string,
+  sprintNumber: number,
+  clog: ConversationLogger,
+): Promise<SprintContract> {
+  const maxRounds = 3;
+  let round = 0;
+  let proposalText = "";
+  let reviewText = "";
+  let approved = false;
+
+  while (round < maxRounds && !approved) {
+    round++;
+    log("HARNESS", `Contract negotiation round ${round}/${maxRounds}`);
+    clog.system(`Contract negotiation round ${round}/${maxRounds} for sprint ${sprintNumber}`);
+
+    // Generator proposes or counter-proposes
+    let generatorPrompt: string;
+    if (round === 1) {
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
+    } else {
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\n## Evaluator Feedback\n\nThe evaluator reviewed the contract and provided this feedback:\n\n${reviewText}\n\nPlease revise the contract based on this feedback. If the evaluator approved, output "APPROVED". Otherwise, output a revised contract.`;
+    }
+
+    const proposalOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    proposalText = "";
+    for await (const msg of query({ prompt: generatorPrompt, options: proposalOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            proposalText += block.text;
+          }
+        }
+      }
+    }
+
+    clog.response("CONTRACT_GEN", CLAUDE_MODEL, proposalText, { sprint: sprintNumber, round });
+    // Check if generator approved (only in subsequent rounds)
+    if (round > 1 && proposalText.trim() === "APPROVED") {
+      approved = true;
+      log("HARNESS", "Generator accepted evaluator revisions, contract finalized");
+      break;
+    }
+
+    // Evaluator reviews contract
+    const reviewPrompt = `## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
+
+    const reviewOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    reviewText = "";
+    for await (const msg of query({ prompt: reviewPrompt, options: reviewOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            reviewText += block.text;
+          }
+        }
+      }
+    }
+
+    clog.response("CONTRACT_EVAL", CLAUDE_MODEL, reviewText, { sprint: sprintNumber, round });
+
+    // Check if evaluator approved
+    if (reviewText.trim().toUpperCase().startsWith("APPROVED")) {
+      approved = true;
+      log("HARNESS", `Contract approved by evaluator in round ${round}`);
+      break;
+    }
+
+    if (round >= maxRounds) {
+      log("HARNESS", `Max negotiation rounds (${maxRounds}) reached, using evaluator version`);
+    }
+  }
+
+  const contractSource = reviewText.trim().toUpperCase().startsWith("APPROVED") ? proposalText : reviewText;
+  return parseContract(contractSource, sprintNumber);
+}
+
+function parseContract(text: string, sprintNumber: number): SprintContract {
+  const candidates: string[] = [];
+  const codeBlocks = [...text.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+  for (const match of codeBlocks.reverse()) {
+    if (match[1]) candidates.push(match[1].trim());
+  }
+  const braceMatch = text.match(/\{[\s\S]*"criteria"[\s\S]*\}/);
+  if (braceMatch) candidates.push(braceMatch[0]);
+  candidates.push(text.trim());
+
+  for (const candidate of candidates) {
+    try {
+      const parsed = JSON.parse(candidate) as SprintContract;
+      if (parsed.criteria && Array.isArray(parsed.criteria)) {
+        parsed.sprintNumber = sprintNumber;
+        return parsed;
+      }
+    } catch {
+      // Try next candidate
+    }
+  }
+
+  throw new Error(`Contract negotiation produced unparseable output. Raw text: ${text.slice(0, 200)}`);
+}
diff --git a/gemini-harness/index.ts b/gemini-harness/index.ts
new file mode 100644
index 0000000..9dd9e1b
--- /dev/null
+++ b/gemini-harness/index.ts
@@ -0,0 +1,63 @@
+import { resolve } from "path";
+import { readFile } from "fs/promises";
+import { runHarness } from "./harness.ts";
+import { DEFAULT_CONFIG } from "../shared/config.ts";
+import { log, logError, logDivider } from "../shared/logger.ts";
+import type { HarnessConfig } from "../shared/types.ts";
+
+let userPrompt: string | undefined;
+
+const arg = process.argv[2];
+if (arg === "--file" || arg === "-f") {
+  const filePath = process.argv[3];
+  if (!filePath) {
+    console.error("Error: --file requires a path argument");
+    process.exit(1);
+  }
+  userPrompt = await readFile(resolve(filePath), "utf-8");
+} else {
+  userPrompt = arg;
+}
+
+if (!userPrompt) {
+  console.error("Usage: bun run gemini-harness/index.ts <prompt>");
+  console.error('       bun run gemini-harness/index.ts --file <path-to-prompt.md>');
+  console.error('Example: bun run gemini-harness/index.ts "Build a task manager with REST API and dashboard"');
+  process.exit(1);
+}
+
+const config: HarnessConfig & { logDir: string } = {
+  ...DEFAULT_CONFIG,
+  userPrompt,
+  workDir: resolve("workspace/gemini"),
+  logDir: resolve(process.env.HARNESS_LOG_DIR || "./logs"),
+};
+
+logDivider();
+log("HARNESS", "ADVERSARIAL DEV - Gemini Harness (Claude Opus Generator + Gemini 3.1 Pro Evaluator)");
+log("HARNESS", `Prompt: "${userPrompt}"`);
+logDivider();
+
+try {
+  const result = await runHarness(config);
+
+  logDivider();
+  if (result.success) {
+    log("HARNESS", "All sprints completed successfully!");
+  } else {
+    logError("HARNESS", "Harness completed with failures.");
+  }
+
+  log("HARNESS", `Total time: ${(result.totalDurationMs / 1000 / 60).toFixed(1)} minutes`);
+  log("HARNESS", `Sprints passed: ${result.sprints.filter((s) => s.passed).length}/${result.sprints.length}`);
+
+  for (const sprint of result.sprints) {
+    const status = sprint.passed ? "\x1b[32mPASS\x1b[0m" : "\x1b[31mFAIL\x1b[0m";
+    log("HARNESS", `  Sprint ${sprint.sprintNumber}: [${status}] (${sprint.attempts} attempts)`);
+  }
+
+  process.exit(result.success ? 0 : 1);
+} catch (error) {
+  logError("HARNESS", `Fatal error: ${error instanceof Error ? error.message : String(error)}`);
+  process.exit(1);
+}
diff --git a/gemini-harness/planner.ts b/gemini-harness/planner.ts
new file mode 100644
index 0000000..f561569
--- /dev/null
+++ b/gemini-harness/planner.ts
@@ -0,0 +1,69 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import { readFile } from "fs/promises";
+import { join } from "path";
+import { PLANNER_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { CLAUDE_MODEL, CLAUDE_MAX_TURNS } from "../shared/config.ts";
+import { log, logError } from "../shared/logger.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runPlanner(userPrompt: string, workDir: string, clog: ConversationLogger): Promise<string> {
+  log("PLANNER", `[Claude/${CLAUDE_MODEL}] Starting planning for: "${userPrompt.slice(0, 80)}..."`);
+
+  const fullPrompt = `IMPORTANT: Your working directory is ${workDir}. All files you create (including spec.md) MUST be written inside this directory. Do NOT write files anywhere else.\n\n${userPrompt}`;
+
+  clog.prompt("PLANNER", CLAUDE_MODEL, fullPrompt);
+
+  const options: Options = {
+    cwd: workDir,
+    systemPrompt: PLANNER_SYSTEM_PROMPT,
+    permissionMode: "bypassPermissions",
+    allowDangerouslySkipPermissions: true,
+    tools: ["Read", "Write"],
+    model: CLAUDE_MODEL,
+    maxTurns: CLAUDE_MAX_TURNS,
+    persistSession: false,
+  };
+
+  let fullResponse = "";
+  let completed = false;
+  const startMs = Date.now();
+
+  for await (const msg of query({ prompt: fullPrompt, options })) {
+    if (msg.type === "assistant") {
+      const message = msg as { message: { content: Array<{ type: string; text?: string; name?: string; input?: any }> } };
+      for (const block of message.message.content) {
+        if (block.type === "text" && block.text) {
+          fullResponse += block.text;
+        } else if (block.type === "tool_use" && block.name) {
+          clog.toolCall("PLANNER", CLAUDE_MODEL, block.name, JSON.stringify(block.input ?? {}).slice(0, 500));
+        }
+      }
+    } else if (msg.type === "result") {
+      completed = true;
+      log("PLANNER", "Planning complete");
+    }
+  }
+
+  if (!completed) {
+    clog.error("PLANNER", "Planner query did not complete");
+    logError("PLANNER", "Planner query did not complete");
+    throw new Error("Planner failed to produce output");
+  }
+
+  if (!fullResponse) {
+    try {
+      fullResponse = await readFile(join(workDir, "spec.md"), "utf-8");
+      log("PLANNER", "Read spec from file written by planner agent");
+    } catch {
+      clog.error("PLANNER", "No text response and no spec.md on disk");
+      logError("PLANNER", "No text response and no spec.md on disk");
+      throw new Error("Planner completed but produced no spec");
+    }
+  }
+
+  const durationMs = Date.now() - startMs;
+  clog.response("PLANNER", CLAUDE_MODEL, fullResponse, { duration_ms: durationMs });
+
+  log("PLANNER", "Product specification generated");
+  return fullResponse;
+}
diff --git a/mixed-harness/evaluator.ts b/mixed-harness/evaluator.ts
new file mode 100644
index 0000000..e9c3d5b
--- /dev/null
+++ b/mixed-harness/evaluator.ts
@@ -0,0 +1,113 @@
+import { Codex } from "@openai/codex-sdk";
+import { EVALUATOR_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { CODEX_MODEL, CODEX_NETWORK_ACCESS } from "../shared/config.ts";
+import { log, logError } from "../shared/logger.ts";
+import type { SprintContract, EvalResult } from "../shared/types.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runEvaluator(
+  workDir: string,
+  contract: SprintContract,
+  passThreshold: number,
+  clog: ConversationLogger,
+): Promise<EvalResult> {
+  const sprint = contract.sprintNumber;
+  log("EVALUATOR", `[Codex/${CODEX_MODEL}] Evaluating sprint ${sprint} against ${contract.criteria.length} criteria`);
+
+  const taskPrompt = `## Sprint Contract to Evaluate Against
+
+${JSON.stringify(contract, null, 2)}
+
+## Pass Threshold
+
+Each criterion must score at least ${passThreshold}/10 to pass.
+
+## Instructions
+
+Examine the application in the \`app/\` directory. Read the code, run it if possible, and score each criterion. Output ONLY the JSON evaluation object.`;
+
+  const fullPrompt = `${EVALUATOR_SYSTEM_PROMPT}\n\n---\n\n${taskPrompt}`;
+
+  clog.prompt("EVALUATOR", CODEX_MODEL, taskPrompt, { sprint });
+
+  const startMs = Date.now();
+  const codex = new Codex();
+  const thread = codex.startThread({
+    workingDirectory: workDir,
+    sandboxMode: "danger-full-access",
+    networkAccessEnabled: CODEX_NETWORK_ACCESS,
+    approvalPolicy: "never",
+    model: CODEX_MODEL,
+  });
+
+  const turn = await thread.run(fullPrompt);
+  const response = turn.finalResponse ?? "";
+  const durationMs = Date.now() - startMs;
+
+  log("EVALUATOR", `Evaluation complete for sprint ${sprint}`);
+
+  const evalResult = parseEvalResult(response, contract, passThreshold);
+
+  // Build scores map for logging
+  const scoresMap: Record<string, number> = {};
+  for (const f of evalResult.feedback) {
+    scoresMap[f.criterion] = f.score;
+  }
+
+  clog.response("EVALUATOR", CODEX_MODEL, response, {
+    sprint,
+    duration_ms: durationMs,
+    scores: scoresMap,
+  });
+
+  const passedCount = evalResult.feedback.filter((f) => f.score >= passThreshold).length;
+  const totalCount = evalResult.feedback.length;
+  const verdict = evalResult.passed ? "PASSED" : "FAILED";
+  log("EVALUATOR", `Sprint ${sprint}: ${verdict} (${passedCount}/${totalCount} criteria passed)`);
+
+  for (const item of evalResult.feedback) {
+    const status = item.score >= passThreshold ? "\x1b[32mPASS\x1b[0m" : "\x1b[31mFAIL\x1b[0m";
+    log("EVALUATOR", `  [${status}] ${item.criterion}: ${item.score}/10 - ${item.details.slice(0, 100)}`);
+  }
+
+  return evalResult;
+}
+
+function parseEvalResult(
+  response: string,
+  contract: SprintContract,
+  passThreshold: number,
+): EvalResult {
+  const candidates: string[] = [];
+  const codeBlocks = [...response.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+  for (const match of codeBlocks.reverse()) {
+    if (match[1]) candidates.push(match[1].trim());
+  }
+  const braceMatch = response.match(/\{[\s\S]*"passed"[\s\S]*"feedback"[\s\S]*\}/);
+  if (braceMatch) candidates.push(braceMatch[0]);
+  candidates.push(response.trim());
+
+  for (const candidate of candidates) {
+    try {
+      const parsed = JSON.parse(candidate) as EvalResult;
+      if (parsed.feedback && Array.isArray(parsed.feedback) && parsed.feedback.length > 0) {
+        parsed.passed = parsed.feedback.every((f) => f.score >= passThreshold);
+        return parsed;
+      }
+    } catch {
+      // Try next candidate
+    }
+  }
+
+  logError("EVALUATOR", "Failed to parse evaluation JSON from any extraction strategy");
+  return {
+    passed: false,
+    scores: {},
+    feedback: contract.criteria.map((c) => ({
+      criterion: c.name,
+      score: 0,
+      details: "Evaluator failed to produce parseable output",
+    })),
+    overallSummary: "Evaluation parsing failed. Raw response: " + response.slice(0, 500),
+  };
+}
diff --git a/mixed-harness/generator.ts b/mixed-harness/generator.ts
new file mode 100644
index 0000000..5b1a5ba
--- /dev/null
+++ b/mixed-harness/generator.ts
@@ -0,0 +1,74 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import { GENERATOR_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { CLAUDE_MODEL, CLAUDE_MAX_TURNS } from "../shared/config.ts";
+import { log } from "../shared/logger.ts";
+import type { SprintContract, EvalResult } from "../shared/types.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runGenerator(
+  workDir: string,
+  spec: string,
+  contract: SprintContract,
+  clog: ConversationLogger,
+  previousFeedback?: EvalResult,
+): Promise<{ response: string; sessionId?: string }> {
+  const sprint = contract.sprintNumber;
+  const attempt = previousFeedback ? "retry" : "initial";
+  log("GENERATOR", `[Claude/${CLAUDE_MODEL}] Sprint ${sprint} (${attempt}) - Building: ${contract.features.join(", ")}`);
+
+  let prompt = `IMPORTANT: Your working directory is ${workDir}. All code MUST be created inside ${workDir}/app/. Do NOT create files outside of ${workDir}.\n\n## Product Spec\n\n${spec}\n\n## Sprint Contract\n\n${JSON.stringify(contract, null, 2)}`;
+
+  if (previousFeedback) {
+    prompt += `\n\n## Evaluation Feedback (MUST ADDRESS)\n\n${JSON.stringify(previousFeedback, null, 2)}`;
+    prompt += `\n\nThe previous attempt failed evaluation. Address every issue in the feedback above.`;
+  } else {
+    prompt += `\n\nImplement the features listed in this sprint contract. Work in the \`app/\` directory.`;
+  }
+
+  clog.prompt("GENERATOR", CLAUDE_MODEL, prompt, { sprint, attempt: previousFeedback ? 2 : 1 });
+
+  const options: Options = {
+    cwd: workDir,
+    systemPrompt: GENERATOR_SYSTEM_PROMPT,
+    permissionMode: "bypassPermissions",
+    allowDangerouslySkipPermissions: true,
+    tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
+    model: CLAUDE_MODEL,
+    maxTurns: CLAUDE_MAX_TURNS,
+    persistSession: true,
+  };
+
+  let fullResponse = "";
+  let sessionId: string | undefined;
+  const startMs = Date.now();
+
+  for await (const msg of query({ prompt, options })) {
+    if (msg.type === "assistant") {
+      const message = msg as { message: { content: Array<{ type: string; text?: string; name?: string; input?: any }> } };
+      for (const block of message.message.content) {
+        if (block.type === "text" && block.text) {
+          fullResponse += block.text;
+        } else if (block.type === "tool_use" && block.name) {
+          log("GENERATOR", `  Tool: ${block.name}`);
+          clog.toolCall("GENERATOR", CLAUDE_MODEL, block.name, JSON.stringify(block.input ?? {}).slice(0, 500));
+        }
+      }
+    } else if (msg.type === "result") {
+      const result = msg as { session_id?: string };
+      sessionId = result.session_id;
+      log("GENERATOR", `Sprint ${sprint} build complete (session: ${sessionId ? `${sessionId.slice(0, 8)}...` : "unknown"})`);
+    }
+  }
+
+  const durationMs = Date.now() - startMs;
+  clog.response("GENERATOR", CLAUDE_MODEL, fullResponse || "(tools only, no text output)", {
+    sprint,
+    duration_ms: durationMs,
+  });
+
+  if (!fullResponse) {
+    log("GENERATOR", `Sprint ${sprint} completed (agent used tools only, no text output)`);
+  }
+
+  return { response: fullResponse, sessionId };
+}
diff --git a/mixed-harness/harness.ts b/mixed-harness/harness.ts
new file mode 100644
index 0000000..bab0e2d
--- /dev/null
+++ b/mixed-harness/harness.ts
@@ -0,0 +1,327 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import {
+  CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
+  CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
+} from "../shared/prompts.ts";
+import { CLAUDE_MODEL } from "../shared/config.ts";
+import { log, logError, logDivider } from "../shared/logger.ts";
+import { ConversationLogger } from "../shared/conversation-logger.ts";
+import {
+  initWorkspace,
+  writeSpec,
+  readSpec,
+  writeContract,
+  writeFeedback,
+  writeProgress,
+} from "../shared/files.ts";
+import type {
+  HarnessConfig,
+  SprintContract,
+  EvalResult,
+  HarnessProgress,
+  SprintResult,
+  HarnessResult,
+} from "../shared/types.ts";
+import { runPlanner } from "./planner.ts";
+import { runGenerator } from "./generator.ts";
+import { runEvaluator } from "./evaluator.ts";
+
+export async function runHarness(config: HarnessConfig & { logDir?: string }): Promise<HarnessResult> {
+  const startTime = Date.now();
+  const results: SprintResult[] = [];
+  const logDir = config.logDir || "./logs";
+  const clog = new ConversationLogger(logDir);
+  clog.system(`Mixed Harness started — Generator: Claude (${CLAUDE_MODEL}), Evaluator: Codex (gpt-5.4)`);
+  log("HARNESS", "ADVERSARIAL DEV - Mixed Harness (Claude Generator + Codex Evaluator)");
+  log("HARNESS", `Work directory: ${config.workDir}`);
+  log("HARNESS", `Max sprints: ${config.maxSprints} | Max retries: ${config.maxRetriesPerSprint} | Threshold: ${config.passThreshold}/10`);
+
+  await initWorkspace(config.workDir);
+
+  // Phase 1: Planning (Claude)
+  logDivider();
+  log("HARNESS", `PHASE 1: PLANNING (Claude/${CLAUDE_MODEL})`);
+  logDivider();
+
+  const progress: HarnessProgress = {
+    status: "planning",
+    currentSprint: 0,
+    totalSprints: 0,
+    completedSprints: 0,
+    retryCount: 0,
+  };
+  await writeProgress(config.workDir, progress);
+
+  const plannerResponse = await runPlanner(config.userPrompt, config.workDir, clog);
+
+  let spec: string;
+  try {
+    spec = await readSpec(config.workDir);
+  } catch {
+    log("HARNESS", "Planner returned spec as text, writing to spec.md");
+    await writeSpec(config.workDir, plannerResponse);
+    spec = plannerResponse;
+  }
+
+  // Parse sprint count from spec
+  const sprintNumbers = Array.from(spec.matchAll(/sprint\s+(\d+)/gi))
+    .map((m) => parseInt(m[1]!, 10))
+    .filter((n) => n > 0 && n <= config.maxSprints);
+  const totalSprints = sprintNumbers.length > 0
+    ? Math.min(Math.max(...sprintNumbers), config.maxSprints)
+    : 3;
+
+  progress.totalSprints = totalSprints;
+  log("HARNESS", `Planner produced ${totalSprints} sprints`);
+
+  // Phase 2-4: Sprint Loop
+  for (let sprint = 1; sprint <= totalSprints; sprint++) {
+    logDivider();
+    log("HARNESS", `SPRINT ${sprint}/${totalSprints}`);
+    logDivider();
+
+    // Phase 2: Contract Negotiation (Claude proposes, Claude reviews — same model for contract alignment)
+    progress.status = "negotiating";
+    progress.currentSprint = sprint;
+    progress.retryCount = 0;
+    await writeProgress(config.workDir, progress);
+
+    log("HARNESS", "Negotiating sprint contract...");
+    let contract: SprintContract;
+    let negotiationAttempts = 0;
+    const maxNegotiationAttempts = 2;
+    while (true) {
+      try {
+        contract = await negotiateContract(config.workDir, spec, sprint, clog);
+        break;
+      } catch (e) {
+        negotiationAttempts++;
+        if (negotiationAttempts >= maxNegotiationAttempts) {
+          logError("HARNESS", `Contract negotiation failed after ${negotiationAttempts} attempts: ${e}`);
+          throw e;
+        }
+        log("HARNESS", `Contract negotiation produced invalid output, retrying (${negotiationAttempts}/${maxNegotiationAttempts})...`);
+      }
+    }
+    await writeContract(config.workDir, contract);
+    log("HARNESS", `Contract agreed: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+
+    // Phase 3-4: Build (Claude) → Evaluate (Codex) Loop
+    let passed = false;
+    let lastEval: EvalResult | undefined;
+    let attempts = 0;
+
+    for (let retry = 0; retry <= config.maxRetriesPerSprint; retry++) {
+      attempts = retry + 1;
+
+      // Build (Claude)
+      log("HARNESS", `--- BUILD ATTEMPT ${attempts} (Claude/${CLAUDE_MODEL}) ---`);
+      progress.status = "building";
+      progress.retryCount = retry;
+      await writeProgress(config.workDir, progress);
+
+      await runGenerator(config.workDir, spec, contract, clog, lastEval);
+
+      // Evaluate (Codex GPT-5.4)
+      log("HARNESS", `--- EVALUATION (Codex GPT-5.4) ---`);
+      progress.status = "evaluating";
+      await writeProgress(config.workDir, progress);
+
+      lastEval = await runEvaluator(config.workDir, contract, config.passThreshold, clog);
+      await writeFeedback(config.workDir, sprint, retry, lastEval);
+
+      if (lastEval.passed) {
+        passed = true;
+        log("HARNESS", `Sprint ${sprint} PASSED on attempt ${attempts}`);
+        break;
+      }
+
+      if (retry < config.maxRetriesPerSprint) {
+        log("HARNESS", `Sprint ${sprint} failed attempt ${attempts}, retrying...`);
+
+        // Check if we should renegotiate criteria
+        if (retry >= 1 && lastEval && lastEval.feedback.length > 0) {
+          const avgScore = lastEval.feedback.reduce((sum, f) => sum + f.score, 0) / lastEval.feedback.length;
+          const allFailing = lastEval.feedback.every(f => f.score < (contract.criteria.find(c => c.name === f.criterion)?.threshold ?? config.passThreshold));
+
+          // Renegotiate if average score is very low or all criteria are failing
+          if (allFailing || avgScore < 4) {
+            if (allFailing) {
+              log("HARNESS", `All criteria failing (avg score: ${avgScore.toFixed(1)}), renegotiating contract...`);
+            } else {
+              log("HARNESS", `Low average score (${avgScore.toFixed(1)}), renegotiating contract...`);
+            }
+            try {
+              contract = await negotiateContract(config.workDir, spec, sprint, clog);
+              await writeContract(config.workDir, contract);
+              log("HARNESS", `Renegotiated contract: ${contract.criteria.length} criteria for ${contract.features.length} features`);
+            } catch (e) {
+              logError("HARNESS", `Renegotiation failed, continuing with current contract: ${e}`);
+            }
+          }
+        }
+      } else {
+        logError("HARNESS", `Sprint ${sprint} FAILED after ${attempts} attempts`);
+      }
+    }
+
+    results.push({
+      sprintNumber: sprint,
+      passed,
+      attempts,
+      evalResult: lastEval,
+    });
+
+    if (passed) {
+      progress.completedSprints++;
+    } else {
+      progress.status = "failed";
+      await writeProgress(config.workDir, progress);
+      logError("HARNESS", `Harness stopped: sprint ${sprint} could not pass evaluation`);
+      break;
+    }
+  }
+
+  // Final status
+  const allPassed = results.every((r) => r.passed);
+  progress.status = allPassed ? "complete" : "failed";
+  await writeProgress(config.workDir, progress);
+
+  const totalDuration = Date.now() - startTime;
+  logDivider();
+  log("HARNESS", `Harness ${allPassed ? "COMPLETED" : "FAILED"} in ${(totalDuration / 1000 / 60).toFixed(1)} minutes`);
+  log("HARNESS", `Sprints: ${results.filter((r) => r.passed).length}/${results.length} passed`);
+
+  // Save conversation log
+  clog.system(`Harness ${allPassed ? "COMPLETED" : "FAILED"} — ${results.filter(r => r.passed).length}/${results.length} sprints passed in ${(totalDuration / 1000 / 60).toFixed(1)} min`);
+  const { mdPath, jsonlPath } = await clog.save();
+  log("HARNESS", `Conversation log saved: ${mdPath}`);
+  log("HARNESS", `JSONL log saved: ${jsonlPath}`);
+
+  return { success: allPassed, sprints: results, totalDurationMs: totalDuration };
+}
+
+async function negotiateContract(
+  workDir: string,
+  spec: string,
+  sprintNumber: number,
+  clog: ConversationLogger,
+): Promise<SprintContract> {
+  const maxRounds = 3;
+  let round = 0;
+  let proposalText = "";
+  let reviewText = "";
+  let approved = false;
+
+  while (round < maxRounds && !approved) {
+    round++;
+    log("HARNESS", `Contract negotiation round ${round}/${maxRounds}`);
+    clog.system(`Contract negotiation round ${round}/${maxRounds} for sprint ${sprintNumber}`);
+
+    // Generator proposes or counter-proposes
+    let generatorPrompt: string;
+    if (round === 1) {
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\nPropose a sprint contract for this sprint.`;
+    } else {
+      generatorPrompt = `## Product Spec\n\n${spec}\n\n## Sprint Number: ${sprintNumber}\n\n## Evaluator Feedback\n\nThe evaluator reviewed the contract and provided this feedback:\n\n${reviewText}\n\nPlease revise the contract based on this feedback. If the evaluator approved, output "APPROVED". Otherwise, output a revised contract.`;
+    }
+
+    const proposalOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_GENERATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    proposalText = "";
+    for await (const msg of query({ prompt: generatorPrompt, options: proposalOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            proposalText += block.text;
+          }
+        }
+      }
+    }
+
+    clog.response("CONTRACT_GEN", CLAUDE_MODEL, proposalText, { sprint: sprintNumber, round });
+    // Check if generator approved (only in subsequent rounds)
+    if (round > 1 && proposalText.trim() === "APPROVED") {
+      approved = true;
+      log("HARNESS", "Generator accepted evaluator revisions, contract finalized");
+      break;
+    }
+
+    // Evaluator reviews contract
+    const reviewPrompt = `## Proposed Sprint Contract\n\n${proposalText}\n\nReview this contract.`;
+
+    const reviewOptions: Options = {
+      cwd: workDir,
+      systemPrompt: CONTRACT_NEGOTIATION_EVALUATOR_PROMPT,
+      permissionMode: "bypassPermissions",
+      allowDangerouslySkipPermissions: true,
+      tools: ["Read"],
+      model: CLAUDE_MODEL,
+      maxTurns: 10,
+      persistSession: false,
+    };
+
+    reviewText = "";
+    for await (const msg of query({ prompt: reviewPrompt, options: reviewOptions })) {
+      if (msg.type === "assistant") {
+        const message = msg as { message: { content: Array<{ type: string; text?: string }> } };
+        for (const block of message.message.content) {
+          if (block.type === "text" && block.text) {
+            reviewText += block.text;
+          }
+        }
+      }
+    }
+
+    clog.response("CONTRACT_EVAL", CLAUDE_MODEL, reviewText, { sprint: sprintNumber, round });
+
+    // Check if evaluator approved
+    if (reviewText.trim().toUpperCase().startsWith("APPROVED")) {
+      approved = true;
+      log("HARNESS", `Contract approved by evaluator in round ${round}`);
+      break;
+    }
+
+    if (round >= maxRounds) {
+      log("HARNESS", `Max negotiation rounds (${maxRounds}) reached, using evaluator version`);
+    }
+  }
+
+  const contractSource = reviewText.trim().toUpperCase().startsWith("APPROVED") ? proposalText : reviewText;
+  return parseContract(contractSource, sprintNumber);
+}
+
+function parseContract(text: string, sprintNumber: number): SprintContract {
+  const candidates: string[] = [];
+  const codeBlocks = [...text.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+  for (const match of codeBlocks.reverse()) {
+    if (match[1]) candidates.push(match[1].trim());
+  }
+  const braceMatch = text.match(/\{[\s\S]*"criteria"[\s\S]*\}/);
+  if (braceMatch) candidates.push(braceMatch[0]);
+  candidates.push(text.trim());
+
+  for (const candidate of candidates) {
+    try {
+      const parsed = JSON.parse(candidate) as SprintContract;
+      if (parsed.criteria && Array.isArray(parsed.criteria)) {
+        parsed.sprintNumber = sprintNumber;
+        return parsed;
+      }
+    } catch {
+      // Try next candidate
+    }
+  }
+
+  throw new Error(`Contract negotiation produced unparseable output. Raw text: ${text.slice(0, 200)}`);
+}
diff --git a/mixed-harness/index.ts b/mixed-harness/index.ts
new file mode 100644
index 0000000..814c328
--- /dev/null
+++ b/mixed-harness/index.ts
@@ -0,0 +1,63 @@
+import { resolve } from "path";
+import { readFile } from "fs/promises";
+import { runHarness } from "./harness.ts";
+import { DEFAULT_CONFIG } from "../shared/config.ts";
+import { log, logError, logDivider } from "../shared/logger.ts";
+import type { HarnessConfig } from "../shared/types.ts";
+
+let userPrompt: string | undefined;
+
+const arg = process.argv[2];
+if (arg === "--file" || arg === "-f") {
+  const filePath = process.argv[3];
+  if (!filePath) {
+    console.error("Error: --file requires a path argument");
+    process.exit(1);
+  }
+  userPrompt = await readFile(resolve(filePath), "utf-8");
+} else {
+  userPrompt = arg;
+}
+
+if (!userPrompt) {
+  console.error("Usage: bun run mixed-harness/index.ts <prompt>");
+  console.error('       bun run mixed-harness/index.ts --file <path-to-prompt.md>');
+  console.error('Example: bun run mixed-harness/index.ts "Build a task manager with REST API and dashboard"');
+  process.exit(1);
+}
+
+const config = {
+  ...DEFAULT_CONFIG,
+  userPrompt,
+  workDir: resolve("workspace/mixed"),
+  logDir: resolve(process.env.HARNESS_LOG_DIR || "./logs"),
+};
+
+logDivider();
+log("HARNESS", "ADVERSARIAL DEV - Mixed Harness (Claude Opus Generator + Codex GPT-5.4 Evaluator)");
+log("HARNESS", `Prompt: "${userPrompt}"`);
+logDivider();
+
+try {
+  const result = await runHarness(config);
+
+  logDivider();
+  if (result.success) {
+    log("HARNESS", "All sprints completed successfully!");
+  } else {
+    logError("HARNESS", "Harness completed with failures.");
+  }
+
+  log("HARNESS", `Total time: ${(result.totalDurationMs / 1000 / 60).toFixed(1)} minutes`);
+  log("HARNESS", `Sprints passed: ${result.sprints.filter((s) => s.passed).length}/${result.sprints.length}`);
+
+  for (const sprint of result.sprints) {
+    const status = sprint.passed ? "\x1b[32mPASS\x1b[0m" : "\x1b[31mFAIL\x1b[0m";
+    log("HARNESS", `  Sprint ${sprint.sprintNumber}: [${status}] (${sprint.attempts} attempts)`);
+  }
+
+  process.exit(result.success ? 0 : 1);
+} catch (error) {
+  logError("HARNESS", `Fatal error: ${error instanceof Error ? error.message : String(error)}`);
+  process.exit(1);
+}
diff --git a/mixed-harness/planner.ts b/mixed-harness/planner.ts
new file mode 100644
index 0000000..f561569
--- /dev/null
+++ b/mixed-harness/planner.ts
@@ -0,0 +1,69 @@
+import { query, type Options } from "@anthropic-ai/claude-agent-sdk";
+import { readFile } from "fs/promises";
+import { join } from "path";
+import { PLANNER_SYSTEM_PROMPT } from "../shared/prompts.ts";
+import { CLAUDE_MODEL, CLAUDE_MAX_TURNS } from "../shared/config.ts";
+import { log, logError } from "../shared/logger.ts";
+import type { ConversationLogger } from "../shared/conversation-logger.ts";
+
+export async function runPlanner(userPrompt: string, workDir: string, clog: ConversationLogger): Promise<string> {
+  log("PLANNER", `[Claude/${CLAUDE_MODEL}] Starting planning for: "${userPrompt.slice(0, 80)}..."`);
+
+  const fullPrompt = `IMPORTANT: Your working directory is ${workDir}. All files you create (including spec.md) MUST be written inside this directory. Do NOT write files anywhere else.\n\n${userPrompt}`;
+
+  clog.prompt("PLANNER", CLAUDE_MODEL, fullPrompt);
+
+  const options: Options = {
+    cwd: workDir,
+    systemPrompt: PLANNER_SYSTEM_PROMPT,
+    permissionMode: "bypassPermissions",
+    allowDangerouslySkipPermissions: true,
+    tools: ["Read", "Write"],
+    model: CLAUDE_MODEL,
+    maxTurns: CLAUDE_MAX_TURNS,
+    persistSession: false,
+  };
+
+  let fullResponse = "";
+  let completed = false;
+  const startMs = Date.now();
+
+  for await (const msg of query({ prompt: fullPrompt, options })) {
+    if (msg.type === "assistant") {
+      const message = msg as { message: { content: Array<{ type: string; text?: string; name?: string; input?: unknown }> } };
+      for (const block of message.message.content) {
+        if (block.type === "text" && block.text) {
+          fullResponse += block.text;
+        } else if (block.type === "tool_use" && block.name) {
+          clog.toolCall("PLANNER", CLAUDE_MODEL, block.name, JSON.stringify(block.input ?? {}).slice(0, 500));
+        }
+      }
+    } else if (msg.type === "result") {
+      completed = true;
+      log("PLANNER", "Planning complete");
+    }
+  }
+
+  if (!completed) {
+    clog.error("PLANNER", "Planner query did not complete");
+    logError("PLANNER", "Planner query did not complete");
+    throw new Error("Planner failed to produce output");
+  }
+
+  if (!fullResponse) {
+    try {
+      fullResponse = await readFile(join(workDir, "spec.md"), "utf-8");
+      log("PLANNER", "Read spec from file written by planner agent");
+    } catch {
+      clog.error("PLANNER", "No text response and no spec.md on disk");
+      logError("PLANNER", "No text response and no spec.md on disk");
+      throw new Error("Planner completed but produced no spec");
+    }
+  }
+
+  const durationMs = Date.now() - startMs;
+  clog.response("PLANNER", CLAUDE_MODEL, fullResponse, { duration_ms: durationMs });
+
+  log("PLANNER", "Product specification generated");
+  return fullResponse;
+}
diff --git a/package.json b/package.json
index 260e36a..0b23bee 100644
--- a/package.json
+++ b/package.json
@@ -11,6 +11,7 @@
   },
   "dependencies": {
     "@anthropic-ai/claude-agent-sdk": "^0.2.85",
+    "@google/genai": "^1.48.0",
     "@openai/codex-sdk": "^0.117.0"
   }
 }
diff --git a/shared/config.ts b/shared/config.ts
index 821c963..f3c1e0c 100644
--- a/shared/config.ts
+++ b/shared/config.ts
@@ -6,8 +6,11 @@ export const DEFAULT_CONFIG: Omit<HarnessConfig, "userPrompt" | "workDir"> = {
   passThreshold: 7,
 };
 
-export const CLAUDE_MODEL = "claude-sonnet-4-6";
+export const CLAUDE_MODEL = "claude-opus-4-6";
 export const CODEX_MODEL = "gpt-5.4";
 
 export const CLAUDE_MAX_TURNS = 50;
 export const CODEX_NETWORK_ACCESS = true;
+
+export const GEMINI_MODEL = "gemini-3.1-pro-preview";
+export const GEMINI_API_KEY = process.env.GEMINI_API_KEY ?? "";
diff --git a/shared/conversation-logger.ts b/shared/conversation-logger.ts
new file mode 100644
index 0000000..639f287
--- /dev/null
+++ b/shared/conversation-logger.ts
@@ -0,0 +1,188 @@
+import { writeFile, mkdir } from "fs/promises";
+import { join } from "path";
+
+type AgentRole = "PLANNER" | "GENERATOR" | "EVALUATOR" | "HARNESS" | "CONTRACT_GEN" | "CONTRACT_EVAL";
+type MessageType = "prompt" | "response" | "tool_call" | "tool_result" | "system" | "error";
+
+interface ConversationEntry {
+  timestamp: string;
+  role: AgentRole;
+  model: string;
+  type: MessageType;
+  content: string;
+  metadata?: {
+    sprint?: number;
+    attempt?: number;
+    round?: number;
+    toolName?: string;
+    duration_ms?: number;
+    scores?: Record<string, number>;
+  };
+}
+
+class ConversationLogger {
+  private entries: ConversationEntry[] = [];
+  private logDir: string;
+  private sessionId: string;
+  private startTime: number;
+
+  constructor(logDir: string) {
+    this.logDir = logDir;
+    this.sessionId = new Date().toISOString().replace(/[:.]/g, "-").slice(0, 19);
+    this.startTime = Date.now();
+  }
+
+  log(role: AgentRole, model: string, type: MessageType, content: string, metadata?: ConversationEntry["metadata"]) {
+    this.entries.push({
+      timestamp: new Date().toISOString(),
+      role,
+      model,
+      type,
+      content,
+      metadata,
+    });
+  }
+
+  prompt(role: AgentRole, model: string, content: string, metadata?: ConversationEntry["metadata"]) {
+    this.log(role, model, "prompt", content, metadata);
+  }
+
+  response(role: AgentRole, model: string, content: string, metadata?: ConversationEntry["metadata"]) {
+    this.log(role, model, "response", content, metadata);
+  }
+
+  toolCall(role: AgentRole, model: string, toolName: string, input: string) {
+    this.log(role, model, "tool_call", input, { toolName });
+  }
+
+  toolResult(role: AgentRole, model: string, toolName: string, output: string) {
+    this.log(role, model, "tool_result", output, { toolName });
+  }
+
+  system(message: string) {
+    this.log("HARNESS", "system", "system", message);
+  }
+
+  error(role: AgentRole, message: string) {
+    this.log(role, "system", "error", message);
+  }
+
+  /**
+   * Render the logged entries as a Markdown conversation log.
+   * Reads like a chat: who said what, with tool calls and long content collapsed.
+   */
+  toMarkdown(): string {
+    const elapsed = ((Date.now() - this.startTime) / 1000 / 60).toFixed(1);
+    const lines: string[] = [];
+
+    lines.push(`# Adversarial Dev Harness — Conversation Log`);
+    lines.push(`Session: ${this.sessionId}`);
+    lines.push(`Duration: ${elapsed} minutes`);
+    lines.push(`Entries: ${this.entries.length}`);
+    lines.push(``);
+    lines.push(`---`);
+    lines.push(``);
+
+    const roleEmoji: Record<string, string> = {
+      PLANNER: "📋",
+      GENERATOR: "🔨",
+      EVALUATOR: "🔍",
+      HARNESS: "⚙️",
+      CONTRACT_GEN: "📝",
+      CONTRACT_EVAL: "✅",
+    };
+
+    const typeStyle: Record<string, string> = {
+      prompt: "**→ Prompt**",
+      response: "**← Response**",
+      tool_call: "🔧 Tool Call",
+      tool_result: "📤 Tool Result",
+      system: "💬 System",
+      error: "❌ Error",
+    };
+
+    for (const entry of this.entries) {
+      const emoji = roleEmoji[entry.role] || "❓";
+      const time = entry.timestamp.slice(11, 19);
+      const style = typeStyle[entry.type] || entry.type;
+
+      // Header line
+      lines.push(`### ${emoji} ${entry.role} (${entry.model}) — ${style}`);
+      lines.push(`*${time}*`);
+
+      // Metadata badges
+      if (entry.metadata) {
+        const badges: string[] = [];
+        if (entry.metadata.sprint !== undefined) badges.push(`Sprint ${entry.metadata.sprint}`);
+        if (entry.metadata.attempt !== undefined) badges.push(`Attempt ${entry.metadata.attempt}`);
+        if (entry.metadata.round !== undefined) badges.push(`Round ${entry.metadata.round}`);
+        if (entry.metadata.toolName) badges.push(`Tool: \`${entry.metadata.toolName}\``);
+        if (entry.metadata.duration_ms !== undefined) badges.push(`${entry.metadata.duration_ms}ms`);
+        if (badges.length > 0) {
+          lines.push(`> ${badges.join(" | ")}`);
+        }
+        if (entry.metadata.scores) {
+          lines.push(`> Scores: ${Object.entries(entry.metadata.scores).map(([k, v]) => `${k}=${v}`).join(", ")}`);
+        }
+      }
+
+      lines.push(``);
+
+      // Content — tool calls get collapsed
+      if (entry.type === "tool_call" || entry.type === "tool_result") {
+        lines.push(`<details>`);
+        lines.push(`<summary>${entry.metadata?.toolName || "tool"} ${entry.type === "tool_call" ? "input" : "output"}</summary>`);
+        lines.push(``);
+        lines.push("```");
+        lines.push(entry.content.slice(0, 2000));
+        if (entry.content.length > 2000) lines.push(`... (${entry.content.length} chars total)`);
+        lines.push("```");
+        lines.push(`</details>`);
+      } else if (entry.content.length > 500) {
+        // Long content gets a collapsible too
+        lines.push(`<details>`);
+        lines.push(`<summary>Show full content (${entry.content.length} chars)</summary>`);
+        lines.push(``);
+        lines.push(entry.content);
+        lines.push(`</details>`);
+      } else {
+        lines.push(entry.content);
+      }
+
+      lines.push(``);
+      lines.push(`---`);
+      lines.push(``);
+    }
+
+    return lines.join("\n");
+  }
+
+  /**
+   * Export as JSONL for programmatic analysis.
+   */
+  toJsonl(): string {
+    if (this.entries.length === 0) return "";
+    return this.entries.map(e => JSON.stringify(e)).join("\n") + "\n";
+  }
+
+  /**
+   * Save both markdown and JSONL to the log directory.
+   */
+  async save(): Promise<{ mdPath: string; jsonlPath: string }> {
+    await mkdir(this.logDir, { recursive: true });
+
+    const mdPath = join(this.logDir, `${this.sessionId}.md`);
+    const jsonlPath = join(this.logDir, `${this.sessionId}.jsonl`);
+
+    await writeFile(mdPath, this.toMarkdown(), "utf-8");
+    await writeFile(jsonlPath, this.toJsonl(), "utf-8");
+
+    return { mdPath, jsonlPath };
+  }
+
+  getEntryCount(): number {
+    return this.entries.length;
+  }
+}
+
+export { ConversationLogger, type ConversationEntry, type AgentRole, type MessageType };
diff --git a/tests/conversation-logger.test.ts b/tests/conversation-logger.test.ts
new file mode 100644
index 0000000..d595a02
--- /dev/null
+++ b/tests/conversation-logger.test.ts
@@ -0,0 +1,109 @@
+import { describe, test, expect } from "bun:test";
+import { ConversationLogger } from "../shared/conversation-logger.ts";
+import { existsSync } from "fs";
+import { rm } from "fs/promises";
+import { join } from "path";
+
+describe("ConversationLogger", () => {
+  test("logs entries with correct fields", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    logger.prompt("GENERATOR", "claude-opus-4-6", "Build the auth module", { sprint: 1 });
+    logger.response("GENERATOR", "claude-opus-4-6", "I'll create src/auth.ts...", { sprint: 1 });
+    logger.toolCall("GENERATOR", "claude-opus-4-6", "Write", '{"path": "src/auth.ts"}');
+    expect(logger.getEntryCount()).toBe(3);
+  });
+
+  test("toMarkdown produces readable output", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    logger.system("Starting harness");
+    logger.prompt("PLANNER", "claude-opus-4-6", "Plan the sprints");
+    logger.response("PLANNER", "claude-opus-4-6", "Sprint 1: Auth\nSprint 2: API");
+    logger.prompt("GENERATOR", "claude-opus-4-6", "Build sprint 1", { sprint: 1, attempt: 1 });
+    logger.response("GENERATOR", "claude-opus-4-6", "Created auth module");
+    logger.toolCall("GENERATOR", "claude-opus-4-6", "Write", "src/auth.ts contents...");
+    logger.prompt("EVALUATOR", "gpt-5.4", "Evaluate sprint 1", { sprint: 1 });
+    logger.response("EVALUATOR", "gpt-5.4", '{"passed": false}', { sprint: 1, scores: { auth: 5, quality: 7 } });
+
+    const md = logger.toMarkdown();
+
+    // Structure checks
+    expect(md).toContain("# Adversarial Dev Harness");
+    expect(md).toContain("PLANNER");
+    expect(md).toContain("GENERATOR");
+    expect(md).toContain("EVALUATOR");
+    expect(md).toContain("claude-opus-4-6");
+    expect(md).toContain("gpt-5.4");
+    // Tool calls should be in collapsible
+    expect(md).toContain("<details>");
+    expect(md).toContain("Write");
+    // Scores badge
+    expect(md).toContain("auth=5");
+    expect(md).toContain("quality=7");
+    // Sprint/attempt badges
+    expect(md).toContain("Sprint 1");
+    expect(md).toContain("Attempt 1");
+  });
+
+  test("toJsonl produces valid JSONL", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    logger.system("test");
+    logger.prompt("GENERATOR", "opus", "hello");
+    logger.error("EVALUATOR", "parse failed");
+
+    const jsonl = logger.toJsonl();
+    const lines = jsonl.trim().split("\n");
+    expect(lines).toHaveLength(3);
+
+    for (const line of lines) {
+      const parsed = JSON.parse(line);
+      expect(parsed.timestamp).toBeDefined();
+      expect(parsed.role).toBeDefined();
+      expect(parsed.type).toBeDefined();
+      expect(parsed.content).toBeDefined();
+    }
+  });
+
+  test("save writes files to disk", async () => {
+    const testDir = "/tmp/adversarial-test-logs-" + Date.now();
+    const logger = new ConversationLogger(testDir);
+    logger.system("test save");
+    logger.prompt("GENERATOR", "opus", "do the thing");
+
+    const { mdPath, jsonlPath } = await logger.save();
+
+    expect(existsSync(mdPath)).toBe(true);
+    expect(existsSync(jsonlPath)).toBe(true);
+
+    // Cleanup
+    await rm(testDir, { recursive: true });
+  });
+
+  test("long content gets collapsed in markdown", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    const longContent = "x".repeat(600);
+    logger.response("GENERATOR", "opus", longContent);
+
+    const md = logger.toMarkdown();
+    expect(md).toContain("<details>");
+    expect(md).toContain("600 chars");
+  });
+
+  test("tool calls show tool name in summary", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    logger.toolCall("GENERATOR", "opus", "Bash", "ls -la");
+    logger.toolResult("GENERATOR", "opus", "Bash", "total 42\n-rw-r--r-- 1 ...");
+
+    const md = logger.toMarkdown();
+    expect(md).toContain("Bash input");
+    expect(md).toContain("Bash output");
+  });
+
+  test("metadata badges render correctly", () => {
+    const logger = new ConversationLogger("/tmp/test-logs");
+    logger.prompt("CONTRACT_GEN", "opus", "propose contract", { sprint: 2, round: 3 });
+
+    const md = logger.toMarkdown();
+    expect(md).toContain("Sprint 2");
+    expect(md).toContain("Round 3");
+  });
+});
diff --git a/tests/mixed-harness.test.ts b/tests/mixed-harness.test.ts
new file mode 100644
index 0000000..3d82e70
--- /dev/null
+++ b/tests/mixed-harness.test.ts
@@ -0,0 +1,234 @@
+import { describe, test, expect } from "bun:test";
+
+// ============================================================
+// parseContract tests — fail-closed behavior
+// ============================================================
+describe("parseContract", () => {
+  function parseContract(text: string, sprintNumber: number) {
+    const candidates: string[] = [];
+    const codeBlocks = [...text.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+    for (const match of codeBlocks.reverse()) {
+      if (match[1]) candidates.push(match[1].trim());
+    }
+    const braceMatch = text.match(/\{[\s\S]*"criteria"[\s\S]*\}/);
+    if (braceMatch) candidates.push(braceMatch[0]);
+    candidates.push(text.trim());
+
+    for (const candidate of candidates) {
+      try {
+        const parsed = JSON.parse(candidate) as any;
+        if (parsed.criteria && Array.isArray(parsed.criteria)) {
+          parsed.sprintNumber = sprintNumber;
+          return parsed;
+        }
+      } catch { /* next */ }
+    }
+    throw new Error(`Contract negotiation produced unparseable output. Raw text: ${text.slice(0, 200)}`);
+  }
+
+  test("parses valid JSON from code block", () => {
+    const text = '```json\n{"features": ["auth"], "criteria": [{"name": "login", "description": "works", "threshold": 7}]}\n```';
+    const result = parseContract(text, 1);
+    expect(result.sprintNumber).toBe(1);
+    expect(result.criteria).toHaveLength(1);
+    expect(result.criteria[0].name).toBe("login");
+  });
+
+  test("parses raw JSON without code block", () => {
+    const text = '{"features": ["auth"], "criteria": [{"name": "login", "description": "works", "threshold": 7}]}';
+    const result = parseContract(text, 2);
+    expect(result.sprintNumber).toBe(2);
+  });
+
+  test("extracts JSON from surrounding prose", () => {
+    const text = 'Here is the contract:\n{"features": ["db"], "criteria": [{"name": "schema", "description": "valid", "threshold": 7}]}\nDone.';
+    const result = parseContract(text, 3);
+    expect(result.criteria[0].name).toBe("schema");
+  });
+
+  test("THROWS on garbage (fail-closed)", () => {
+    expect(() => parseContract("not json lol", 1)).toThrow("unparseable");
+  });
+
+  test("THROWS on JSON without criteria field", () => {
+    expect(() => parseContract('{"features": ["auth"]}', 1)).toThrow("unparseable");
+  });
+
+  test("THROWS on empty string", () => {
+    expect(() => parseContract("", 1)).toThrow("unparseable");
+  });
+
+  test("THROWS on criteria as string not array", () => {
+    expect(() => parseContract('{"criteria": "nope"}', 1)).toThrow("unparseable");
+  });
+
+  test("prefers last code block when multiple exist", () => {
+    const text = '```json\n{"features":["old"],"criteria":[{"name":"wrong","description":"x","threshold":5}]}\n```\n```json\n{"features":["new"],"criteria":[{"name":"right","description":"y","threshold":7}]}\n```';
+    const result = parseContract(text, 1);
+    expect(result.features).toEqual(["new"]);
+  });
+
+  test("overwrites sprintNumber", () => {
+    const text = '{"sprintNumber": 999, "features": ["x"], "criteria": [{"name": "a", "description": "b", "threshold": 7}]}';
+    const result = parseContract(text, 5);
+    expect(result.sprintNumber).toBe(5);
+  });
+});
+
+// ============================================================
+// Renegotiation trigger logic
+// ============================================================
+describe("renegotiation trigger", () => {
+  function shouldRenegotiate(
+    feedback: Array<{ criterion: string; score: number }>,
+    criteria: Array<{ name: string; threshold: number }>,
+  ): { trigger: boolean; reason: string } {
+    if (feedback.length === 0) return { trigger: false, reason: "empty feedback" };
+    const avgScore = feedback.reduce((s, f) => s + f.score, 0) / feedback.length;
+    const allFailing = feedback.every(f => f.score < (criteria.find(c => c.name === f.criterion)?.threshold ?? 7));
+    if (allFailing) return { trigger: true, reason: "all failing" };
+    if (avgScore < 4) return { trigger: true, reason: "low avg" };
+    return { trigger: false, reason: "ok" };
+  }
+
+  test("triggers when ALL criteria below threshold", () => {
+    const r = shouldRenegotiate(
+      [{ criterion: "a", score: 3 }, { criterion: "b", score: 5 }],
+      [{ name: "a", threshold: 7 }, { name: "b", threshold: 7 }],
+    );
+    expect(r.trigger).toBe(true);
+    expect(r.reason).toBe("all failing");
+  });
+
+  test("triggers when avg < 4", () => {
+    const r = shouldRenegotiate(
+      [{ criterion: "a", score: 2 }, { criterion: "b", score: 3 }, { criterion: "c", score: 4 }],
+      [{ name: "a", threshold: 7 }, { name: "b", threshold: 7 }, { name: "c", threshold: 3 }],
+    );
+    expect(r.trigger).toBe(true);
+    expect(r.reason).toBe("low avg");
+  });
+
+  test("does NOT trigger when some pass", () => {
+    const r = shouldRenegotiate(
+      [{ criterion: "a", score: 8 }, { criterion: "b", score: 5 }],
+      [{ name: "a", threshold: 7 }, { name: "b", threshold: 7 }],
+    );
+    expect(r.trigger).toBe(false);
+  });
+
+  test("safe on empty feedback (no division by zero)", () => {
+    const r = shouldRenegotiate([], []);
+    expect(r.trigger).toBe(false);
+    expect(r.reason).toBe("empty feedback");
+  });
+
+  test("uses per-criterion threshold", () => {
+    const r = shouldRenegotiate(
+      [{ criterion: "easy", score: 4 }, { criterion: "hard", score: 4 }],
+      [{ name: "easy", threshold: 3 }, { name: "hard", threshold: 5 }],
+    );
+    // easy passes (4>=3), so not allFailing. avg=4, not < 4.
+    expect(r.trigger).toBe(false);
+  });
+
+  test("defaults to threshold 7 for unknown criterion", () => {
+    const r = shouldRenegotiate(
+      [{ criterion: "mystery", score: 5 }],
+      [],
+    );
+    // 5 < 7 default → allFailing
+    expect(r.trigger).toBe(true);
+  });
+});
+
+// ============================================================
+// parseEvalResult tests
+// ============================================================
+describe("parseEvalResult", () => {
+  function parseEvalResult(response: string, passThreshold: number) {
+    const candidates: string[] = [];
+    const codeBlocks = [...response.matchAll(/```(?:json)?\s*([\s\S]*?)```/g)];
+    for (const match of codeBlocks.reverse()) {
+      if (match[1]) candidates.push(match[1].trim());
+    }
+    const braceMatch = response.match(/\{[\s\S]*"passed"[\s\S]*"feedback"[\s\S]*\}/);
+    if (braceMatch) candidates.push(braceMatch[0]);
+    candidates.push(response.trim());
+    for (const candidate of candidates) {
+      try {
+        const parsed = JSON.parse(candidate) as any;
+        if (parsed.feedback && Array.isArray(parsed.feedback)) {
+          parsed.passed = parsed.feedback.every((f: any) => f.score >= passThreshold);
+          return parsed;
+        }
+      } catch { /* next */ }
+    }
+    return null;
+  }
+
+  test("recalculates passed based on threshold", () => {
+    const json = JSON.stringify({
+      passed: true,
+      feedback: [{ criterion: "a", score: 8, details: "ok" }, { criterion: "b", score: 6, details: "meh" }],
+    });
+    const r = parseEvalResult(json, 7);
+    expect(r!.passed).toBe(false); // b=6 < 7
+  });
+
+  test("marks passed when all meet threshold", () => {
+    const json = JSON.stringify({
+      passed: false,
+      feedback: [{ criterion: "a", score: 9, details: "great" }, { criterion: "b", score: 7, details: "ok" }],
+    });
+    const r = parseEvalResult(json, 7);
+    expect(r!.passed).toBe(true);
+  });
+
+  test("extracts from markdown code block", () => {
+    const resp = '```json\n{"feedback":[{"criterion":"x","score":5,"details":"bad"}]}\n```';
+    const r = parseEvalResult(resp, 7);
+    expect(r).not.toBeNull();
+    expect(r!.passed).toBe(false);
+  });
+
+  test("returns null on garbage", () => {
+    expect(parseEvalResult("lol", 7)).toBeNull();
+  });
+});
+
+// ============================================================
+// Negotiation round logic
+// ============================================================
+describe("negotiation rounds", () => {
+  test("APPROVED on round 1 = 1 round total", () => {
+    let rounds = 0, approved = false;
+    while (rounds < 3 && !approved) {
+      rounds++;
+      if ("APPROVED" === "APPROVED") { approved = true; }
+    }
+    expect(rounds).toBe(1);
+    expect(approved).toBe(true);
+  });
+
+  test("never approved = exactly 3 rounds", () => {
+    let rounds = 0, approved = false;
+    while (rounds < 3 && !approved) {
+      rounds++;
+      if (("revised json" as string) === "APPROVED") { approved = true; }
+    }
+    expect(rounds).toBe(3);
+    expect(approved).toBe(false);
+  });
+
+  test("generator accepts in round 2 = 2 rounds", () => {
+    let rounds = 0, approved = false;
+    while (rounds < 3 && !approved) {
+      rounds++;
+      if (rounds > 1 && "APPROVED" === "APPROVED") { approved = true; break; }
+      // evaluator reviews...
+    }
+    expect(rounds).toBe(2);
+    expect(approved).toBe(true);
+  });
+});