# Adversarial Review

Claude Code skill for adversarial AI code and plan review.

One AI writes the code. Another tears it apart. Iterate until approved.

## What is this

Most AI code review tools validate your changes — "looks good, maybe add tests." Adversarial review does the opposite: the reviewer defaults to skepticism and tries to break confidence in the change. It looks for what will fail in production, not what might be nice to improve.

This is a Claude Code skill — a `SKILL.md` file plus a small `references/runner.md` that together teach Claude how to run adversarial reviews through an external AI model (currently OpenAI Codex).

## Key features

- **Plan review** — review the plan *before* writing code: catch architecture mistakes, missing steps, and risks early
- **Code review** — review the implementation: bugs, security, data loss
- **Code-vs-plan** — verify the implementation matches the plan
- **Iterative** — Claude fixes issues based on reviewer feedback and resubmits for re-review, up to 5 rounds until approved
- **Lightweight** — `SKILL.md` plus one `references/runner.md` (thin runner subagent spec); no server, no broker, no external runtime deps. Compare with codex-plugin-cc: ~15 JS modules, an App Server, a JSON-RPC broker, lifecycle hooks

## How it works

```
┌──────────┐     ┌────────────┐     ┌──────────┐
│  Claude  │────>│  Reviewer  │────>│  Claude  │
│  (code)  │     │  (Codex)   │     │  (fix)   │
└──────────┘     └────────────┘     └──────────┘
     ^                                    │
     │           ┌────────────┐           │
     └───────────│  Reviewer  │<──────────┘
                 │(re-review) │
                 └────────────┘
                       │
               VERDICT: APPROVED
```

## Three modes

| Mode | What it reviews | When to use |
|------|-----------------|-------------|
| `plan` | Implementation plan | Before writing code |
| `code` | Git diff (unstaged, staged, or branch) | After writing code |
| `code-vs-plan` | Code changes against the plan | Verify the implementation matches the plan |

Mode is auto-detected from context, or you can force it with an argument.

## Quick start

### 1. Prerequisites

Claude Code and OpenAI Codex CLI must be installed.

Verify both are available:

```shell
claude --version   # Claude Code CLI
codex --version    # OpenAI Codex CLI (>= 0.115.0)
```

If Codex is missing: `npm install -g @openai/codex`

**Authentication.** Codex needs an OpenAI account. Either:

- Sign in interactively: run `codex` (opens a browser)
- Or set the `CODEX_API_KEY` env var for non-interactive use

### 2. Install the skill

```shell
git clone https://github.com/dementev-dev/adversarial-review.git
mkdir -p ~/.claude/skills
ln -sfn "$(pwd)/adversarial-review" ~/.claude/skills/adversarial-review
```

Verify that both the skill entry point and the runner subagent spec are in place:

```shell
ls -la ~/.claude/skills/adversarial-review/SKILL.md
ls -la ~/.claude/skills/adversarial-review/references/runner.md
```

Migrating from a previous install at `~/.agents/skills/`: delete the old symlink (`rm ~/.agents/skills/adversarial-review`) and install at the new path above. Claude Code ≥ 2.x uses `~/.claude/skills/`.
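As a sketch, the migration amounts to removing the old link and re-linking at the new path (variable names are illustrative; this assumes the clone sits in the current directory):

```shell
# Migration sketch: drop the legacy symlink, then link the clone at the
# new skills path. OLD_LINK/NEW_DIR/REPO are illustrative variable names.
OLD_LINK="$HOME/.agents/skills/adversarial-review"
NEW_DIR="$HOME/.claude/skills"
REPO="$(pwd)/adversarial-review"

rm -f "$OLD_LINK"                       # no-op if the old install never existed
mkdir -p "$NEW_DIR"
ln -sfn "$REPO" "$NEW_DIR/adversarial-review"
```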

### 3. Add permissions

The skill runs `git`, `codex exec`, and writes temp files to `/tmp`. Without pre-approved permissions, Claude Code will prompt for each action.

**Where to add.** Since the skill is installed globally (`~/.claude/skills/`), the permissions should go into the global config so they work in any project:

| Install scope | Config file |
|---------------|-------------|
| Global (recommended) | `~/.claude/settings.json` |
| Single project | `<project>/.claude/settings.local.json` |

Merge the following rules into the `permissions.allow` array of the chosen config file:

```jsonc
// --- adversarial-review permissions ---
// Git: diff, status, branch detection, repo root, submodule check
"Bash(git diff*)",
"Bash(git status*)",
"Bash(git symbolic-ref*)",
"Bash(git rev-parse*)",
// Codex: initial launch (uses -C; prompt fed via cat | pipe for env portability)
"Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
// Codex: resume (cd prefix because resume has no -C flag; prompt via cat | pipe)
"Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
// Session-id filesystem fallback (POSIX: find -newer + grep -l for content-match)
"Bash(find ~/.codex/sessions*)",
// Diagnostic aid when filesystem fallback finds nothing
"Bash(ls -t ~/.codex/sessions*)",
// Temp files: prompts (initial + resume), plans, review output, JSONL stdout, stderr
"Write(/tmp/codex-plan-*)",
"Write(/tmp/codex-prompt-*)",
"Write(/tmp/codex-resume-prompt-*)",
"Read(/tmp/codex-review-*)",
"Read(/tmp/codex-stdout-*)",
"Read(/tmp/codex-stderr-*)",
// Archive failed-resume diagnostics before fresh exec overwrites them
"Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
"Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
// Cleanup
"Bash(rm -f /tmp/codex-*)",
// Main thread: write prompt body for the runner subagent (NEW in refactor)
"Write(/tmp/codex-body-*)",
// Main thread: read the structured JSON result returned by the runner (NEW)
"Read(/tmp/codex-runner-result-*)",
// Runner subagent (inherited): read the prompt body main wrote (NEW)
"Read(/tmp/codex-body-*)",
// Runner subagent (inherited): write the result JSON main reads (NEW)
"Write(/tmp/codex-runner-result-*)",
// Runner-spec discovery: tier 1 (user-scoped install)
"Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
// Runner-spec discovery: tier 2 (plugin-marketplace cache)
"Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
```
**Full example** (if the config file is empty or does not exist):
```json
{
  "permissions": {
    "allow": [
      "Bash(git diff*)",
      "Bash(git status*)",
      "Bash(git symbolic-ref*)",
      "Bash(git rev-parse*)",
      "Bash(cat /tmp/codex-prompt-* | timeout 600 codex exec *)",
      "Bash(cd * && cat /tmp/codex-resume-prompt-* | timeout 600 codex exec resume *)",
      "Bash(find ~/.codex/sessions*)",
      "Bash(ls -t ~/.codex/sessions*)",
      "Write(/tmp/codex-plan-*)",
      "Write(/tmp/codex-prompt-*)",
      "Write(/tmp/codex-resume-prompt-*)",
      "Read(/tmp/codex-review-*)",
      "Read(/tmp/codex-stdout-*)",
      "Read(/tmp/codex-stderr-*)",
      "Bash(mv /tmp/codex-stdout-* /tmp/codex-stdout-*-failed-resume.jsonl)",
      "Bash(mv /tmp/codex-stderr-* /tmp/codex-stderr-*-failed-resume.txt)",
      "Bash(rm -f /tmp/codex-*)",
      "Write(/tmp/codex-body-*)",
      "Read(/tmp/codex-runner-result-*)",
      "Read(/tmp/codex-body-*)",
      "Write(/tmp/codex-runner-result-*)",
      "Bash(ls ~/.claude/skills/adversarial-review/references/runner.md*)",
      "Bash(ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md*)"
    ]
  }
}
```

(Note: standard JSON does not allow comments, so the full example omits them — the doc below assumes the settings file must parse as valid JSON.)

**Security note:** the `codex exec` rule allows any `codex exec` invocation wrapped in `timeout 600`. The skill only uses read-only mode (`-s read-only`), but Claude Code's permission patterns are prefix-based and cannot enforce flag constraints. If you prefer tighter control, omit the `codex exec` rule and approve each invocation manually.

### 4. Use

```shell
/adversarial-review                       # auto-detect mode
/adversarial-review plan                  # force plan review
/adversarial-review code                  # force code review
/adversarial-review path/to/f             # review a specific file
/adversarial-review xhigh                 # higher reasoning effort
/adversarial-review model:gpt-5.3-codex   # use a different model
```

## Prompt architecture

The skill uses XML-structured prompts with an adversarial stance:

- `<role>` — adversarial reviewer, defaults to skepticism
- `<operating_stance>` — break confidence, don't validate
- `<attack_surface>` — concrete checklist: auth, data integrity, race conditions, rollback safety, schema drift, error handling, observability
- `<finding_bar>` — every finding must answer four questions: what can go wrong, why it is vulnerable, what the impact is, and what to do about it
- `<scope_exclusions>` — no style, naming, or speculative comments
- `<calibration>` — one strong finding beats five weak ones

## Example output

See `examples/review-output.md` for a sample review.

## Troubleshooting

**`codex exec` exits with a model error.** Some models are unavailable to ChatGPT accounts (e.g. `o3-mini`). The default `gpt-5.4` works with both ChatGPT and API-key auth. Override with `/adversarial-review model:<name>`.

**Permission prompts on every action.** Add the permissions from the setup section, and check that the config file is valid JSON and in the right location (project `.claude/settings.local.json` or global `~/.claude/settings.json`).

**Codex hangs / timeout (exit code 124).** All `codex exec` calls are wrapped in `timeout 600` (10 minutes). If you see exit code 124, the reviewer did not respond in time. Retry — this is usually transient.
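The 124 exit code comes from the `timeout` wrapper itself, not from codex; you can reproduce it with any command that overruns its budget:

```shell
# Exit code 124 means coreutils `timeout` killed the command at the deadline
# (here: a 1-second budget for a 5-second sleep).
timeout 1 sleep 5
echo "exit code: $?"   # prints "exit code: 124"
```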

**Resume fails with a session error.** The skill uses `codex exec resume <session-id>` for rounds 2+. On failure (non-zero exit, `thread/resume failed` in stderr, or a malformed review), the skill does NOT silently fall back. In an interactive session it asks whether to run a fresh `codex exec` (higher token cost) or conclude the review as NOT VERIFIED. In headless runs it decides based on the maximum severity of the last successful round's findings: critical/high → fresh exec; medium-only → conclude as NOT VERIFIED.

**`--json` stdout is empty in my Claude Code session (session-ID capture noise).** In some Claude Code sandbox configurations, codex's `--json` event stream is suppressed when stdout is redirected to a file — the `/tmp/codex-stdout-*.jsonl` ends up 0 bytes even though the review itself (`-o /tmp/codex-review-*.md`) completes correctly. The skill handles this automatically via a filesystem fallback: every prompt includes a per-launch session marker (`<!-- ADVERSARIAL-REVIEW-SESSION: <REVIEW_ID>-<ATTEMPT_ID> -->`), where `ATTEMPT_ID` is a fresh random integer regenerated for the initial exec, every retry, every resume, and every fresh-exec fallback. The marker is written to the rollout JSONL on disk.

When the JSONL stream is empty, the skill runs `find ~/.codex/sessions -name 'rollout-*.jsonl' -newer <prompt-file> -exec grep -l <REVIEW_ID>-<ATTEMPT_ID> {} +` to positively identify this specific launch's rollout (not by newest mtime, which would be unsafe against parallel codex; not by the review-stable ID alone, which would match stale retry rollouts) and extracts the UUID from the filename. Zero or multiple matches → the skill fails closed with a diagnostic rather than silently picking. Resume continues to work normally. All commands are POSIX (`find -newer`, `-exec grep -l`) and work identically on Linux and macOS.
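The fallback lookup can be sketched in isolation against a throwaway directory (the file names and marker value below are illustrative, not the skill's real IDs):

```shell
# Sketch of the marker-based rollout lookup, using a temp dir instead of
# ~/.codex/sessions. Only POSIX find/grep features are used.
SESSIONS=$(mktemp -d)
PROMPT=$(mktemp)                                   # stands in for the prompt temp file
MARKER="rev42-7731"                                # <REVIEW_ID>-<ATTEMPT_ID> (illustrative)

sleep 1                                            # make the rollouts strictly newer than the prompt
echo '{"text":"stale retry, no marker"}' > "$SESSIONS/rollout-old.jsonl"
echo "<!-- ADVERSARIAL-REVIEW-SESSION: $MARKER -->" > "$SESSIONS/rollout-new.jsonl"

# Newer than the prompt file AND containing this launch's marker:
MATCH=$(find "$SESSIONS" -name 'rollout-*.jsonl' -newer "$PROMPT" -exec grep -l "$MARKER" {} +)
echo "$MATCH"                                      # exactly one path identifies the launch
```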

**"NOT VERIFIED" result.** The skill applied fixes but the reviewer did not re-verify them (resume failed or the operator chose to conclude). This is not an approval — manually review the applied fixes before merging.

**Running inside a git submodule.** `git rev-parse --show-toplevel` returns the submodule path, not the parent repo. The skill warns you and scopes the review to the submodule. If you meant to review the parent, invoke the skill from the parent working tree.

**Bare repository or not inside a work tree.** The skill aborts at Step 2 with a clear message. Run it from inside a git working tree.

**Plan Mode exits when writing temp files.** In Claude Code Plan Mode, writing to `/tmp` may trigger a permission prompt or exit Plan Mode. This is a known Claude Code limitation. It does not affect review correctness.

## Known limitations

- **Plan Mode and `/tmp` writes.** Writing review prompts to `/tmp` may trigger a permission prompt or cause Plan Mode to exit. Does not affect review correctness.
- **`resume` inherits the sandbox.** `codex exec resume` does not accept `-s` — the sandbox is inherited from the original session (always read-only).
- **`resume` has no `-C` flag.** The skill captures `REPO_ROOT` via `git rev-parse --show-toplevel` at Step 2 and prefixes every resume with `cd '<REPO_ROOT>' && ...`. This requires paths without single quotes; pathological paths (containing `'`, `"`, `$`, backtick, or a newline) cause the skill to abort at Step 2.
- **Submodule scoping.** When invoked inside a submodule, the review is scoped to the submodule — `git rev-parse --show-toplevel` does not walk up to the parent. A warning is printed; invoke from the parent repo if you want parent scope.
- **macOS end-to-end not tested.** The secondary session-ID capture uses only POSIX flags (`find -newer FILE`, `-exec CMD {} +`, `grep -l`), so it should work identically on macOS and Linux, but the skill has not been end-to-end tested on macOS.
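The Step-2 quote check behind the `-C` limitation can be sketched as a small POSIX guard (`path_is_safe` is an illustrative name, not the skill's actual implementation):

```shell
# Hypothetical guard mirroring the Step-2 abort: reject repo roots whose path
# would break the single-quoted `cd '<REPO_ROOT>' && ...` resume prefix.
NL='
'
path_is_safe() {
  case "$1" in
    (*\'*|*\"*|*\$*|*\`*|*"$NL"*) return 1 ;;  # quote, dollar, backtick, newline
    (*) return 0 ;;
  esac
}

path_is_safe "/home/user/my repo"  && echo "safe"    # spaces are fine
path_is_safe "/home/user/it's-bad" || echo "unsafe"  # single quote -> abort
```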

## Roadmap

- Gemini as an alternative reviewer backend
- Local model support (Ollama, llama.cpp)
- CI integration (GitHub Actions)
- Multi-reviewer mode (parallel review by multiple models)

## Inspiration

The adversarial prompt structure was developed after studying `openai/codex-plugin-cc` (Apache-2.0).

Borrowed ideas: XML-structured prompts, adversarial stance, attack surface checklist, finding bar, calibration rules.

What we did differently:

- **Iterative loop** — Claude fixes issues and resubmits (not "stop and ask the user")
- **Plan review** — reviews plans before code, not just code
- **Minimal install** — `SKILL.md` plus one `references/runner.md`, no server, no broker, vs 15+ JS modules
- **Verbatim output** — reviewer findings are shown as-is, not rephrased

## License

Apache-2.0 — see [LICENSE](LICENSE).
