🧬 The Ultimate Agentic Coding Meta-Prompt

Maintained by Obvious Works · License: MIT · AI Powered

The "Holy Grail" of System Prompts. A synthesis of the world's best AI IDE instructions, distilled by Gemini Pro & Claude, stress-tested against real-world expert setups, and continuously refined by Obvious Works.


📑 Table of Contents

  • Introduction & Mission
  • Why Prompts Matter in Agentic Coding
  • The Methodology: How We Built It
  • The Showdown: Gemini vs. Claude
  • Round 2: APEX vs. The Real World
  • The Final Result: APEX Hybrid
  • Installation & Usage
  • Credits & Acknowledgments

🚀 Introduction & Mission

We are at the dawn of a new era in software development: Agentic Coding. Tools like Cursor, Windsurf, and Cline are revolutionizing how code is written. However, the quality of the output is only as good as the system instructions (prompts) guiding the model.

At Obvious Works, we asked ourselves: What happens if you aggregate the collective intelligence of the world's best AI coding tools and force them to compete against each other?

We undertook the effort to sift through hundreds of "Master Prompts," analyze them, and reverse-engineer a Meta-Master-Prompt that combines every advantage while eliminating the weaknesses.

This repository documents that process — including a second iteration where we benchmarked our own APEX prompt against the real-world setup of one of the most respected practitioners in the field.


🧠 Why Prompts Matter in Agentic Coding

In "Agentic Software Development," the LLM is no longer just a chatbot; it is an autonomous agent reading file systems, executing commands, and debugging errors.

Without precise System Prompts, models tend to exhibit:

  • Laziness: Omitting code blocks or using placeholders like `// implementation here`.
  • Hallucinations: Inventing syntax or referencing non-existent libraries.
  • Context Drift: Losing track of the overall project architecture.
  • Amnesia: Repeating the same mistakes across sessions with no self-correction.

A high-quality prompt acts as a Cognitive Operating System for the agent. It defines boundaries, enforces planning phases (Chain of Thought), and guarantees code quality that meets industrial standards.


🧪 The Methodology: How We Built It

Our approach was radically data-driven. Here is the workflow we executed to create this prompt:

  1. Data Mining: We utilized the incredible repository system-prompts-and-models-of-ai-tools as our ground truth.
  2. Aggregation: Using Google Antigravity, we extracted every available master prompt and consolidated them into a massive single text file (approx. 1.5 Megabytes of plain text).
  3. Meta-Analysis Prompting: We asked Google Antigravity to write a specific analysis prompt designed to deconstruct this massive dataset.
  4. The LLM Analysis:
    • Run 1: Uploaded the 1.5 MB dataset to Google Gemini Pro with instructions to extract the ultimate Master Prompt.
    • Run 2: Executed the exact same process with Claude Opus (Thinking Model + Knowledge Base).

The Goal: To generate two competing visions of the "Perfect Prompt."


⚔️ The Showdown: Gemini vs. Claude

The results from the two models were fascinatingly distinct, representing two different philosophies of "High-Level Prompt Engineering." We then fed these two resulting prompts back into Gemini Pro 1.5 to run a comparative analysis and final synthesis.

Here are the contenders:

  • Prompt 1 (Codename: APEX): Generated by the Gemini lineage. Focuses on strict structure and XML.
  • Prompt 2 (Codename: THE ARCHITECT): Generated by the Claude lineage. Focuses on cognitive processes and reasoning.

🔍 The Analysis: APEX vs. THE ARCHITECT

Here is the summary of the comparative analysis that led to our first synthesis:

1. Structure and Formatting

  • APEX: Utilizes a strict XML Tag Structure (<system_identity>, <core_mandates>).
    • Advantage: Modern LLMs (Claude 4.5, GPT-5x) excel with XML. It sets clear semantic boundaries, allowing the model to know exactly where a rule begins and ends.
  • THE ARCHITECT: Uses classic Markdown headers.
    • Disadvantage: These are "softer" boundaries compared to hard XML tags, leading to occasional instruction bleed.
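To make the contrast concrete, here is a hedged sketch of the two styles. Only `<system_identity>` and `<core_mandates>` are tag names quoted from APEX; everything inside them is invented for illustration:

```xml
<!-- APEX style: hard XML boundaries around each rule group -->
<system_identity>
  You are a senior software engineer operating autonomously in this repository.
</system_identity>
<core_mandates>
  <mandate>Never emit placeholder code such as "// implementation here".</mandate>
  <mandate>Plan before editing; verify after editing.</mandate>
</core_mandates>
```

THE ARCHITECT would express the same rules under Markdown headers (e.g. `## Core Mandates`), which the model can conflate with ordinary prose more easily than explicit open/close tags.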

2. Chain of Thought (CoT)

  • APEX: Describes a cognitive framework but does not enforce a specific output format for thinking.
  • THE ARCHITECT: Mandates an <architect_thought> XML tag before every action.
    • Advantage: The Killer Feature. When a model is forced to write out its plan before generating code, error rates in complex logic tasks drop drastically.
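In practice, the enforced format looks roughly like this. The plan content below is invented for illustration; only the `<architect_thought>` tag itself comes from the prompt:

```xml
<architect_thought>
Goal: add input validation to the signup handler.
Plan: 1) locate the handler, 2) add an email-format check, 3) extend the existing tests.
Risk: the handler is also called from the admin flow; validation must not break it.
</architect_thought>
```

Only after emitting this block does the agent produce the actual code edit.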

3. Anti-Laziness & Quality

Both prompts have strong protocols against "lazy coding."

  • APEX: More detailed regarding Naming Conventions, Testing, and Security protocols.
  • THE ARCHITECT: Extremely strong on editing mechanics (Context Matching), which prevents hallucinations during "Search & Replace" operations.
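The idea behind context matching is simple: an edit is only applied if its surrounding anchor text identifies exactly one location in the file. This is a minimal Python sketch of that general idea, not code from THE ARCHITECT itself (the function name and error handling are our own):

```python
def apply_edit(source: str, anchor: str, replacement: str) -> str:
    """Replace `anchor` with `replacement`, but only if the anchor text
    occurs exactly once in the source -- otherwise refuse the edit, so a
    hallucinated or ambiguous match can never silently corrupt the file."""
    count = source.count(anchor)
    if count != 1:
        raise ValueError(f"anchor matched {count} times; edit rejected")
    return source.replace(anchor, replacement, 1)
```

An agent following this rule fails loudly on an ambiguous match instead of guessing, which is exactly what prevents "Search & Replace" hallucinations.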

4. Flexibility

  • APEX: Features <response_mode_adaptation>. It distinguishes between "Lightweight Mode" (Chat) and "Full Engineering Mode."
  • THE ARCHITECT: Is always in "Full Mode," which can create overhead for simple queries.
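A hedged sketch of what such an adaptation section can look like. Only the `<response_mode_adaptation>` tag name is quoted from APEX; the mode names and wording are illustrative:

```xml
<response_mode_adaptation>
  <mode name="lightweight">Questions and explanations with no file edits: answer directly, skip the full planning ritual.</mode>
  <mode name="full_engineering">Any code change: plan, edit, test, report.</mode>
</response_mode_adaptation>
```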

🥊 Round 2: APEX vs. The Real World

After publishing APEX v1, we ran a second experiment — this time benchmarking our synthesized prompt against the actual CLAUDE.md setup used by Boris Cherny, the creator of Claude Code at Anthropic.

Boris's setup is radically different from APEX: it is roughly 10x shorter (~250 words vs. ~2,500), but laser-focused on six principles that address real failure modes in long-running agentic sessions.

What Boris's Setup Does Better

| Capability | APEX v1 | Boris's claude.md |
|---|---|---|
| Subagent Orchestration | ❌ Absent | ✅ Explicit strategy |
| Persistent Task Tracking | ❌ In-context only | ✅ `tasks/todo.md` |
| Self-Improvement Loop | ❌ None | ✅ `tasks/lessons.md` |
| Re-planning on Failure | ⚠️ 3 retries | ✅ Explicit STOP signal |
| Elegance Check | ❌ Not enforced | ✅ Built into workflow |
| Autonomous Bug Fixing | ⚠️ Implied | ✅ Explicit |

What APEX Does Better

| Capability | APEX v1 | Boris's claude.md |
|---|---|---|
| Tool Usage Rules | ✅ Detailed | ❌ None |
| Security & Privacy | ✅ Explicit | ❌ None |
| Naming Conventions | ✅ Concrete examples | ❌ Abstract only |
| Response Mode Adaptation | ✅ Lightweight vs. Full | ❌ Always full mode |
| Communication Protocol | ✅ With anti-patterns | ❌ Not defined |
| Cognitive Framework | ✅ `<cognitive_thought>` | ❌ Not enforced |

Key Insight

Boris's prompt wins on adaptivity and persistence — the agent learns from mistakes and externalizes state into files that survive context resets. APEX v1 wins on completeness and precision — but that completeness comes at a cost: cognitive overload, where critical rules get diluted inside 2,500 words.

Neither prompt alone is optimal. The hybrid is.


💎 The Final Result: APEX Hybrid

APEX_AGENT.md is the result of both synthesis rounds. It is a chimera that takes the best of three sources:

  1. The Skeleton of APEX v1: Tool rules, security, naming conventions, response modes, cognitive framework.
  2. The Brain of THE ARCHITECT: Mandatory <cognitive_thought> tags for enforced reasoning before every action.
  3. The Operational Discipline of Boris: Subagent strategy, persistent task files, self-improvement loop, explicit STOP signals, elegance checks, and autonomous bug fixing.

What's New in the Hybrid

  • tasks/todo.md — Every non-trivial task starts with a written plan, checked in before implementation, marked complete on delivery.
  • tasks/lessons.md — After any correction, the agent writes a rule preventing that mistake. Reviewed at the start of every session.
  • Subagent Strategy — Main context window stays clean. Research, exploration, and parallel analysis are offloaded.
  • Elegance Gate — Before presenting any non-trivial fix: "Is there a more elegant solution?"
  • STOP Signal — If something goes sideways, the agent stops and re-plans. No more pushing through broken logic.
  • Autonomous Bug Fixing — Given a bug report, the agent fixes it. No hand-holding, no confirmation loops.
  • 40% shorter than APEX v1 — same coverage, higher signal density.
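As an illustration, the two task files might look like this after a few sessions. The entries below are invented; the hybrid prompt specifies only the file paths and the workflow, not this exact layout:

```markdown
<!-- tasks/todo.md -->
# Todo
- [x] Add retry logic to the API client
- [ ] Migrate config loading to environment variables

<!-- tasks/lessons.md -->
# Lessons
- Run the full test suite before declaring a task done; unit tests alone missed a regression.
- This project uses tabs, not spaces; match the existing style before editing.
```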

Design Philosophy

A great agent prompt is not a legal document. It is a set of habits — precise enough to be followed, short enough to be remembered, and adaptive enough to improve over time.


📦 Installation & Usage

This prompt is optimized for use in Cursor (.cursorrules), Windsurf, Cline, or as a CLAUDE.md file for Claude Code.

For Cursor AI / Windsurf

  1. Copy the content of APEX_AGENT.md from this repo.
  2. Create or open .cursorrules in your project root.
  3. Paste the content and save.

For Claude Code

  1. Place APEX_AGENT.md in your project root and rename it CLAUDE.md.
  2. Claude Code will automatically read it as its operating instructions.

For Custom GPTs / Claude Projects

  1. Paste the prompt into the "System Instructions" or "Project Instructions" field.

First-Run Setup

On the first run, ask your agent to initialize the task management structure:

Create tasks/todo.md and tasks/lessons.md in this project.

The agent will maintain these files autonomously from that point forward.
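Equivalently, you can bootstrap the structure yourself from a shell. The file paths are the ones the hybrid prompt expects; the seed headings are just a suggestion:

```shell
# Create the persistent task-management files the hybrid prompt relies on
mkdir -p tasks
printf '# Todo\n' > tasks/todo.md
printf '# Lessons\n' > tasks/lessons.md
```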


👏 Credits & Acknowledgments

This project stands on the shoulders of giants.

  • Analysis & Synthesis: The team at Obvious Works.
  • Original Data Source: A massive thank you to x1xhlol for the foundational dataset that made our meta-analysis possible.
  • Round 2 Reference: Thanks to Boris Cherny — creator of Claude Code at Anthropic — for sharing his real-world CLAUDE.md setup, which exposed the critical gaps in APEX v1 around persistence, subagents, and self-improvement.

Disclaimer: This prompt is powerful and detailed. It may increase token costs per session, but it delivers consistent Senior Developer-level output and — uniquely — gets measurably better over time through its self-improvement loop.
