Separating Reasoning from Execution in Agent Systems
Most agent systems merge decision-making and execution into a single loop.
That design obscures failure modes:
- You cannot tell whether a failure came from bad reasoning or bad tool execution
- You cannot evaluate planning quality independently of outcomes
- You cannot safely add memory, replanning, or observability without compounding ambiguity
Analysis consolidated in rag-systems-foundations and extended in agent-tool-retriever showed that control decisions must be explicit and inspectable.
This repository enforces a hard architectural boundary:
Planning decides what should happen. Execution decides how it happens.
No component is allowed to violate that separation.
Reasoning must be inspectable without running tools. Execution must be debuggable without re-reasoning.
If either condition fails, the system is incorrectly designed.
The system is composed of three explicit layers, each with a single responsibility.
User
↓
Runtime (orchestration only)
↓
Planner → Executor
Responsibility: Decide what should happen.
The planner:
- Consumes: the user question
- Produces: a structured, machine-readable Plan
- Guarantees: no tool calls, no execution, no side effects
The planner is evaluated only on plan quality, not on task success.
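A plan of this shape can be modeled with plain dataclasses. The following is a hypothetical sketch of what plan_schema.py's Plan / PlanStep definitions might look like; the field names are taken from the example output below, everything else is an assumption:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class PlanStep:
    step_id: int
    action: str                      # e.g. "retrieve" or "noop"
    args: dict[str, Any] = field(default_factory=dict)
    rationale: str = ""              # why the planner chose this action

@dataclass
class Plan:
    objective: str
    steps: list[PlanStep] = field(default_factory=list)

# Construct the example plan shown in this README
plan = Plan(
    objective="What BLEU score did Vaswani et al. report for EN-DE translation?",
    steps=[
        PlanStep(
            step_id=1,
            action="retrieve",
            args={"question": "...", "k": 4},
            rationale="source-bound factual request",
        )
    ],
)
```

Because the plan is a pure data object, it can be serialized, logged, and scored without ever touching a tool.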
Example planner output:
{
  "objective": "What BLEU score did Vaswani et al. report for EN–DE translation?",
  "steps": [
    {
      "step_id": 1,
      "action": "retrieve",
      "args": {"question": "...", "k": 4},
      "rationale": "source-bound factual request"
    }
  ]
}
Responsibility: Execute the plan exactly as written.
The executor:
- Consumes: the planner’s plan
- Performs: only the tool calls specified in the plan
- Produces: a raw execution trace
Constraints are strict:
- ❌ no replanning
- ❌ no goal reinterpretation
- ❌ no hidden reasoning
The executor is evaluated on faithful adherence, not intelligence.
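A faithful executor is mostly a dispatch loop. This is a minimal sketch, not the repository's executor.py; the stand-in tool registry and the fake retriever are assumptions for illustration:

```python
from typing import Any, Callable

def fake_retrieve(question: str, k: int) -> list[str]:
    # Stand-in for a real retriever tool
    return [f"doc-{i}" for i in range(k)]

TOOLS: dict[str, Callable[..., Any]] = {
    "retrieve": fake_retrieve,
    "noop": lambda **_: None,
}

def execute(plan: dict) -> list[dict]:
    """Run only the steps the plan specifies; never replan or reinterpret."""
    trace = []
    for step in plan["steps"]:
        action = step["action"]
        if action not in TOOLS:
            # Surface the planner's bad assumption instead of working around it
            trace.append({"step_id": step["step_id"],
                          "error": f"unknown tool {action!r}"})
            continue
        result = TOOLS[action](**step.get("args", {}))
        trace.append({"step_id": step["step_id"],
                      "action": action, "result": result})
    return trace

trace = execute({"steps": [{"step_id": 1, "action": "retrieve",
                            "args": {"question": "...", "k": 2}}]})
```

Note that an unknown tool produces an error entry in the trace rather than a recovery attempt: execution failures stay visible and attributable to the planner.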
Responsibility: Orchestrate, not decide.
The runtime:
- Calls the planner
- Passes the plan to the executor
- Assembles retrieved context (if any)
- Delegates answer generation
- Writes structured traces
It is the only layer allowed to:
- serialize objects
- assemble outputs
- write logs
The runtime contains no decision logic.
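The runtime's job can be sketched as pure wiring. This hypothetical sketch (not the repository's run.py) shows a runtime that only sequences the layers and writes a trace; the lambda stand-ins for planner, executor, and generator are assumptions:

```python
import json

def run(question: str, plan_fn, execute_fn, generate_fn, log: list) -> str:
    """Orchestrate the layers; make no decisions of its own."""
    plan = plan_fn(question)               # planner: decide what should happen
    steps = execute_fn(plan)               # executor: do exactly that
    answer = generate_fn(question, steps)  # answer generation (opaque)
    log.append(json.dumps({"question": question, "plan": plan,
                           "execution": steps, "answer": answer}))
    return answer

log: list[str] = []
answer = run(
    "What is 2 + 2?",
    plan_fn=lambda q: {"steps": []},      # stand-in planner: emits a noop plan
    execute_fn=lambda p: [],              # nothing to execute
    generate_fn=lambda q, steps: "4",     # stand-in generator
    log=log,
)
```

Every branch-free line here is the point: if the runtime ever needs an `if` that inspects the question, that decision belongs in the planner instead.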
agent-planner-executor/
├── planner/
│ ├── planner.py # Pure plan generation
│ ├── plan_schema.py # Plan / PlanStep definitions
│ └── __init__.py
│
├── executor/
│ ├── executor.py # Faithful plan execution
│ └── __init__.py
│
├── runtime/
│ ├── run.py # Orchestration only
│ └── __init__.py
│
├── response/
│ └── generate.py # Answer generation (opaque)
│
├── utils/
│ └── logging_utils.py # Trace writing
│
├── logs/
│ └── traces.jsonl # Structured execution traces
│
└── main.py # Thin entrypoint
Each module exists for a single reason. No module performs work outside its assigned role.
This system was evaluated using the same question set as agent-tool-retriever.
Observed behavior:
- Final answers are identical
- Retrieval decisions are unchanged
- The planner emits `noop` for parametric questions and `retrieve` for source-dependent questions
- The executor executes plans faithfully
The only change is observability.
This demonstrates that the planner / executor split is an architectural refactor, not a behavioral modification.
Each run produces a structured trace containing:
- the user question
- the planner’s full plan
- executor step-by-step execution
- the final answer
This makes previously invisible distinctions explicit:
- parametric vs evidence-based answers
- intentional retrieval skips
- exact tool outputs used downstream
Monolithic agent loops collapse these signals. This system preserves them.
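Because traces are structured JSONL, these distinctions can be recovered mechanically. A hypothetical sketch, assuming trace records shaped like the contents listed above (the exact field names in traces.jsonl may differ):

```python
import json

# Two illustrative trace lines in a traces.jsonl-like format
records = [
    '{"question": "Capital of France?", '
    '"plan": {"steps": [{"action": "noop"}]}, "answer": "Paris"}',
    '{"question": "What BLEU score did Vaswani et al. report?", '
    '"plan": {"steps": [{"action": "retrieve"}]}, "answer": "..."}',
]

def used_retrieval(record: dict) -> bool:
    """Did the plan for this run include a retrieve step?"""
    return any(s["action"] == "retrieve" for s in record["plan"]["steps"])

parsed = [json.loads(line) for line in records]
parametric = [r["question"] for r in parsed if not used_retrieval(r)]
evidence_based = [r["question"] for r in parsed if used_retrieval(r)]
```

In a monolithic loop this classification would require re-running the agent; here it is a log query.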
This architecture is designed to expose, not mask, failures such as:
- logically valid plans that are operationally impossible
- over- or under-retrieval decisions
- incorrect planner assumptions about tools
- silent execution drift from intended plans
In a single-loop agent, these failures are indistinguishable. Here, they are isolatable.
Deliberately excluded:
- memory or persistence
- replanning or self-correction
- reflection loops
- multi-agent coordination
- tool optimization or learning
These features are unsafe to add until reasoning and execution are separable.
In real systems:
- compilers are separate from runtimes
- query planners are separate from operators
- schedulers are separate from workers
Agent systems should follow the same discipline.
This repository treats agents as a systems engineering problem, not a prompt-design exercise.
This architecture is a prerequisite for agent-memory-systems, where persistence and recall are added without collapsing reasoning, control, and execution into a single loop.
This repository is for readers who prioritize:
- debuggability over demos
- architecture over cleverness
- failure analysis over happy paths
If you want a chatbot, look elsewhere. If you want to understand how agent systems actually break, this is the right place.