LangGraph MCP Agent — Assignment 2 Solution

A LangGraph agent that interacts with a mock file system and database via MCP tools, with Tool Failure Recovery and Human-in-the-Loop (HITL) gating for destructive actions.


Architecture

User Input
    │
    ▼
┌─────────┐   tool_call / approval     ┌───────────────┐
│ Planner │ ─────────────────────────► │ Tool Executor │
│  (LLM)  │ ◄────────────────────────  │  (MCP tools)  │
└─────────┘   result / error (retry)   └───────────────┘
    │                                         ▲
    │ request_approval                        │ approved
    ▼                                         │
┌───────────┐  approved  ┌──────────────────────────┐
│ HITL Gate │ ──────────►│  (back to tool executor) │
│  (human)  │            └──────────────────────────┘
└───────────┘
    │ denied
    ▼
  [END — denied answer]

Key Design Decisions

Concern              Solution
-------              --------
Tool errors          Retry loop (max 3) with LLM reflection prompt
Destructive actions  HITL gate node — agent pauses, human approves/denies
Unknown tools        Caught at executor, returned as typed error
No API key           Stub LLM for offline/CI testing
Auditability         Every step logged in state["messages"]

Project Structure

langgraph-agent/
├── agent/
│   ├── graph.py       # LangGraph graph: nodes, edges, routing
│   ├── state.py       # Typed AgentState (TypedDict)
│   └── prompts.py     # LLM prompt templates
├── mcp_server/
│   └── mock_mcp.py    # Mock file system + DB tools
├── tests/
│   └── test_agent.py  # Pytest suite (unit + integration)
├── examples/
│   ├── input.json     # Sample user inputs
│   └── output.json    # Expected outputs with traces
├── main.py            # CLI entry point
├── requirements.txt
└── .env.example
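The graph nodes all read and write a shared AgentState defined in agent/state.py as a TypedDict. A minimal sketch of the fields implied by this README — the exact field names are assumptions, not the repo's schema — might look like:

```python
from typing import List, Optional, TypedDict

class AgentState(TypedDict, total=False):
    """Shared graph state; field names are illustrative."""
    messages: List[dict]          # full trace of user / assistant / tool messages
    status: str                   # e.g. "ok", "error", "denied"
    final_answer: Optional[str]   # populated when the graph reaches END
    retry_count: int              # incremented by the tool executor on failure
    pending_approval: bool        # set when the LLM requests HITL approval
    approval_granted: bool        # outcome of the HITL gate

# A fresh run starts with an empty trace and zero retries.
state: AgentState = {"messages": [], "retry_count": 0, "pending_approval": False}
```

Using total=False keeps optional fields (retry_count, approval_granted) out of scenarios that never touch them, which matches the "optional" fields described for output.json below.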

Setup

# 1. Clone / enter the repo
cd langgraph-agent

# 2. Create a virtual environment
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment
cp .env.example .env
# Edit .env — set OPENAI_API_KEY (or leave as sk-fake for stub mode)

Running the Agent

# Happy path — read an existing file
python main.py "show me the q1 sales report"

# Retry loop — wrong path triggers reflection + search fallback
python main.py "read /wrong/path/sales.csv"

# HITL gate — destructive action pauses for human approval
python main.py "delete the q1 sales file"

# Database query
python main.py "show me all users in the database"

Example inputs and outputs

The examples/ folder contains sample payloads and expected response shapes:

File                  Description
----                  -----------
examples/input.json   Array of sample user prompts (strings) you can pass to the agent.
examples/output.json  Example outputs with traces: status, final_answer, and a messages_trace for each scenario (happy path, retry/recovery, HITL deny, DB query).

Try an input from the file:

# Using the first example input
python main.py "show me the q1 sales report"

# Or run each line from input.json (e.g. in a script)
# inputs: "show me the q1 sales report", "read /wrong/path/sales.csv", "delete the q1 sales file", "show me all users in the database"

The structure of the agent’s final state (what you see in output.json under each example) is: status, final_answer, optional retry_count / approval_granted, and messages (full trace of user, assistant, and tool messages).
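For illustration, a final state for the HITL-deny scenario might have the following shape — the values here are invented to show the structure, not copied from output.json:

```python
# Illustrative final-state shape for a denied destructive action.
# Field values are made up; only the keys mirror the description above.
denied_run = {
    "status": "denied",
    "final_answer": "Deletion was not approved by the human reviewer.",
    "approval_granted": False,
    "messages": [
        {"role": "user", "content": "delete the q1 sales file"},
        {"role": "assistant", "content": "request_approval: delete_file"},
        {"role": "tool", "content": "HITL gate: denied"},
    ],
}
```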


Running Tests

# Run all tests (no API key needed — uses stub LLM)
pytest tests/ -v

# Run only MCP unit tests
pytest tests/test_agent.py::TestMockMCP -v

# Run only integration tests
pytest tests/test_agent.py::TestAgentIntegration -v

Feature Deep-Dives

1. Tool Failure Recovery (Retry + Reflection Loop)

When a tool call fails, the agent doesn't give up immediately:

  1. tool_executor_node captures the error and increments retry_count
  2. Router checks: if retry_count < MAX_RETRIES (3), sends back to planner
  3. planner_node uses REFLECTION_PROMPT — tells the LLM what failed and asks it to reason about an alternative
  4. LLM typically recovers by calling search_files after a FileNotFoundError
  5. After MAX_RETRIES, error_handler_node produces a graceful failure message

Example recovery trace:

read_file("/wrong/path") → FileNotFoundError
    → LLM reflects: "path was wrong, try searching"
    → search_files("sales") → ["/data/reports/q1_sales.csv", ...]
    → final_answer: "Found these files instead: ..."
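The routing in step 2 can be sketched as a conditional-edge function. The function name, message shape, and state keys below are assumptions for illustration, not the repo's exact code:

```python
MAX_RETRIES = 3  # matches the hardcoded constant noted under Tradeoffs

def route_after_tool(state: dict) -> str:
    """Pick the next node after a tool call, based on the last tool result."""
    last = state["messages"][-1]
    if last.get("error") is None:
        return "planner"            # success: LLM continues planning normally
    if state.get("retry_count", 0) < MAX_RETRIES:
        return "planner"            # failure: planner re-enters with REFLECTION_PROMPT
    return "error_handler"          # retries exhausted: graceful failure message

# A FileNotFoundError on the first attempt routes back to the planner to reflect.
first_failure = {"messages": [{"error": "FileNotFoundError"}], "retry_count": 1}
```

On both branches back to "planner" the node is the same; what differs is that after an error the planner is invoked with the reflection prompt describing what failed.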

2. Human-in-the-Loop (HITL) Gate

Destructive tools (delete_file, update_db, delete_db_record) never run automatically:

  1. LLM uses "action": "request_approval" instead of "action": "tool_call"
  2. planner_node detects pending_approval=True and routes to hitl_gate_node
  3. hitl_gate_node prints the action details and waits for human input
  4. Approved → routes back to tool_executor_node to execute
  5. Denied → sets final_answer with denial reason, routes to END

For CI/automated tests, set HITL_AUTO_APPROVE=true or HITL_AUTO_DENY=true.
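A minimal sketch of the gate's decision logic, including the CI override flags — the node's internals here are assumed, not taken from the repo:

```python
import os

def hitl_decision(action: str, args: dict) -> bool:
    """Return True to approve the pending destructive action, False to deny.

    HITL_AUTO_APPROVE / HITL_AUTO_DENY short-circuit the interactive prompt
    so CI and automated tests never block on stdin.
    """
    if os.environ.get("HITL_AUTO_APPROVE") == "true":
        return True
    if os.environ.get("HITL_AUTO_DENY") == "true":
        return False
    # Interactive path: blocking stdin prompt, default deny.
    answer = input(f"Approve {action}({args})? [y/N] ")
    return answer.strip().lower() == "y"
```

Checking the approve flag before the deny flag is arbitrary; in CI you would set exactly one of the two.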


Tradeoffs & Shortcuts

  • Stub LLM: The _stub_llm function uses keyword matching for offline testing. With a real OPENAI_API_KEY set, the GPT-4o-mini model is used instead.
  • In-memory state: The mock file system and DB are module-level dicts — they reset between process runs but are shared within a run. A production version would use persistent storage.
  • Synchronous HITL: The approval prompt is blocking stdin. Production would use an async approval queue (e.g., Slack bot, web UI, email).
  • Single-agent: This is a single-LLM agent. A production system might use specialized sub-agents per tool domain.
  • Max retries = 3: Hardcoded constant; should be configurable via environment variable.
