aunraza19 · ibadurrehmandg · Feb 7, 2026 · Feb 8, 2026 · Feb 8, 2026 · Copilot
diff --git a/README.md b/README.md
@@ -0,0 +1,152 @@
+# SpecGap Documentation
+
+## Project Overview
+SpecGap is a two-part application:
+- Frontend: A React + Vite UI that lets users upload documents, run audits, and review findings.
+- Backend: A FastAPI service that parses documents, runs multi-agent analysis with Gemini via LangGraph, and returns structured audit results and patch packs.
+
+The backend supports three modes:
+- Council session (fast flashcards)
+- Deep analysis (tech + legal + synthesis)
+- Full spectrum (council + deep analysis together)
+
+## Folder Structure
+- Frontend/
+  - Vite + React UI, API client, pages, layout components, and UI primitives.
+- specgap/
+  - Python backend (FastAPI), AI workflows, parsers, and database layer.
+- and prompt templates
+  - Miscellaneous folder (not referenced in code paths).
+- test.json
+  - Standalone file (not referenced in code paths).
+
+### Frontend Highlights
+- Frontend/src/App.tsx: Route setup and application shell.
+- Frontend/src/api/client.ts: API client for backend calls.
+- Frontend/src/pages/: Core screens (Upload, Audits, Results, Search, etc.).
+- Frontend/src/components/: Layout, audit UI, and reusable UI components.
+
+### Backend Highlights
+- specgap/app/main.py: FastAPI app and API endpoints.
+- specgap/app/services/workflow.py: Council multi-agent workflow (LangGraph).
+- specgap/app/services/parser.py: Document parsing (PDF, DOCX, TXT/MD, OCR).
+- specgap/app/services/tech_engine.py: Tech gap analyzer.
+- specgap/app/services/biz_engine.py: Legal/negotiation analyzer.
+- specgap/app/services/cross_check.py: Orchestrator synthesis.
+- specgap/app/services/patch_pack.py: Output file generation.
+- specgap/app/core/database.py: SQLAlchemy models + persistence.
+
+## Architecture and Data Flow
+1. Frontend upload (React UI) sends files via multipart form-data to FastAPI.
+2. Parser extracts text from PDF/DOCX/TXT/MD; OCR is attempted if needed.
+3. Council session (LangGraph):
+   - Round 1: Independent agent drafts (legal, business, finance).
+   - Round 2: Cross-check peer drafts.
+   - Round 3: Generate flashcards.
+4. Deep analysis (optional):
+   - Tech gap analysis (architect agent).
+   - Legal leverage analysis (lawyer agent).
+   - Cross-check synthesis + Mermaid diagram output.
+5. Patch pack can be generated from selected cards (contract addendum, spec update, negotiation email).
+
+## Installation and Setup
+
+### Backend (Python)
+Requirements are listed in specgap/requirements.txt.
+
+```bash
+cd specgap
+python -m venv .venv
+source .venv/bin/activate  # Windows (PowerShell): .venv\Scripts\Activate.ps1
+pip install -r requirements.txt
+```
+
+### Frontend (Node)
+Dependencies are managed via npm in Frontend/package.json.
+
+```bash
+cd Frontend
+npm install
+```
+
+## Environment Variables
+
+### Backend
+Loaded via python-dotenv in specgap/app/core/config.py.
+
+- GEMINI_API_KEY (required): Google Gemini API key.
+- DATABASE_URL (optional): Overrides SQLite DB path.
+
+Example .env:
+```
+GEMINI_API_KEY=your_key_here
+DATABASE_URL=sqlite:///./specgap_audits.db
+```
+
+### Frontend
+Defined in Vite and read in Frontend/src/api/client.ts.
+
+- VITE_API_URL (optional): Base API URL. Defaults to /api, which the dev server proxies to http://localhost:8000 (configured in Frontend/vite.config.ts).
+
+Example .env:
+```
+VITE_API_URL=http://localhost:8000
+```
+
+## How to Run Locally
+
+### Start Backend
+```bash
+cd specgap
+python run_backend.py
+```
+Default: http://localhost:8000
+
+### Start Frontend
+```bash
+cd Frontend
+npm run dev
+```
+Default: http://localhost:8080
+
+The dev server proxies /api to http://localhost:8000 automatically.
+
+## API Endpoints
+
+Implemented in specgap/app/main.py:
+
+### Health
+- GET /
+  - Returns status and architecture info.
+
+### Council Session
+- POST /audit/council-session
+  - Query: domain (optional, default Software Engineering)
+  - Body: multipart form-data with files
+  - Response: flashcards (council_verdict)
+
+### Patch Pack Generator
+- POST /audit/patch-pack
+  - Body: JSON { selected_cards: [...], domain?: string }
+  - Response: generated files (Contract_Addendum.txt, Spec_Update.md, Negotiation_Email.txt)
+
+### Deep Analysis
+- POST /audit/deep-analysis
+  - Query: domain
+  - Body: multipart form-data with files
+  - Response: tech_audit, legal_audit, executive_synthesis
+
+### Full Spectrum Analysis
+- POST /audit/full-spectrum
+  - Query: domain
+  - Body: multipart form-data with files
+  - Response: council verdict + deep analysis bundle
+
+Note: The frontend client references additional endpoints (audits listing, comments, vector search) in Frontend/src/api/client.ts, but those routes are not present in the backend at this time.
+
+## Contribution Guidelines
+- Keep frontend code in Frontend/src/ with TypeScript, React, and Tailwind conventions.
+- Keep backend code in specgap/app/ and follow async FastAPI patterns.
+- Place new endpoints and services in clearly named modules under specgap/app/services/.
+- Update environment variable docs whenever introducing new config keys.
+- Add unit tests where possible (frontend uses Vitest; backend currently has no test harness).
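Review note (README, API Endpoints): the endpoint list gives the body shape for POST /audit/patch-pack but no client example. Below is a minimal sketch, using only the standard library, of how such a request might be built. The helper name `build_patch_pack_request` and the fields inside each card are assumptions for illustration; only the path, the `selected_cards` and optional `domain` keys, and the default backend URL come from the README. The request is built but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # backend default stated in the README


def build_patch_pack_request(selected_cards, domain=None):
    """Build (but do not send) a POST /audit/patch-pack request.

    Hypothetical helper: the body shape {selected_cards, domain?} follows the
    README; the fields inside each card are illustrative assumptions.
    """
    body = {"selected_cards": selected_cards}
    if domain is not None:  # domain is optional per the README
        body["domain"] = domain
    return Request(
        f"{BASE_URL}/audit/patch-pack",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_patch_pack_request(
    [{"id": "card-1", "title": "Uncapped indemnity", "severity": "Critical"}],
    domain="Software Engineering",
)
print(req.get_method(), req.full_url)
```

With the backend running, passing `req` to `urllib.request.urlopen` would send it; the response format is whatever specgap/app/main.py returns and is not assumed here.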
diff --git a/specgap/app/core/prompts.py b/specgap/app/core/prompts.py
@@ -1,3 +1,15 @@
+COUNCIL_PERSONAS = {
+    "legal": {
+        "role": "Corporate General Counsel",
+        "focus": "Liability, IP, termination, hidden contract traps",
+    },
+    "business": {
+        "role": "Chief Operating Officer (COO)",
+        "focus": "Feature completeness, operational viability, timeline realism",
+    },
+    "finance": {
+        "role": "CFO & Audit Partner",
+        "focus": "Costs, payment terms, ROI, financial risks",
 """
 Council Prompts for SpecGap
 Defines personas and prompt templates for the 3-round deliberation.
@@ -23,6 +35,26 @@
 
 PROMPT_TEMPLATES = {
     "ROUND_1": """
+Role: {role}
+Domain: {domain}
+Task: Identify Risks/Gaps in the provided documents (Contract + Tech Spec).
+Focus: {focus}
+Output: JSON only
+Instructions:
+- Cite exact text for every finding.
+- Classify gaps as Critical / High / Medium / Low
+- Optional: Include "suggested_fix" if obvious.
+Format:
+{{
+  "findings": [
+    {{
+      "title": "...",
+      "description": "...",
+      "severity": "Critical|High|Medium|Low",
+      "source": "File Name / Section",
+      "suggested_fix": "..."
+    }}
+  ]
 You are acting as: {role}
 Domain Context: {domain}
 
@@ -55,6 +87,55 @@
 """,
 
     "ROUND_2": """
+Role: {role}
+Domain: {domain}
+Task: Update your findings using peer feedback.
+[Your Draft]: {current_draft}
+[Peers Drafts]: {peer_drafts}
+Output: JSON only
+Instructions:
+- Merge missing findings from peers
+- Resolve contradictions: keep the one with higher severity
+- Retain source references
+Format same as ROUND_1
+""",
+
+    "ROUND_3": """
+Role: {role}
+Domain: {domain}
+Task: Convert findings into actionable Flashcards.
+[Analysis]: {current_draft}
+[Peer Insights]: {peer_drafts}
+Output: JSON only
+Instructions:
+- Produce 3 to 5 flashcards per persona
+- Provide:
+    - id: unique identifier
+    - card_type: "Risk" | "Opportunity"
+    - title: short headline
+    - description: concise explanation (1-2 sentences)
+    - fix_action: what user should do
+    - severity: Critical / High / Medium / Low
+    - impact: High / Medium / Low (for prioritization)
+    - swipe_right_payload: exact text/action if user accepts
+- Do not add extra text or commentary
+Format:
+{{
+  "flashcards": [
+    {{
+      "id": "...",
+      "card_type": "...",
+      "title": "...",
+      "description": "...",
+      "fix_action": "...",
+      "severity": "...",
+      "impact": "...",
+      "swipe_right_payload": "..."
+    }}
+  ]
+}}
+"""
+}
 You are acting as: {role}
 Domain Context: {domain}
 

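Review note (prompts.py): the templates mix `{role}`, `{domain}` and `{focus}` placeholders with doubled braces around the JSON format block, which suggests they are meant to be filled with `str.format`. A minimal sketch of that rendering step follows, assuming that is indeed how the workflow consumes them. `render_round_1` and `ROUND_1_SKETCH` are hypothetical names, and the template is a shortened stand-in rather than the real `PROMPT_TEMPLATES["ROUND_1"]`. The sketch uses its own complete copy of one persona because the hunk, as shown, ends before the closing braces of `COUNCIL_PERSONAS`.

```python
# Complete, closed copy of one persona (values taken from the diff above).
COUNCIL_PERSONAS = {
    "legal": {
        "role": "Corporate General Counsel",
        "focus": "Liability, IP, termination, hidden contract traps",
    },
}

# Shortened stand-in for the ROUND_1 template. Doubled braces are str.format
# escapes and are rendered as literal { and }.
ROUND_1_SKETCH = """
Role: {role}
Domain: {domain}
Focus: {focus}
Output: JSON only
Format:
{{
  "findings": []
}}
"""


def render_round_1(persona_key: str, domain: str) -> str:
    """Hypothetical helper: fill the template for one council persona."""
    persona = COUNCIL_PERSONAS[persona_key]
    return ROUND_1_SKETCH.format(
        role=persona["role"], focus=persona["focus"], domain=domain
    )


prompt = render_round_1("legal", "Software Engineering")
print(prompt)
```

If any template is ever passed through `str.format` with single braces in its JSON example, the call raises `KeyError` or `ValueError`, so the doubled braces in ROUND_1 and ROUND_3 need to stay doubled.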
diff --git a/specgap/app/services/biz_engine.py b/specgap/app/services/biz_engine.py
@@ -4,6 +4,124 @@
 """
 
 import json
+from datetime import datetime
+from typing import Dict, Any, List
+from app.core.config import model_text
+
+# -----------------------------
+# Schema guard
+# -----------------------------
+REQUIRED_KEYS = {
+    "leverage_score": int,
+    "favor_direction": str,
+    "trap_clauses": list,
+    "negotiation_tips": list
+}
+
+def log_step(step: str):
+    print(f"[{datetime.now().isoformat()}] {step}")
+
+def validate_and_fix(output: dict) -> dict:
+    """Ensure required keys exist and values are valid."""
+    fixed = {}
+    for key, key_type in REQUIRED_KEYS.items():
+        if key not in output:
+            fixed[key] = [] if key_type == list else None
+        else:
+            fixed[key] = output[key]
+
+    # Clamp leverage score
+    if isinstance(fixed["leverage_score"], int):
+        fixed["leverage_score"] = max(0, min(100, fixed["leverage_score"]))
+
+    # Normalize favor direction
+    if fixed["favor_direction"] not in ["Vendor", "Client", "Neutral"]:
+        fixed["favor_direction"] = "Neutral"
+
+    return fixed
+
+def chunk_text(text: str, max_len: int = 40000) -> List[str]:
+    """Split very large proposals into manageable chunks."""
+    return [text[i:i+max_len] for i in range(0, len(text), max_len)]
+
+# -----------------------------
+# Main Function
+# -----------------------------
+async def analyze_proposal_leverage(proposal_text: str, retries: int = 2) -> Dict[str, Any]:
+    """
+    Legal Audit / Negotiation Agent:
+    Detect leverage, hidden risks, and negotiation tips.
+    Handles large proposals, JSON drift, and retry on failure.
+    """
+    log_step("Preparing system prompt for Legal Audit")
+
+    system_prompt = """
+You are SpecGap, a ruthless corporate lawyer.
+
+TASK:
+Audit provided business documents (may contain multiple files).
+
+GOALS:
+1. Check if Proposal meets Requirements.
+2. Score leverage (0–100).
+3. Detect hidden or dangerous clauses.
+4. Provide exact redline text for High or Critical risks.
+
+RULES:
+- Cite exact clause text.
+- Do not invent clauses.
+- If no risks exist, return empty arrays.
+- Redline text must be legally enforceable.
+- This is a hypothetical risk analysis, not legal advice.
+
+SEVERITY RUBRIC:
+Critical = unlimited liability, IP ownership transfer, uncapped indemnity
+High = asymmetric termination, vague scope, jurisdiction mismatch
+Medium = missing SLAs, unclear payments
+Low = ambiguity only
+
+OUTPUT JSON ONLY:
+{
+  "leverage_score": 0-100,
+  "favor_direction": "Vendor|Client|Neutral",
+  "trap_clauses": [...],
+  "negotiation_tips": ["..."]
+}
+"""
+
+    # Chunk if text is too long
+    chunks = chunk_text(proposal_text)
+
+    # Combine prompt + chunks; empty input yields no chunks, so fall back to the bare prompt
+    prompts = [f"{system_prompt}\n\n--- DOCUMENTS (chunk {i+1}) ---\n{chunk}"
+               for i, chunk in enumerate(chunks)]
+    full_prompt = "\n".join(prompts) if prompts else system_prompt
+
+    attempt = 0
+    while attempt <= retries:
+        try:
+            log_step(f"Calling model_text.generate_content_async (attempt {attempt+1})")
+            response = await model_text.generate_content_async(full_prompt)
+
+            cleaned = response.text.strip()
+            if cleaned.startswith("```"):
+                cleaned = cleaned.split("```")[1]
-                cleaned = cleaned.split("```")[1]
+                # Remove opening fence and optional language identifier (e.g. ```json)
+                if cleaned.startswith("```json"):
+                    cleaned = cleaned[len("```json"):].strip()
+                else:
+                    cleaned = cleaned[3:].strip()
+                # Remove trailing fence if present
+                if cleaned.endswith("```"):
+                    cleaned = cleaned[:-3].strip()
+
+            parsed = json.loads(cleaned)
+            return validate_and_fix(parsed)
+
+        except json.JSONDecodeError:
+            log_step("JSON parse failed, returning raw output snippet")
+            return {
+                "error": "Model output was not valid JSON",
+                "raw_output": response.text[:1500]
+            }
-            log_step("JSON parse failed, returning raw output snippet")
-            return {
-                "error": "Model output was not valid JSON",
-                "raw_output": response.text[:1500]
-            }
+            log_step(f"JSON parse failed on attempt {attempt+1}")
+            attempt += 1
+            if attempt > retries:
+                log_step("Max retries reached after JSON parse failures, returning raw output snippet")
+                return {
+                    "error": "Model output was not valid JSON",
+                    "raw_output": response.text[:1500]
+                }
+
+        except Exception as e:
+            log_step(f"Attempt {attempt+1} failed: {e}")
+            attempt += 1
+
+    return {"error": "Proposal leverage analysis failed after retries"}
 import asyncio
 from typing import Dict, Any
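Review note (biz_engine.py, fence handling): when the model wraps its answer in a fence tagged `json`, splitting on the fence and taking element 1 leaves the `json` tag at the start of the payload, so `json.loads` fails. The suggested change above handles that case. The sketch below shows one possible way to isolate the cleanup in a small helper that can be tested without calling the model. `strip_code_fence` and `parse_model_json` are hypothetical names, not functions from the PR, and the fallback to the outermost braces is an added assumption about what is worth recovering.

```python
import json
from typing import Any, Dict, Optional

FENCE = "`" * 3  # a Markdown code fence marker


def strip_code_fence(text: str) -> str:
    """Remove a leading fence line (with optional language tag) and a trailing fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        newline = cleaned.find("\n")
        # Drop the whole opening line; if there is none, drop just the marker.
        cleaned = cleaned[newline + 1:] if newline != -1 else cleaned[len(FENCE):]
    if cleaned.endswith(FENCE):
        cleaned = cleaned[:-len(FENCE)]
    return cleaned.strip()


def parse_model_json(text: str) -> Optional[Dict[str, Any]]:
    """Return the parsed JSON object, or None if no object can be recovered."""
    candidate = strip_code_fence(text)
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError:
        # Fallback: try the span from the first "{" to the last "}".
        start, end = candidate.find("{"), candidate.rfind("}")
        if start == -1 or end <= start:
            return None
        try:
            parsed = json.loads(candidate[start:end + 1])
        except json.JSONDecodeError:
            return None
    # validate_and_fix indexes by key, so anything but a dict is rejected here.
    return parsed if isinstance(parsed, dict) else None


reply = FENCE + 'json\n{"leverage_score": 70}\n' + FENCE
print(parse_model_json(reply))
```

Returning `None` instead of raising would let the retry loop treat a bad parse like any other failed attempt, which keeps the two `except` branches from diverging.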