[{"body":"## Description\nHealth score panel derived from graph data \u2014 not file scanning like harness-engineering repo, but dependency-aware metrics from the context graph. Shows codebase readiness as a 0-100 composite score with 6 individual metrics. Includes delta tracking via localStorage.\n\nInspired by: github.com/jrenaldi79/harness-engineering (readiness analysis), but our approach is graph-based, not file-based.\n\n## Files\n- **Created:** `src/components/ReadinessPanel.tsx`\n\n## Metrics (computed in graphStore.computeReadiness)\n| Metric | Weight | What |\n|---|---|---|\n| Coverage | 30% | % of files with at least 1 relation |\n| Test Coupling | 25% | % of source files with a tested_by relation |\n| Doc Coupling | 15% | % of directories with a documents relation |\n| Circular Deps | 15% | Count of import cycles (fewer = better) |\n| Hub Concentration | 10% | Max fan-in as % of total nodes (lower = better) |\n| Orphan Files | 5% | Files with 0 relations (fewer = better) |\n\n## UI\n- Score 0-100 with color coding (0-40 red, 41-70 yellow, 71-100 green)\n- Progress bars for each metric\n- Delta arrow (↑/↓) when previous score exists in localStorage\n- Expandable details: orphan file list, circular dep chains, top hub files\n- Collapsible panel below/beside GraphView\n\n## Acceptance Criteria\n- [ ] Panel renders below graph when scan data exists\n- [ ] Score 0-100 with correct color coding\n- [ ] All 6 metrics display with progress bars\n- [ ] Delta shown when previous score exists in localStorage\n- [ ] Details section expands to show specific files/issues\n- [ ] Panel hidden when no graph data\n- [ ] computeReadiness() produces correct values for sample data\n\n## Dependencies\n- Blocked by: #125 (F4: GraphView v2 \u2014 needs layout), #121 (F1: store)\n- Blocks: none\n\n## Tests Required\n- ReadinessPanel renders with sample metrics\n- Score color thresholds correct\n- Delta tracking stores/retrieves from localStorage\n- computeReadiness algorithm 
validation with known inputs","external_id":"I_kwDORY874M716Ekb","labels":["enhancement","graph-v3"],"number":126,"title":"[Graph v3] F5: Readiness Score Panel","url":"https://github.com/VictorGjn/modular-patchbay/issues/126"},{"body":"## Description\nMajor UX upgrade to the existing GraphView component: directory clustering, keyboard navigation, right-click context menu, query highlight mode, and performance optimization for large graphs.\n\n## Files\n- **Modified:** `src/components/GraphView.tsx`\n\n## Sub-features\n\n### F4a: Directory Clustering\n- Group nodes by parent directory using d3-force cluster or manual grouping force\n- Subtle background hull per cluster, label at center top\n\n### F4b: Keyboard Navigation\n- Tab/Shift+Tab to cycle nodes, Enter to select, Escape to deselect\n- Arrow keys to navigate between connected nodes (follow edges)\n- `?` to show keyboard shortcut overlay\n\n### F4c: Right-Click Context Menu\n- 'Show neighbors only' \u2014 filter to 1-hop connections\n- 'Highlight path to...' 
\u2014 select target, highlight shortest path\n- 'Copy path' \u2014 copy file path to clipboard\n- Simple positioned div (no library)\n\n### F4d: Query Highlight Mode\n- Text input: 'Simulate query...'\n- Debounced (300ms) graphStore.query() as you type\n- Depth-colored glow: green (Full), yellow (Summary), dim red (Mention)\n- Non-matching nodes fade to 20% opacity\n- Entry points shown as diamond shapes\n\n### F4e: Performance (1000+ nodes)\n- WebGL renderer when node count \u003e 500\n- Disable labels when zoom \u003c 0.3\n- Throttle hover events\n- Skip force simulation for \u003e 2000 nodes\n\n## Acceptance Criteria\n- [ ] Nodes cluster visually by directory\n- [ ] Keyboard navigation works (Tab, Enter, Escape, arrows)\n- [ ] Right-click shows context menu with 3 options\n- [ ] Query highlight mode shows depth-colored nodes with fading\n- [ ] Graph renders smoothly with 500+ nodes\n- [ ] All existing functionality preserved (filters, search, detail panel, legend)\n\n## Dependencies\n- Blocked by: #123 (F2: KnowledgeTab \u2014 needs to be mounted first)\n- Blocks: F5\n\n## Tests Required\n- Keyboard event handlers fire correctly\n- Context menu renders on right-click\n- Query highlight mode sets correct node colors\n- Performance: render 1000 nodes without frame drops","external_id":"I_kwDORY874M716ENw","labels":["enhancement","graph-v3"],"number":125,"title":"[Graph v3] F4: GraphView v2 \u2014 UX Improvements","url":"https://github.com/VictorGjn/modular-patchbay/issues/125"},{"body":"## Description\nWire the existing ContextTrace component into the TestTab context inspector. Shows which files were selected by the graph engine and why \u2014 entry points, traversal path, depth allocation, token budget breakdown. 
Appears as a collapsible section only when graph query data exists.\n\n## Files\n- **Modified:** `src/tabs/TestTab.tsx` \u2014 add collapsible ContextTrace section\n- **Modified:** `src/components/test/ContextInspector.tsx` \u2014 wire trace data\n\n## Acceptance Criteria\n- [ ] ContextTrace appears in TestTab when graphStore.lastQueryResult exists\n- [ ] Shows entry points with confidence scores\n- [ ] Shows packed files with depth badges (Full/Detail/Summary/Headlines/Mention) and token bars\n- [ ] Budget utilization bar renders correctly with color coding\n- [ ] Section collapses/expands with chevron toggle\n- [ ] When no graph data, section is completely hidden (not empty)\n- [ ] Adapter correctly maps API response to PackedItem types\n\n## Dependencies\n- Blocked by: #121 (F1: Graph Store)\n- Blocks: none\n\n## Tests Required\n- ContextTrace renders with sample query result\n- ContextTrace hidden when no graph data\n- Collapse/expand toggle works\n- Adapter mapping produces correct PackedItem shape","external_id":"I_kwDORY874M716D8h","labels":["enhancement","graph-v3"],"number":124,"title":"[Graph v3] F3: TestTab ContextTrace Integration","url":"https://github.com/VictorGjn/modular-patchbay/issues/124"},{"body":"## Description\nAdd 'Graph' as the 4th tab in KnowledgeTab alongside Local Files, Git Repos, and Connectors. 
Lazy-load GraphView via React.lazy to avoid bloating initial page load (react-force-graph-2d is ~150KB).\n\n## Files\n- **Created:** `src/panels/knowledge/GraphPanel.tsx` \u2014 connects graphStore to GraphView\n- **Modified:** `src/tabs/KnowledgeTab.tsx` \u2014 add 4th tab type + rendering\n\n## Acceptance Criteria\n- [ ] 'Graph' tab visible in KnowledgeTab with Network icon from lucide-react\n- [ ] Clicking 'Graph' tab lazy-loads GraphView (React.lazy + Suspense)\n- [ ] 'Re-index' button triggers scan with active workspace/repo path\n- [ ] Stats bar shows file count, relation count, scan duration\n- [ ] Selecting a node in graph updates GraphDetails panel\n- [ ] Filters (Code/Tests/Markdown/Cross-type) work correctly\n- [ ] Search filters nodes by symbol name\n- [ ] Loading fallback shown during lazy load\n\n## Dependencies\n- Blocked by: #121 (F1: Graph Store)\n- Blocks: F4, F5\n\n## Tests Required\n- GraphPanel renders with empty store\n- GraphPanel renders after scan\n- Tab switching lazy-loads correctly","external_id":"I_kwDORY874M716Dwu","labels":["enhancement","graph-v3"],"number":123,"title":"[Graph v3] F2: KnowledgeTab Graph Integration","url":"https://github.com/VictorGjn/modular-patchbay/issues/123"},{"body":"## Description\nComplete the relation extraction pipeline with two missing extractors: YAML/JSON (configured_by, schema relations) and cross-type (documents relations between markdown and code files).\n\n## Files\n- **Created:** `src/graph/extractors/yaml.ts`\n- **Created:** `src/graph/extractors/cross-type.ts`\n- **Created:** `src/graph/__tests__/extractors.test.ts`\n- **Modified:** `src/graph/scanner.ts` \u2014 wire new extractors into fullScan/updateFiles\n- **Modified:** `src/graph/index.ts` \u2014 export new extractors\n\n## Acceptance Criteria\n- [ ] yaml.ts: detects `configured_by` relations from path-like values in YAML/JSON\n- [ ] yaml.ts: detects JSON Schema \\\\\\ as `schema` relation\n- [ ] cross-type.ts: README.md produces 
`documents` relations to sibling code files\n- [ ] cross-type.ts: markdown backtick mentions (`filename.ts`) create `documents` relations\n- [ ] New extractors run during scan without performance regression (\u003c10% slower)\n- [ ] No false positives for common words\n- [ ] All existing tests still pass\n\n## Dependencies\n- Blocked by: none (parallel with F1)\n- Blocks: F5 (readiness metrics benefit from richer graph)\n\n## Tests Required\n- yaml.ts: tsconfig.json paths, package.json entries, JSON Schema \\\n- cross-type.ts: README → code files, backtick mentions, no false positives","external_id":"I_kwDORY874M716DkF","labels":["enhancement","graph-v3"],"number":122,"title":"[Graph v3] F6: Missing Extractors (YAML + Cross-type)","url":"https://github.com/VictorGjn/modular-patchbay/issues/122"},{"body":"## Description\nState management store for the Context Graph frontend. Manages nodes, relations, scan/query state, readiness metrics, and UI state (selection, highlights). Calls server API endpoints.\n\n## Files\n- **Created:** `src/store/graphStore.ts`\n- **Created:** `src/store/__tests__/graphStore.test.ts`\n\n## Acceptance Criteria\n- [ ] `scan(rootPath)` calls POST /api/graph/scan, populates nodes + relations + stats\n- [ ] `query(text, budget)` calls POST /api/graph/query, populates lastQueryResult + highlightIds\n- [ ] `fetchStatus()` calls GET /api/graph/status for lightweight stats\n- [ ] `computeReadiness()` derives 6 metrics from in-memory data (no API call)\n- [ ] Loading flags (scanning, querying) toggle correctly during async ops\n- [ ] Error states set and clear correctly\n- [ ] persist middleware stores lastScanTime + rootPath in localStorage\n\n## Dependencies\n- Blocked by: none\n- Blocks: F2, F3, F5\n\n## Tests Required\n- scan() success + failure paths\n- query() populates results and highlights\n- computeReadiness() with sample data\n- Loading/error state 
management","external_id":"I_kwDORY874M716Dar","labels":["enhancement","graph-v3"],"number":121,"title":"[Graph v3] F1: Graph Store (Zustand)","url":"https://github.com/VictorGjn/modular-patchbay/issues/121"},{"body":"## Wave 3\nAuth: API key. Fetch: GET /v1/deals. Params: pipelineId, status. toMarkdown: ## {title} + value + stage + owner. Rate: 80 req/2s.\nFiles: server/routes/connectors/pipedrive.ts","external_id":"I_kwDORY874M706WLQ","labels":["enhancement"],"number":108,"title":"Connector: Pipedrive","url":"https://github.com/VictorGjn/modular-patchbay/issues/108"},{"body":"## Wave 3\nAuth: OAuth (connected app). Fetch: SOQL via /services/data/v59.0/query. Params: query (SOQL), objectType. toMarkdown: record fields as table. Rate: 15000 calls/day.\nFiles: server/routes/connectors/salesforce.ts","external_id":"I_kwDORY874M706WJH","labels":["enhancement"],"number":107,"title":"Connector: Salesforce","url":"https://github.com/VictorGjn/modular-patchbay/issues/107"},{"body":"## Wave 3\nAuth: IMAP credentials (host, port, user, password). Fetch: IMAP SEARCH + FETCH via imapflow. Params: folder, search, limit. toMarkdown: ## {subject} + from + date + body. Works with any email provider.\nFiles: server/routes/connectors/email-imap.ts\nDep: imapflow","external_id":"I_kwDORY874M706WHQ","labels":["enhancement"],"number":106,"title":"Connector: Email (IMAP)","url":"https://github.com/VictorGjn/modular-patchbay/issues/106"},{"body":"## Wave 3\nAuth: Personal access token. Fetch: /api/1.0/projects/{id}/tasks. Params: projectId, completed. toMarkdown: ## {name} + notes + status. Rate: 1500 req/min.\nFiles: server/routes/connectors/asana.ts","external_id":"I_kwDORY874M706WBt","labels":["enhancement"],"number":103,"title":"Connector: Asana","url":"https://github.com/VictorGjn/modular-patchbay/issues/103"},{"body":"## Wave 3\nAuth: Personal access token. Fetch: /api/v4/projects/{id}/issues + /merge_requests. Params: projectId, state. toMarkdown: same as GitHub pattern. 
Rate: 2000 req/min.\nFiles: server/routes/connectors/gitlab.ts","external_id":"I_kwDORY874M706V-O","labels":["enhancement"],"number":102,"title":"Connector: GitLab Issues \u0026 MRs","url":"https://github.com/VictorGjn/modular-patchbay/issues/102"},{"body":"## Summary\n\nAdd ablation testing to the Qualification pipeline: measure the **quantified impact** of each knowledge source on agent performance by running eval suites with and without specific sources, then computing a delta score.\n\n## Problem\n\nRight now, Qualification (#29) answers: *\"Does this agent pass its test suite?\"*\n\nIt doesn't answer: *\"Which knowledge sources actually make it better?\"*\n\nUsers add knowledge sources (docs, guidelines, API refs, past decisions) but have no way to know:\n- Which sources are pulling their weight\n- Which are dead weight (or even hurting performance via noise)\n- Whether a source is still relevant after the model improved\n- What the marginal value of each source is in tokens-per-quality-point\n\nThis is the same problem Anthropic is tackling with their Skills 2.0 A/B testing feature (compare skill-augmented vs raw Claude). But their version is binary (skill on/off). Ours can be granular \u2014 per-source, per-branch of the knowledge tree.\n\n## Proposed Solution\n\n### Core Concept: Ablation Runs\n\nAn **ablation run** executes the same qualification suite multiple times with controlled variations in context:\n\n| Run | Context Configuration | Purpose |\n|-----|----------------------|---------|\n| Baseline | All sources enabled | Current agent performance |\n| Ablation-N | All sources EXCEPT source N | Measure source N's contribution |\n| Minimal | No knowledge sources (system prompt only) | Floor \u2014 what does the raw model do? 
|\n| Targeted | Only source N enabled | Ceiling \u2014 source N in isolation |\n\n### Output: Source Impact Report\n\nFor each knowledge source, compute:\n\n```\nSource: \"API Reference v3\"\n  Baseline score:        87/100\n  Without this source:   71/100  \n  Delta:                 +16 points (HIGH IMPACT)\n  Token cost:            2,400 tokens\n  Efficiency:            6.7 points/1K tokens\n  \nSource: \"Old Architecture Notes\"\n  Baseline score:        87/100\n  Without this source:   88/100\n  Delta:                 -1 point (NOISE \u2014 consider removing)\n  Token cost:            1,800 tokens\n  Efficiency:            -0.6 points/1K tokens (NEGATIVE)\n```\n\n### Key Metrics Per Source\n\n- **Delta score**: baseline - ablated score (positive = source helps)\n- **Token cost**: tokens consumed by this source in context\n- **Efficiency**: delta / token cost (points per 1K tokens)\n- **Consistency**: variance in delta across different test prompts\n- **Category**: auto-classify as HIGH IMPACT / MODERATE / LOW / NOISE / HARMFUL\n\n### Implementation Plan\n\n#### Phase 1: Core Ablation Engine\n- `server/services/ablationRunner.ts`\n  - Takes: agent config + qualification suite + list of sources\n  - Runs: baseline + one ablation per source (N+1 runs minimum)\n  - Returns: structured impact report with deltas\n- Extend `qualificationRunner.ts` to accept a `contextOverride` parameter (enable/disable specific sources)\n- Respect existing LLM-as-judge scoring \u2014 same rubric, just different context configs\n\n#### Phase 2: UI in Qualification Tab\n- New sub-tab or toggle: \"Source Impact\" alongside existing \"Test Results\"\n- **Impact Table**: sortable by delta, efficiency, token cost\n- **Impact Visualization**: horizontal bar chart per source (green = helps, red = hurts)\n- **Recommendations**: auto-generated (\"Remove 'Old Notes' \u2014 saves 1.8K tokens, no quality loss\")\n- One-click actions: disable source, adjust depth, archive\n\n#### Phase 3: Comparative 
Testing\n- Compare two agent configurations side-by-side (not just source on/off)\n- Track impact over time: re-run ablation after model updates, see if sources are still valuable\n- Alert: \"Source X was HIGH IMPACT last month, now NOISE after model update \u2014 review needed\"\n- This maps to the Anthropic Skills 2.0 concern: skills that become obsolete as models improve\n\n#### Phase 4: Smart Defaults (stretch)\n- Auto-suggest optimal depth per source based on ablation data\n- Feed efficiency metrics into cache-aware context assembly (#30)\n- Priority ordering: highest-efficiency sources get included first when budget is tight\n\n### Integration Points\n\n- **Qualification (#29)**: extends existing infrastructure, same test suites\n- **Depth Mixer**: ablation can test different depth levels per source, not just on/off\n- **Cache-aware assembly (#30)**: efficiency metrics feed into budget allocation\n- **Auto-lessons (#31)**: if a correction consistently improves score, auto-promote to guideline\n- **Tree Indexer**: ablation at branch level, not just source level \u2014 \"does the API auth section matter, or just the endpoint reference?\"\n\n### Cost Control\n\nAblation is expensive (N+1 runs per suite). 
Mitigations:\n- **Incremental mode**: only re-test sources that changed since last ablation\n- **Sampling**: run ablation on a random subset of test cases, extrapolate\n- **Caching**: if source content hasn't changed and model is same, reuse prior ablation scores\n- **Priority ordering**: test highest-token sources first (most likely to find waste)\n- **Budget cap**: user sets max LLM spend per ablation run\n\n### API Shape\n\n```typescript\n// Trigger ablation\nPOST /api/qualification/ablation\n{\n  agentId: string,\n  suiteId: string,\n  sources?: string[],       // specific sources to test (default: all)\n  modes?: ('baseline' | 'ablation' | 'minimal' | 'targeted')[],\n  sampleSize?: number,      // subset of test cases (default: all)\n  budgetCents?: number       // max spend\n}\n\n// Response\n{\n  runId: string,\n  baseline: { score: number, breakdown: ScoreBreakdown },\n  ablations: [{\n    sourceId: string,\n    sourceName: string,\n    score: number,\n    delta: number,\n    tokenCost: number,\n    efficiency: number,\n    category: 'high_impact' | 'moderate' | 'low' | 'noise' | 'harmful',\n    breakdown: ScoreBreakdown\n  }],\n  recommendations: string[],\n  totalCost: { tokens: number, estimatedUsd: number }\n}\n```\n\n## Why This Matters\n\n1. **Unique differentiator**: Nobody quantifies the value of individual context pieces. Anthropic's Skills 2.0 does binary on/off. We do per-source, per-branch, with efficiency metrics.\n2. **Exit story**: \"We can prove which knowledge makes agents better, and by how much.\" That's measurable ROI for context engineering.\n3. **User value**: Stop guessing which docs to include. Data-driven context curation.\n4. 
**Feeds the whole pipeline**: efficiency metrics make cache-aware assembly (#30) smarter, auto-lessons (#31) more targeted, and depth mixing more precise.\n\n## Dependencies\n\n- Qualification (#29) \u2014 DONE\n- Tree Indexer \u2014 DONE\n- Depth Mixer \u2014 DONE\n\n## Priority\n\nP1 \u2014 builds on completed infrastructure, validates the core thesis (context engineering has measurable value), strong competitive signal from Anthropic Skills 2.0.\n","external_id":"I_kwDORY874M70fhRO","labels":["enhancement"],"number":71,"title":"Context Ablation Testing \u2014 A/B evals for knowledge sources","url":"https://github.com/VictorGjn/modular-patchbay/issues/71"},{"body":"## Vision\nReal-time visualization of the context window as it's being assembled. Show exactly what goes into the prompt, how much space each piece takes, and where the budget is being spent.\n\n## Current State\n- `src/components/test/ContextInspector.tsx` \u2014 exists but basic\n- PipelineObservabilityPanel shows stages but not token-level breakdown\n- Cache-aware assembler (#30) reorders blocks but user can't see the reordering\n\n## Implementation\n\n### 1. Token Budget Bar (`src/components/test/TokenBudgetBar.tsx`)\nHorizontal stacked bar showing context window usage:\n```\n[████ System ████|██ Knowledge ██|█ Memory █|███ History ███|░░ Free ░░]\n 2.1k tokens     4.3k tokens    0.8k      3.2k tokens     5.6k free\n```\n- Color coded by block type\n- Updates in real-time during assembly\n- Click segment to expand that block\n\n### 2. 
Block Explorer (`src/components/test/BlockExplorer.tsx`)\nExpandable tree showing exactly what's in each prompt section:\n- System Frame: identity, persona, constraints, objectives, workflow\n- Knowledge: each source with depth level, token count, compression ratio\n- Memory: recalled facts with relevance scores\n- Lessons: active auto-lessons\n- Cache: green overlay on cache-eligible prefix\n- History: message pairs with token counts\n\n### 3. Live Assembly View\nDuring chat execution, show the assembly in real-time:\n- Blocks light up as they're assembled\n- Cache boundary marker visible\n- Token count animates as blocks are added\n- Final system prompt viewable (syntax highlighted markdown)\n\n### 4. Diff Between Turns\nShow what changed in the system prompt between turns:\n- Green = new content\n- Red = removed\n- Grey = unchanged (cached)\n- Helps user understand caching effectiveness\n\n## Files to Create\n- `src/components/test/TokenBudgetBar.tsx`\n- `src/components/test/BlockExplorer.tsx`\n\n## Files to Modify\n- `src/tabs/TestTab.tsx` \u2014 integrate new components in sidebar\n- `src/components/test/ContextInspector.tsx` \u2014 enhance or replace\n- `src/services/pipelineChat.ts` \u2014 emit block-level assembly events\n\n## Estimate\n**Medium (M)** \u2014 UI visualization + pipeline event integration","external_id":"I_kwDORY874M70JOwI","labels":["enhancement","ux"],"number":70,"title":"feat: Context Inspector v2 \u2014 live token budget visualization (#70)","url":"https://github.com/VictorGjn/modular-patchbay/issues/70"},{"body":"## Vision\nExpose teamRunner.ts in the UI. 
Users should be able to define agent teams, spawn sub-agents, and watch them work in parallel \u2014 like Nimbalyst's multi-agent view.\n\n## Current State\n- `server/services/teamRunner.ts` \u2014 runs agent teams with shared facts, role prompts, parallel execution\n- `server/services/agentRunner.ts` \u2014 single agent execution with tool loop\n- No UI for team configuration or monitoring\n- Workflow steps exist in ReviewTab but are static (no execution)\n\n## Implementation\n\n### 1. Team Configuration (`src/panels/test/TeamPanel.tsx`)\nDefine agent teams in the Test tab:\n- Add agents to team (name, role, model override)\n- Shared task input\n- Shared facts viewer\n- Launch button\n\n### 2. Sub-Agent Monitor (`src/components/test/SubAgentMonitor.tsx`)\nSplit-panel view showing all running sub-agents:\n```\n┌──────────────────┬──────────────────┬──────────────────┐\n│ Agent: Analyst   │ Agent: Writer    │ Agent: Reviewer  │\n│ Status: Working  │ Status: Waiting  │ Status: Idle     │\n│ Turn 3/10        │ Turn 0/10        │ Turn 0/5         │\n│                  │                  │                   │\n│ 🔍 Searching... │ ⏳ Waiting for   │ ⏳ Waiting for   │\n│ 📖 Reading...   │    Analyst facts │    Writer output  │\n│                  │                  │                   │\n└──────────────────┴──────────────────┴──────────────────┘\n```\n\n### 3. Shared Fact Stream\nReal-time display of facts being extracted and shared between agents:\n- Fact badges flowing between agent columns\n- Click to expand fact details\n- Provenance tracking: which agent extracted which fact\n\n### 4. 
Backend: SSE for Team Progress\nCreate SSE endpoint for real-time team monitoring:\n- `GET /api/teams/:teamId/stream` \u2014 Server-Sent Events\n- Events: agent_start, agent_turn, agent_tool_call, agent_fact, agent_done, team_done\n- Wire teamRunner.ts to emit these events\n\n### 5. Team Presets\nPre-built team configurations:\n- Code Review Team (Analyst + Reviewer + Fixer)\n- Research Team (Searcher + Synthesizer + Editor)  \n- Content Team (Researcher + Writer + Proofreader)\n\n### 6. Handoff Visualization\nShow agent-to-agent handoffs:\n- Arrow indicators between agent columns\n- Shared context passed at handoff\n- Timeline view of sequential/parallel execution\n\n## Files to Create\n- `src/panels/test/TeamPanel.tsx`\n- `src/components/test/SubAgentMonitor.tsx`\n- `src/components/test/SharedFactStream.tsx`\n- `src/store/teamStore.ts`\n- `server/routes/teams.ts` \u2014 SSE endpoint\n\n## Files to Modify\n- `src/tabs/TestTab.tsx` \u2014 add Team mode toggle\n- `server/services/teamRunner.ts` \u2014 emit SSE events\n- `server/services/agentRunner.ts` \u2014 progress callbacks\n\n## Estimate\n**XL** \u2014 New subsystem: team UI + SSE + monitor + fact stream","external_id":"I_kwDORY874M70JNyN","labels":["enhancement","ux"],"number":69,"title":"feat: Multi-agent orchestration UI \u2014 sub-agent spawning and monitoring (#69)","url":"https://github.com/VictorGjn/modular-patchbay/issues/69"},{"body":"## Vision\nTransform the Test tab from a basic chat into a Codex/Claude Code-like agentic IDE. 
Show the agent THINKING and ACTING, not just responding.\n\n## Current State\n- TestPanel.tsx (1132 lines): basic chat with message bubbles\n- toolRunner.ts: has tool loop (call LLM → execute tools → re-call → done) but UI only shows final text\n- executionRouter.ts: dispatches to tool loop or text streaming\n- PipelineObservabilityPanel: shows pipeline stages but NOT individual tool calls in real-time\n- agentRunner.ts / teamRunner.ts: exist on backend but not exposed in UI\n\n## What Codex/Claude Code Do Right\n- Show each tool call as it happens (file read, file write, shell command)\n- Show tool results inline (collapsed by default, expandable)\n- Show thinking/reasoning steps between tool calls\n- Show progress: 'Turn 3/10 \u2014 calling search_files...'\n- Show parallel tool calls as concurrent activities\n- Allow user to interrupt/steer mid-execution\n\n## Implementation\n\n### 1. Activity Feed Component (`src/components/test/ActivityFeed.tsx`)\nReal-time feed showing agent activity as it happens:\n```\n[16:03:01] 🔍 Searching files for 'UserService'...\n[16:03:02] ✅ Found 3 results in src/services/\n[16:03:02] 📖 Reading src/services/userService.ts (245 lines)\n[16:03:04] 🤔 Analyzing code structure...\n[16:03:05] ✏️ Writing fix to src/services/userService.ts\n[16:03:06] ✅ Tool call complete (edit_file)\n[16:03:07] 💬 Generating response...\n```\n\n### 2. Tool Call Cards (`src/components/test/ToolCallCard.tsx`)\nEach tool call renders as an expandable card:\n- Header: tool name + server + status (running/success/error) + duration\n- Collapsed: one-line summary of args\n- Expanded: full args + full result (syntax highlighted for code)\n- Color coding: blue=running, green=success, red=error\n\n### 3. Turn Counter + Progress\n- Show 'Turn N/max' in header\n- Progress bar for multi-turn execution\n- Estimated time remaining based on avg turn duration\n\n### 4. 
Streaming Tool Events\nUpdate toolRunner.ts callbacks to emit granular events:\n```typescript\ninterface ToolEvent {\n  type: 'tool_start' | 'tool_result' | 'tool_error' | 'thinking' | 'turn_start' | 'turn_end';\n  toolName?: string;\n  serverName?: string;\n  args?: Record\u003cstring, unknown\u003e;\n  result?: string;\n  error?: string;\n  turnNumber?: number;\n  maxTurns?: number;\n  timestamp: number;\n}\n```\n\n### 5. Wire into TestPanel\n- Replace simple chat bubbles for assistant messages with ActivityFeed\n- User messages stay as-is (input at bottom)\n- Each assistant 'turn' shows: thinking → tool calls → response\n- Conversation history preserved above current activity\n\n### 6. Interrupt/Steer Controls\n- 'Stop' button to cancel mid-execution\n- 'Steer' input: inject guidance during tool loop ('focus on the auth module, not tests')\n\n## Files to Create\n- `src/components/test/ActivityFeed.tsx`\n- `src/components/test/ToolCallCard.tsx`\n- `src/components/test/TurnProgress.tsx`\n- `src/store/activityStore.ts` \u2014 real-time activity events\n\n## Files to Modify\n- `src/panels/TestPanel.tsx` \u2014 integrate ActivityFeed\n- `src/services/toolRunner.ts` \u2014 emit granular tool events\n- `src/services/executionRouter.ts` \u2014 pass event callbacks through\n- `src/services/pipelineChat.ts` \u2014 wire activity events\n\n## Estimate\n**Large (L)** \u2014 Major UI overhaul of Test tab + tool event system","external_id":"I_kwDORY874M70JNPa","labels":["enhancement","ux"],"number":68,"title":"feat: Agent IDE \u2014 agentic tool loop with live activity display (#68)","url":"https://github.com/VictorGjn/modular-patchbay/issues/68"},{"body":"## Why\nNeed explicit phase between design and deployment to qualify and train an agent on mission-specific criteria.\n\n## Scope\n- Add Qualification phase in builder flow\n- Implement Qualification Suite model:\n  - mission brief\n  - test cases (nominal, edge, anti-cases)\n  - scoring rubric dimensions\n- Backend 
endpoints:\n  - POST /api/qualification/generate-suite\n  - POST /api/qualification/run\n  - POST /api/qualification/apply-patches\n- UI:\n  - global score + sub-scores\n  - per-test report\n  - patch suggestions with apply/re-run\n  - publish gate on threshold\n\n## Acceptance Criteria\n- Agents can be marked Qualified/Needs work based on threshold\n- Suggested patches can be applied and re-tested in one flow\n- E2E coverage for qualification cycle and publish gating","external_id":"I_kwDORY874M7vz9fS","labels":["enhancement"],"number":7,"title":"P1: Add Qualification \u0026 Training cycle to Agent Builder","url":"https://github.com/VictorGjn/modular-patchbay/issues/7"}]