-
Notifications
You must be signed in to change notification settings - Fork 298
fix: Prevent mock data implementations with infrastructure features #95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: Prevent mock data implementations with infrastructure features #95
Conversation
Problem: The coding agent can implement in-memory storage (e.g., `dev-store.ts` with `globalThis`) instead of a real database. These implementations pass all tests because data persists during runtime, but data is lost on server restart. This is a root cause for leonvanzyl#68 - agent "passes" features that don't actually work. Solution: 1. Add 5 mandatory Infrastructure Features (indices 0-4) that run first: - Feature 0: Database connection established - Feature 1: Database schema applied correctly - Feature 2: Data persists across server restart (CRITICAL) - Feature 3: No mock data patterns in codebase - Feature 4: Backend API queries real database 2. Add STEP 5.7: Server Restart Persistence Test to coding prompt: - Create test data, stop server, restart, verify data still exists 3. Extend grep patterns for mock detection in STEP 5.6: - globalThis., devStore, dev-store, mockData, fakeData - TODO.*real, STUB, MOCK, new Map() as data stores Changes: - .claude/templates/initializer_prompt.template.md - Infrastructure features - .claude/templates/coding_prompt.template.md - STEP 5.6/5.7 enhancements - .claude/commands/create-spec.md - Phase 3b database question Backwards Compatible: - Works with YOLO mode (uses bash/grep, not browser automation) - Stateless apps can skip database features via create-spec question Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds a mandatory Phase 3b persistent-data question with branching for DB-needed/unsure vs stateless, mandates five Infrastructure features (indices 0–4) as prerequisites when DB is used/uncertain, updates feature-counting and dependency rules to include infrastructure, and strengthens persistence verification including restart tests. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant SpecGenerator
participant InfraPlanner
participant Verifier
participant Database
User->>SpecGenerator: Start spec creation
SpecGenerator->>User: Ask Phase 3b: "Persistent data needed?"
alt Database needed / Unsure
SpecGenerator->>InfraPlanner: Add mandatory infra features (0-4)
InfraPlanner->>Database: Provision / connect real DB
InfraPlanner->>Verifier: Mark infra as prerequisites
Verifier->>Database: Run persistence verification & restart test
Verifier-->>SpecGenerator: Persistence OK / Fail
else Stateless
SpecGenerator->>SpecGenerator: Annotate <database>none - stateless</database>
SpecGenerator->>InfraPlanner: Skip DB infra items
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.claude/templates/initializer_prompt.template.md (1)
196-219: Math error in category distribution totals.The TOTAL row (line 219) doesn't match the sum of the category rows:
Simple tier calculation:
- Infrastructure: 5
- Categories A-T: 5+15+20+10+10+10+8+5+5+5+5+8+10+8+8+8+5+5+5+5 = 160
- Actual sum: 165
- Table shows: 155 ❌ (off by 10)
Medium tier calculation:
- Infrastructure: 5
- Categories A-T: 20+25+30+20+15+20+10+10+8+10+8+12+15+10+10+10+8+8+6+5 = 260
- Actual sum: 265
- Table shows: 255 ❌ (off by 10)
Complex tier appears to sum correctly to 405+.
🔢 Corrected totals
-| **TOTAL** | **155** | **255** | **405+** | +| **TOTAL** | **165** | **265** | **405+** |Note: This change cascades to other files:
.claude/commands/create-spec.mdlines 231-233 reference these counts- Feature count examples throughout need updating
🤖 Fix all issues with AI agents
In @.claude/commands/create-spec.md:
- Around line 231-233: The feature/test counts and tier naming are inconsistent
between .claude/commands/create-spec.md (sections "Simple apps", "Medium apps",
"Advanced apps") and initializer_prompt.template.md (tiers with ~155/255/405+
tests and "Complex" naming); choose the authoritative source (or define a single
scope: "features" vs "tests"), standardize the tier name ("Advanced" or
"Complex"), and update both files so the numeric ranges and tier labels match
exactly — e.g., decide whether counts represent features or tests, adjust the
numbers in create-spec.md or initializer_prompt.template.md accordingly, and
ensure the headings and any cross-references use the same tier names.
- Around line 98-117: The spec text "Phase 3b: Database Requirements" currently
promises a "simplified infrastructure" for stateless apps but doesn't define it
and the initializer always creates exactly 5 infrastructure features (indices
0-4); fix by defining the simplified infra and adding conditional initializer
logic: explicitly state that for "<database>none - stateless
application</database>" the spec will create 0 infrastructure persistence tests
(or e.g., 2 lightweight infra checks like build and health endpoints) and show
exactly where the <database> tag sits (under the spec root/database element),
then update the initializer template to branch on the database requirement flag
(e.g., needsDatabase or the <database> value) so that when the value is "none -
stateless application" it generates the simplified feature set instead of
indices 0-4, otherwise generate the full five persistence tests.
In @.claude/templates/coding_prompt.template.md:
- Around line 226-230: After backgrounding ./init.sh and sleeping, add a
health-check to verify the server started: perform an HTTP request to the server
health endpoint (e.g., /api/health) using curl -f (or equivalent) and on failure
emit a clear error ("ERROR: Server failed to start") and exit non‑zero; make the
port configurable or replace PORT with the actual port; update the restart step
that runs ./init.sh & and sleep 15 to include this verification so the script
fails fast when the server never becomes responsive.
In @.claude/templates/initializer_prompt.template.md:
- Around line 167-176: Update the Feature 3 grep steps to match the
coding_prompt template by extending the --include flags to include ".js" in
addition to ".ts" and ".tsx", add "testData" to the mock-data pattern (so the
existing "mockData\|fakeData\|sampleData\|dummyData" becomes
"mockData\|testData\|fakeData\|sampleData\|dummyData"), add the TODO/STUB/MOCK
search pattern (e.g., "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK"), add
development-conditionals search (e.g.,
"isDevelopment\|isDev\|process\.env\.NODE_ENV.*development"), and ensure the
grep for directories that may not exist uses error suppression "2>/dev/null"
where used (apply these changes to the lines containing "globalThis\.",
"dev-store\|devStore\|DevStore\|mock-db\|mockDb",
"mockData\|fakeData\|sampleData\|dummyData", and "new Map()\|new Set()" so the
initializer_prompt.template.md mirrors coding_prompt.template.md).
- Around line 86-112: The example JSON violates the rule that all features after
index 4 must depend on infrastructure indices [0,1,2,3,4]; update each object
with index >4 (e.g., "User can register", "User can login", "User can logout",
"User can create todo", "User can view todos", "User can edit todo", "User can
delete todo", "User can filter todos", "User can search todos") to include
[0,1,2,3,4] in their depends_on_indices arrays (while preserving existing
dependencies like foundation/auth links such as 5,8,9,11,12 to maintain
transitive ordering). Ensure every entry after the INFRASTRUCTURE TIER
explicitly contains [0,1,2,3,4] so the initializer follows the
infrastructure-first requirement.
🧹 Nitpick comments (3)
.claude/templates/coding_prompt.template.md (2)
219-224: Consider more robust server shutdown verification.The current approach has potential timing issues:
- Line 221:
sleep 5may not be sufficient for all server processes to fully terminate, especially under load.- Line 223: The verification
pgrep -f "node" && echo "ERROR..." && exit 1will cause the entire verification to fail if used in a script withset -e, even when successful.♻️ More robust shutdown verification
3. **STOP the server completely:** ```bash - pkill -f "node" || pkill -f "npm" || pkill -f "next" - sleep 5 + # Send SIGTERM first, then SIGKILL if needed + pkill -f "node" || pkill -f "npm" || pkill -f "next" + sleep 3 + pkill -9 -f "node" 2>/dev/null || true + sleep 2 # Verify server is stopped - pgrep -f "node" && echo "ERROR: Server still running!" && exit 1 + if pgrep -f "node" > /dev/null; then + echo "ERROR: Server still running!" + exit 1 + fi ```
236-239: Avoid redundant grep commands - reference STEP 5.6.Lines 237-238 repeat the grep patterns already defined in STEP 5.6. This creates maintenance burden if patterns need to be updated.
♻️ Reference existing step instead
6. **If data is GONE:** Implementation uses in-memory storage → CRITICAL FAIL - - Search for: `grep -r "globalThis\|devStore\|dev-store" src/` - - You MUST fix the mock data implementation before proceeding + - Run all grep commands from STEP 5.6 to identify the mock pattern + - You MUST fix the in-memory storage implementation before proceeding - Replace in-memory storage with real database queries.claude/templates/initializer_prompt.template.md (1)
289-297: Add package.json verification for mock backend libraries.Lines 291-293 list prohibited mock backend tools (
json-server,mirage,msw), but these are npm packages, not code patterns. The grep commands in Feature 3 (lines 170-173) won't catch these.📦 Add package.json check to Feature 3
In the test steps for Feature 3 (lines 167-176), add a package.json check:
Steps: 1. Run: grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" src/ 2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" src/ 3. Run: grep -r "mockData\|fakeData\|sampleData\|dummyData" --include="*.ts" src/ 4. Run: grep -r "new Map()\|new Set()" --include="*.ts" src/lib/ src/store/ src/data/ -5. ALL grep commands must return empty (exit code 1) -6. If any returns results → investigate and fix before passing +5. Check package.json for mock backend libraries: + grep -E "json-server|miragejs|msw" package.json +6. ALL checks must return empty (exit code 1) +7. If any returns results → investigate and fix before passingThis catches mock backends installed as dependencies.
| ### Phase 3b: Database Requirements (MANDATORY) | ||
|
|
||
| **Always ask this question regardless of mode:** | ||
|
|
||
| > "One foundational question about data storage: | ||
| > | ||
| > **Does this application need to store user data persistently?** | ||
| > | ||
| > 1. **Yes, needs a database** - Users create, save, and retrieve data (most apps) | ||
| > 2. **No, stateless** - Pure frontend, no data storage needed (calculators, static sites) | ||
| > 3. **Not sure** - Let me describe what I need and you decide" | ||
| **Branching logic:** | ||
|
|
||
| - **If "Yes" or "Not sure"**: Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence. | ||
|
|
||
| - **If "No, stateless"**: Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly: | ||
| ```xml | ||
| <database>none - stateless application</database> | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# List files in .claude directory structure
find .claude -type f -name "*.md" | head -20Repository: leonvanzyl/autocoder
Length of output: 607
🏁 Script executed:
#!/bin/bash
# Check if the mentioned files exist and get their sizes
ls -lh .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md 2>&1Repository: leonvanzyl/autocoder
Length of output: 240
🏁 Script executed:
#!/bin/bash
# Examine the context around lines 98-117 in create-spec.md
sed -n '95,120p' .claude/commands/create-spec.mdRepository: leonvanzyl/autocoder
Length of output: 1151
🏁 Script executed:
#!/bin/bash
# Search for stateless/conditional logic in initializer_prompt.template.md
rg -n "stateless|no database|skip infrastructure|Phase 3b" .claude/templates/initializer_prompt.template.md -C 3Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
#!/bin/bash
# Read initializer_prompt.template.md to see what it contains
wc -l .claude/templates/initializer_prompt.template.mdRepository: leonvanzyl/autocoder
Length of output: 116
🏁 Script executed:
#!/bin/bash
# Read initializer_prompt.template.md contents
cat -n .claude/templates/initializer_prompt.template.mdRepository: leonvanzyl/autocoder
Length of output: 19205
🏁 Script executed:
#!/bin/bash
# Search for infrastructure references in both files
rg -n "infrastructure|Infrastructure" .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md -C 2Repository: leonvanzyl/autocoder
Length of output: 8294
🏁 Script executed:
#!/bin/bash
# Search broadly for stateless in the entire .claude directory
rg -r "stateless" .claude/Repository: leonvanzyl/autocoder
Length of output: 1267
Reconcile stateless app handling between spec creation and initializer implementation.
The spec creation process mentions "simplified infrastructure" for stateless apps, but the initializer template doesn't implement this scenario:
-
Undefined simplification (line 114): States infrastructure will be "simplified (no database persistence tests)" for stateless apps, but doesn't specify:
- How many infrastructure features should be created (0? Fewer than 5?)
- What "simplified" features look like
- How this affects overall feature count
-
Unclear XML tag placement (lines 115-117): The XML tag
<database>none - stateless application</database>is shown but its location in the spec structure is not defined. -
Missing initializer implementation: The initializer template (lines 39-41, 119-131, 225) has no conditional logic for stateless apps. It always mandates exactly 5 infrastructure features (indices 0-4) for all complexity tiers, regardless of whether the app needs a database. This creates an inconsistency: the spec process promises simplified infrastructure for stateless apps, but the initializer can't deliver it.
Resolve by either:
- Defining what "simplified infrastructure" means (and updating the initializer to handle it), or
- Clarifying that stateless apps still require the 5 infrastructure features but with different test steps
🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 98 - 117, The spec text "Phase
3b: Database Requirements" currently promises a "simplified infrastructure" for
stateless apps but doesn't define it and the initializer always creates exactly
5 infrastructure features (indices 0-4); fix by defining the simplified infra
and adding conditional initializer logic: explicitly state that for
"<database>none - stateless application</database>" the spec will create 0
infrastructure persistence tests (or e.g., 2 lightweight infra checks like build
and health endpoints) and show exactly where the <database> tag sits (under the
spec root/database element), then update the initializer template to branch on
the database requirement flag (e.g., needsDatabase or the <database> value) so
that when the value is "none - stateless application" it generates the
simplified feature set instead of indices 0-4, otherwise generate the full five
persistence tests.
| - **Simple apps** (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure) | ||
| - **Medium apps** (blog, task manager with auth): ~105 features (includes 5 infrastructure) | ||
| - **Advanced apps** (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Extract feature count references from both files
echo "=== create-spec.md feature counts ==="
rg "Simple apps|Medium apps|Advanced apps|Complex apps" .claude/commands/create-spec.md -B 1 -A 1
echo ""
echo "=== initializer_prompt.template.md feature counts ==="
rg "Simple|Medium|Complex|Advanced" .claude/templates/initializer_prompt.template.md | grep -E "apps|features|tests|Simple|Medium|Complex|Advanced" | head -30Repository: leonvanzyl/autocoder
Length of output: 735
Feature count inconsistency between create-spec.md and initializer_prompt.template.md.
The feature/test counts for corresponding app tiers don't align:
| Tier | create-spec.md | initializer_prompt.template.md |
|---|---|---|
| Simple | ~25-55 features | ~155 tests |
| Medium | ~105 features | ~255 tests |
| Advanced/Complex | ~155-205 features | ~405+ tests |
Additionally, the tiers use different naming ("Advanced" vs "Complex"), making it unclear whether they represent the same tier. The magnitude differences suggest these may be counting different things or based on different scope definitions. Clarify which counts are authoritative and align both templates accordingly.
🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 231 - 233, The feature/test
counts and tier naming are inconsistent between .claude/commands/create-spec.md
(sections "Simple apps", "Medium apps", "Advanced apps") and
initializer_prompt.template.md (tiers with ~155/255/405+ tests and "Complex"
naming); choose the authoritative source (or define a single scope: "features"
vs "tests"), standardize the tier name ("Advanced" or "Complex"), and update
both files so the numeric ranges and tier labels match exactly — e.g., decide
whether counts represent features or tests, adjust the numbers in create-spec.md
or initializer_prompt.template.md accordingly, and ensure the headings and any
cross-references use the same tier names.
- Fix math error in category totals (155→165, 255→265) - Fix example JSON to include [0,1,2,3,4] dependencies for all features - Add more robust server shutdown (SIGTERM then SIGKILL) - Add health check after server restart - Align grep patterns between templates (add .js, testData, TODO/STUB/MOCK) - Add package.json check for mock backend libraries - Reference STEP 5.6 instead of duplicating grep commands Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Line 197: The grep pattern uses literal parentheses which are interpreted by
the shell/regex engine, so update the command that searches for "new Map()" and
"new Set()" to escape the parentheses (e.g., escape both '(' and ')' in the
patterns) so the literal strings "new Map()" and "new Set()" are matched; target
the grep invocation shown (the command containing new Map() and new Set() in the
template) and modify the patterns accordingly, or alternatively switch to an
extended regex flag and use escaped or bracketed forms to reliably match the
parentheses.
♻️ Duplicate comments (2)
.claude/templates/initializer_prompt.template.md (1)
171-175: CRITICAL: Incomplete file extension coverage in grep patterns.Lines 171-175 are missing
--include="*.js"in their grep commands. Only line 170 includes JavaScript files, but the remaining patterns (dev-store, mockData, TODO/STUB/MOCK, isDevelopment, Map/Set) only search.tsand.tsxfiles.Impact: If the project contains
.jsfiles with mock data patterns (e.g.,devStorein a.jsfile), they will not be detected, allowing mock implementations to slip through.🐛 Add .js extension to all grep patterns
-2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" src/ -3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" src/ -4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" src/ -5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" src/ -6. Run: grep -r "new Map()\|new Set()" --include="*.ts" --include="*.tsx" src/lib/ src/store/ src/data/ 2>/dev/null +2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" src/ +3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" --include="*.js" src/ +4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" src/ +5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" src/ +6. Run: grep -r "new Map()\|new Set()" --include="*.ts" --include="*.tsx" --include="*.js" src/lib/ src/store/ src/data/ 2>/dev/null.claude/templates/coding_prompt.template.md (1)
232-238: Health check doesn't fail on server startup failure.The restart verification uses
|| echo "WARNING"which prevents the script from exiting when the server fails to start. This means the test could incorrectly report data loss when the server never actually restarted.🔧 Proposed fix
4. **RESTART the server:** ```bash ./init.sh & sleep 15 # Allow server to fully start # Verify server is responding - curl -f http://localhost:3000/api/health || curl -f http://localhost:3000 || echo "WARNING: Health check failed" + if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; then + echo "ERROR: Server failed to start after restart" + exit 1 + fi ```This ensures the test fails fast with a clear error when the server doesn't respond, rather than proceeding to incorrectly report that data was lost.
🧹 Nitpick comments (2)
.claude/templates/initializer_prompt.template.md (1)
136-189: Consider adding language specifiers to fenced code blocks.The infrastructure feature descriptions use fenced code blocks without language specifiers (lines 136, 145, 154, 168, 182). While this doesn't affect functionality, adding language identifiers improves syntax highlighting and follows markdown best practices.
♻️ Add language specifiers
For the test steps, you could use
textorbash(for grep commands):**Feature 0 - Database connection established:** -``` +```text Steps: 1. Start the development server ...Feature 1 - Database schema applied correctly:
-+text
Steps:
...**Feature 2 - Data persists across server restart (CRITICAL):** -``` +```bash Steps: ...Feature 3 - No mock data patterns in codebase:
-+bash
Steps:
...**Feature 4 - Backend API queries real database:** -``` +```text Steps: ...</details> Based on static analysis hints. </blockquote></details> <details> <summary>.claude/templates/coding_prompt.template.md (1)</summary><blockquote> `220-221`: **Consider more specific process matching.** The pattern `pkill -f "node"` matches any process with "node" in its command line, which could unintentionally kill other development servers or node processes running on the system. <details> <summary>💡 More specific alternatives</summary> If the project uses specific npm scripts or has a known structure, you could make the pattern more specific: ```bash # Option 1: Match the specific server script pkill -f "node.*init.sh\|next dev\|npm run dev" || pkill -f "npm" || true # Option 2: Kill by port if known lsof -ti:3000 | xargs kill -TERM 2>/dev/null || true sleep 3 lsof -ti:3000 | xargs kill -9 2>/dev/null || trueHowever, since this is a generic template that needs to work across different project structures, the current broad pattern may be acceptable as a tradeoff for simplicity.
- Escape parentheses in grep patterns: new Map\(\) and new Set\(\) - Add --include="*.js" to all grep commands for complete coverage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed from warning-only to proper error handling: - if server doesn't respond after restart, exit with error - prevents false negatives when server fails to start Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Line 237: The health check is hardcoded to port 3000 ("curl -f
http://localhost:3000/api/health" and "http://localhost:3000") but elsewhere the
template uses a PORT placeholder, causing false failures; update the curl checks
to use the PORT variable (or detect the port from init.sh/env) and ensure the
same PORT variable is used at the later line that currently references PORT so
both checks are consistent; target the two curl invocations in the template (the
lines containing "curl -f http://localhost:3000/api/health" and "curl -f
http://localhost:3000") and replace them with a configurable $PORT or a
detection mechanism.
- Line 180: The grep lines in .claude/templates/coding_prompt.template.md claim
to exclude test files but do not actually include exclusion patterns; update
each of the five grep commands to add explicit exclusions such as
--exclude='*.test.*' --exclude='*.spec.*' and --exclude-dir='__tests__' (or
equivalent for your grep implementation) so src/ test files are skipped; ensure
you modify the exact grep command occurrences in the template (the five grep
commands) to include these flags consistently.
🧹 Nitpick comments (1)
.claude/templates/coding_prompt.template.md (1)
197-197: Inconsistent directory scope for Map/Set detection.This grep only searches
src/lib/ src/store/ src/data/, while all other patterns search the entiresrc/directory. In-memory Map/Set usage in othersrc/subdirectories (e.g.,src/components/,src/api/) would be missed.♻️ Align directory scope with other checks
If the intent is to catch Map/Set anywhere in production code:
-grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/lib/ src/store/ src/data/ 2>/dev/null +grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/ 2>/dev/nullIf limiting to specific directories is intentional, add a comment explaining why.
| 2. **Runtime test:** Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone | ||
| 3. **Check database:** All displayed data must come from real DB queries | ||
| 4. If unexplained data appears, it's mock data - fix before marking passing. | ||
| **Run ALL these grep checks. Any hits in src/ (excluding test files) require investigation:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify or implement test file exclusion.
The comment states "excluding test files," but none of the grep commands below implement exclusion patterns (e.g., --exclude='*.test.ts' --exclude='*.spec.ts'). Without explicit exclusion, agents will encounter false positives from test files that legitimately use mock data.
♻️ Add test file exclusion to grep commands
-**Run ALL these grep checks. Any hits in src/ (excluding test files) require investigation:**
+**Run ALL these grep checks. Any hits in src/ require investigation (legitimate test file usage is OK):**Or implement the exclusion:
# 1. In-memory storage patterns (CRITICAL - catches dev-store)
-grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" src/
+grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" --exclude="*.test.ts" --exclude="*.spec.ts" --exclude="*.test.tsx" src/Apply similar exclusions to all five grep commands.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **Run ALL these grep checks. Any hits in src/ (excluding test files) require investigation:** | |
| **Run ALL these grep checks. Any hits in src/ require investigation (legitimate test file usage is OK):** |
🤖 Prompt for AI Agents
In @.claude/templates/coding_prompt.template.md at line 180, The grep lines in
.claude/templates/coding_prompt.template.md claim to exclude test files but do
not actually include exclusion patterns; update each of the five grep commands
to add explicit exclusions such as --exclude='*.test.*' --exclude='*.spec.*' and
--exclude-dir='__tests__' (or equivalent for your grep implementation) so src/
test files are skipped; ensure you modify the exact grep command occurrences in
the template (the five grep commands) to include these flags consistently.
| ./init.sh & | ||
| sleep 15 # Allow server to fully start | ||
| # Verify server is responding | ||
| if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove hardcoded port or add configuration guidance.
The health check hardcodes port 3000, but line 245 uses a PORT placeholder, indicating awareness that apps may run on different ports. If the application doesn't run on port 3000, this health check will incorrectly fail, causing the restart test to abort even when the server started successfully.
♻️ Make port configurable or detect it
Option 1: Detect port from init.sh or environment
+ # Detect server port (check init.sh, .env, or package.json for PORT)
+ APP_PORT=${PORT:-3000}
# Verify server is responding
- if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; then
+ if ! curl -f http://localhost:$APP_PORT/api/health && ! curl -f http://localhost:$APP_PORT; then
echo "ERROR: Server failed to start after restart"
exit 1
fiOption 2: Add instruction to configure port
+ # IMPORTANT: Replace 3000 with your app's actual port
# Verify server is responding
if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; thenAlso update line 245 to use the same port or variable.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; then | |
| # Detect server port (check init.sh, .env, or package.json for PORT) | |
| APP_PORT=${PORT:-3000} | |
| # Verify server is responding | |
| if ! curl -f http://localhost:$APP_PORT/api/health && ! curl -f http://localhost:$APP_PORT; then | |
| echo "ERROR: Server failed to start after restart" | |
| exit 1 | |
| fi |
🤖 Prompt for AI Agents
In @.claude/templates/coding_prompt.template.md at line 237, The health check is
hardcoded to port 3000 ("curl -f http://localhost:3000/api/health" and
"http://localhost:3000") but elsewhere the template uses a PORT placeholder,
causing false failures; update the curl checks to use the PORT variable (or
detect the port from init.sh/env) and ensure the same PORT variable is used at
the later line that currently references PORT so both checks are consistent;
target the two curl invocations in the template (the lines containing "curl -f
http://localhost:3000/api/health" and "curl -f http://localhost:3000") and
replace them with a configurable $PORT or a detection mechanism.
Problem
The coding agent can implement in-memory storage (e.g.,
dev-store.tswithglobalThis) instead of a real database. These implementations pass all tests because data persists during runtime, but data is lost on server restart.This is a root cause for #68 - agent "passes" features that don't actually work.
Solution
1. Mandatory Infrastructure Features (indices 0-4)
All projects with a database now include 5 infrastructure features that run FIRST:
All other features depend on indices [0,1,2,3,4], ensuring infrastructure is verified first.
2. STEP 5.7: Server Restart Persistence Test
Added to
coding_prompt.template.md:pkill -f "node")3. Extended Mock Data Detection (STEP 5.6)
Enhanced grep patterns to catch:
globalThis.(in-memory storage)dev-store,devStore,mock-db,mockDbmockData,fakeData,sampleData,dummyDataTODO.*real,STUB,MOCKnew Map()/new Set()as data storesFiles Changed
.claude/templates/initializer_prompt.template.md- Infrastructure features definition.claude/templates/coding_prompt.template.md- STEP 5.6/5.7 enhancements.claude/commands/create-spec.md- Phase 3b database questionBackwards Compatible
Why This Matters
In-memory stores like
globalThis.devStorepass all simple tests because data persists during a single server run. Only a full server restart reveals this bug. Without infrastructure features, the agent can "pass" features with fundamentally broken implementations.Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.