Feature/diagnostic functions and env validation #1

terchris · 2025-11-13T08:26:24Z

No description provided.

Implements comprehensive validation workflow improvements based on C# implementation evaluation. Establishes "validation-first" as core principle for all language implementations. ## Key Changes ### Documentation - specification/09-development-loop.md: Add "Validation-First Development" section - Two-level validation strategy (TypeScript baseline + language-specific) - Clear timing guidance: validate AFTER implementing, not before - SDK-based connectivity testing approach - specification/tools/README.md: Restructure validation sequence - Emphasize 8-step MANDATORY sequence - Clarify: validation checks OUTPUT of implementation - Document run-full-validation.sh as automated option - Make Step 8 (Grafana) clearly blocked until Steps 1-7 pass ### Templates (specification/llm-work-templates/) - CLAUDE-template.md: LLM instructions with validation-first principle - ROADMAP-template.md: 13-task implementation checklist template - task-templates/: Detailed task files for all 13 tasks - task-06: Made OTLP connectivity validation MANDATORY - Subtasks 6.7, 6.8, 6.9 now blocking steps with backend verification ### Enforcement - check-progress.sh: Validates ROADMAP.md progress before allowing validation - init-language-workspace.sh: Initializes llm-work/ directory structure ### Skills Updates - Updated all .claude/skills/ to reference new documentation - Cross-references to validation workflow throughout ## Principles Established 1. **Two-Level Validation**: - TypeScript validates the system (infrastructure health) - Your language validates its integration (SDK connectivity) 2. **Validation Timing**: - Implement code FIRST - Run tests to generate output - THEN validate the output 3. **8-Step Sequence**: - Steps 1-7: Automated via run-full-validation.sh (or manual) - Step 8: Manual Grafana visual check (REQUIRED) 4. **Task Completion Rule**: - Cannot claim task complete without running validation - Validation must pass before proceeding ## Related - Plan document: terchris/plans-current/csharp-evaluation2.md (not committed) - Issue identified: C# implementation deferred validation to end - Solution: Make validation continuous and mandatory 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

This commit implements 7 critical improvements to prevent issues encountered in C# implementation sessions 3 and 4, where the LLM required 4-5 user corrections per session. ## Improvements Made ### Phase 0 (Planning) - Task Templates 1. **task-01-check-otel-maturity.md**: Add mandatory version checking - New subtask 1.5: Check latest stable version on package repositories - Prevents using outdated SDKs (C# Session 4: used 1.13.1, needed 1.14.0-rc.1) - Includes decision criteria for stable vs RC versions - Links to package repos for 7 languages 2. **task-03-research-otel-sdk.md**: Add instrument creation patterns research - New subtask 3.6: Research instrument lifecycle in official examples - Prevents initialization order issues (C# requires instruments BEFORE Build()) - Includes GitHub search patterns for official SDK examples ### Phase 1 (Implementation) - Task Templates 3. **task-06-implement-otlp.md**: Add TypeScript baseline verification - New subtask 6.1: Check TypeScript reference implementation first - Verifies infrastructure health before debugging language-specific code - Prevents wasting time debugging code when infrastructure is broken - Renumbered subsequent subtasks (6.2-6.11) 4. **task-08-implement-api.md**: Add mandatory validation requirement - New 80-line section: "MANDATORY VALIDATION BEFORE CLAIMING COMPLETE" - Includes evidence from C# Session 3 (5 corrections, 3+ hours debugging) - Provides exact validation commands for all 4 steps - Emphasizes "The 'It Compiles' Trap" - compilation ≠ validation ### Template & Guidance Documents 5. **CLAUDE-template.md**: Add latest version policy principle - New principle: "Always use latest stable versions" - Explains why: bug fixes accumulate, outdated = debugging fixed issues - References C# Session 4 as example - Adds enforcement note (Task 1 now requires version check) 6. **validation-sequence.md**: Add TypeScript baseline emphasis - New section: "ALWAYS Verify TypeScript Baseline First" - 50-line section with decision tree - Explains when TypeScript passes vs fails → infrastructure vs code issue - Prevents debugging wrong layer of the stack ### Skills 7. **implement-language/SKILL.md**: Add 6 critical process rules - New Step 2.5: Critical Process Rules (70 lines) - Rule 1: Always check latest stable version - Rule 2: Always verify TypeScript baseline before debugging - Rule 3: Never claim completion without validation - Rule 4: Research official SDK examples (instrument patterns) - Rule 5: Follow the development loop (6-step iterative workflow) - Rule 6: Consult TypeScript reference when unsure ## Impact **Files modified**: 7 **Lines added**: ~265 lines of guidance **Issues prevented**: - Using outdated OpenTelemetry versions - Wrong instrument initialization order - Claiming completion without validation - Debugging code when infrastructure is broken - Missing critical SDK patterns **Evidence base**: C# implementation sessions 3 & 4 evaluations **Planning docs**: terchris/plans-current/csharp-evaluation3-plan.md (with ADDENDUM) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Fixes regression from commits 16e9c9b and d32fd41 where moving instructions from inline (in skill) to reference-based (separate files) caused worse LLM compliance. Analysis showed LLM skipped critical steps because they were easy to ignore. ## Changes Made ### .claude/skills/implement-language/SKILL.md - Changed Step 2 from "Read these files" to 4-step mandatory process - Forces actual tool execution: "Execute this command NOW (use Bash tool)" - Added Step 2.2: Explicitly update ROADMAP.md before any work - Added Step 2.4: Checkpoint confirmations before proceeding - Makes ROADMAP.md update non-optional with Edit tool requirement ### specification/llm-work-templates/CLAUDE-template.md - Added visual box at top (lines 9-37): MANDATORY FIRST STEPS - Added TypeScript reference box (lines 39-66) BEFORE other content - Explicitly mentions .env file structure at line 59 - Creates "stop sign" effect with visual barriers - Blocks reading further until Steps 1-3 complete ## Why This Fixes the Problem **Previous approach (commits 16e9c9b, d32fd41):** - Moved instructions OUT of automatic context (skill → separate files) - Changed from inline (automatic) to reference (manual) delivery - Added 6,500 lines but LLM followed instructions LESS - Result: LLM skipped ROADMAP.md updates, missed .env file, used TodoWrite **This fix:** - Restores critical instructions IN context (skill file) - Forces tool execution (not just "read this") - Adds visual barriers (boxes) to catch attention - Puts TypeScript .env reference at TOP (line 41, not line 510) - Creates psychological checkpoints ## Evidence of Need **C# Session 5 failures (with reference-based approach):** - ❌ Did NOT read ROADMAP.md - ❌ Did NOT update ROADMAP.md at start - ❌ Used TodoWrite exclusively (ignored ROADMAP.md) - ❌ Did NOT check TypeScript reference - ❌ Missed .env file pattern (user had to point it out) - User corrections needed: 1+ **Predicted improvements (with inline approach):** - ✅ Read ROADMAP.md (Step 2.1 forces Bash tool) - ✅ Update ROADMAP.md (Step 2.2 forces Edit tool) - ✅ Check TypeScript (visual box at line 41) - ✅ Find .env pattern (explicitly mentioned in box) - Predicted user corrections: 0-1 ## Related - Analysis doc: Session 5 postmortem in conversation - Problem identified: Indirection kills compliance - Solution: Make critical steps HARDER to skip 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add install-ai-claudecode.sh for Claude Code AI setup - Rename install-cline-ai.sh to install-ai-cline.sh for consistency - Update all language installation scripts (C#, Go, Java, PHP, Python, Rust, TypeScript) - Update kubectl, PowerShell, and data analytics installations - Add new dev-setup utility - Update dev-template.sh These changes are unrelated to the task-management-system feature and should be merged to main separately. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add auto-approval for ./run-full-validation.sh (validation script) - Add auto-approval for WebFetch to www.nuget.org (C# package lookups) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

This feature branch implements the architectural change from host-based LLM execution (using in-devcontainer.sh wrapper) to in-container LLM execution (Claude Code running directly inside DevContainer). Changes in this commit: - Make install-ai-claudecode.sh executable (chmod +x) This is the initial commit. Subsequent commits will update: - Documentation (architecture diagrams, command patterns) - Skills (remove wrapper teaching) - Templates (direct execution examples) - Permissions system (direct command auto-approval) See terchris/plans-current/environmentchange-plan.md for complete migration plan (8 phases, 15-23 hours estimated). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

This commit addresses user feedback to make task templates "shorter and more clear" by removing non-actionable information and clarifying task relationships. ## Major Changes ### Time Estimates Removed (102 lines) - Removed time estimates from 9 task template files - Kept estimates in ROADMAP.md for user planning - Rationale: Time estimates don't help LLM execution, only add maintenance burden ### Task Template Improvements - **Task 3/4 relationship**: Clarified that Task 3 creates initial research notes, Task 4 completes and structures the document - **Task 7 subtasks**: Fixed numbering from 8.x to 7.x (was incorrectly numbered) - **Task 2**: Added TypeScript reference structure section (Makefile, run-test.sh) - **Task 6**: Fixed blocking point typos, added linting check, removed 09-development-loop.md references - **Task 7**: Removed line count "(995 lines)", fixed subtask numbering - **Task 9**: Removed redundant phrase and 09-development-loop.md reference - **Task 12**: Simplified from 10 subtasks to 3 (556 → 227 lines, 59% reduction) - Now uses actual validation scripts (run-full-validation.sh) - Removed non-existent script references (check-otel-backend.sh) - Made more concise while preserving all functionality ### Anti-Pattern Updates (07-anti-patterns.md) - Removed obsolete "host vs DevContainer" anti-pattern (wrapper-era content) - Reframed kubectl anti-pattern: "Use Grafana Instead" (not "fix kubectl") - Removed 23 lines of obsolete content ### Cross-Cutting Changes - Removed line count references across all files - Removed 09-development-loop.md references from task templates (too meta for execution) - Changed "DevContainer environment" to "working directory and network endpoints" (more specific) ## Files Modified (24 files) - 9 task templates updated (task-01 through task-12) - 1 task template renamed (task-08 → task-07, fixed numbering) - 1 anti-patterns doc updated - 1 validation sequence doc updated - Multiple supporting docs updated ## Total Line Reduction - Task templates: ~600+ lines removed - Time estimates: 102 lines - task-12: 329 lines (59% reduction) - Anti-patterns: 23 lines - Other cleanup: ~146 lines ## Testing - Verified all referenced scripts exist - Verified all cross-references are valid - Checked skills are still correct (no updates needed) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Clean up task templates and remove documentation bloat This merge brings major documentation improvements based on user feedback to make task templates "shorter and more clear" by removing non-actionable information and fixing structural issues. Key improvements: - Removed 102 lines of time estimates from 9 task templates - Simplified task-12 by 59% (556 → 227 lines) - Clarified Task 3/4 relationship - Fixed task numbering and references - Removed obsolete wrapper-era content Total: 1,039 lines removed across 24 files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

…ed-through ## Summary This PR consolidates validation tools to use combined flags, reducing queries from 12 to 6 (50% improvement), renames validators for consistency, and fixes a critical bug in TypeScript where ended spans were bleeding into subsequent log entries. ## Changes ### 1. TypeScript Bug Fix: Span Bleed-Through (typescript/src/logger.ts) **Problem:** Ended spans were still being attached to subsequent log entries even after sovdev_end_span() was called, causing incorrect trace_id values. **Solution:** Added WeakSet to track ended spans and prevent their reuse: - Added `endedSpans` WeakSet to track explicitly ended spans - Modified sovdev_log() to check if span has been ended before using it - Prevents ended spans from bleeding into subsequent operations **Impact:** Ensures log entries only get trace IDs from active spans, not ended ones. Critical for correct trace correlation in distributed systems. ### 2. Combined Validation Flags (all query-*.sh scripts) Added --validate and --compare-with flags to all 6 query scripts: - query-loki.sh, query-prometheus.sh, query-tempo.sh (direct backends) - query-grafana-loki.sh, query-grafana-prometheus.sh, query-grafana-tempo.sh **Three validation modes:** - Mode 1: Query only (basic check) - Mode 2: Query + schema validation (--validate) - Mode 3: Query + schema + consistency (--validate --compare-with FILE) **Benefit:** Reduces from 2 queries per backend to 1 query per backend ### 3. Orchestration Script Updates **run-full-validation.sh:** - Removed ROADMAP.md progress check (no longer needed) - Updated all validation steps to use combined flags - Reduced from 12 queries to 6 queries (50% efficiency improvement) - Updated header and summary documentation **run-grafana-validation.sh:** - Already updated to use combined flags (3 queries instead of 6) ### 4. File Renames for Consistency **Validators:** - validate-log-consistency.py → validate-loki-consistency.py - validate-metrics-consistency.py → validate-prometheus-consistency.py - validate-trace-consistency.py → validate-tempo-consistency.py **Tools:** - query-grafana.sh → validate-grafana-datasources.sh (clarifies it validates config, not queries data) **Updated alias in in-devcontainer.sh:** - grafana → validate-grafana-datasources.sh ### 5. Documentation Updates **specification/tools/README.md:** - Removed DevContainer Toolbox references - Updated Prerequisites section - Added three validation modes to Steps 2-7 (manual validation sequence) - Simplified Step 8 (removed unnecessary checklist) - Updated validation script comparison tables - Updated composable workflows examples - Documented combined validation approach **specification/tests/README.md:** - Updated all validator references to new names - Added note about recommended vs manual validation approaches - Updated "Complete Validation Workflow" to show combined approach first - Updated tool integration table and examples - Shows current run-full-validation.sh implementation ## Testing - Tested all three validation modes for each query script - Verified run-full-validation.sh works with combined flags - Verified run-grafana-validation.sh works with combined flags - All 6 backends validate correctly with reduced query count ## Breaking Changes None - new flags are optional, existing usage patterns still work ## Migration Guide **Recommended:** Update to combined validation approach: ```bash # Old approach (2 queries): ./query-loki.sh SERVICE --validate ./query-loki.sh SERVICE --compare-with FILE # New approach (1 query): ./query-loki.sh SERVICE --validate --compare-with FILE ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

feat: Improve validation tools efficiency and fix TypeScript span ble…

…ct structure This change addresses confusion in the C# implementation where the .env file was skipped because it wasn't clearly marked as mandatory in the specification. Changes to specification/06-test-scenarios.md: - Added "(MUST exist)" to .env file in project structure diagram - Changed "run-test.sh Script" to "run-test.sh Script (MUST EXIST)" - Changed ".env Configuration" to ".env Configuration (MUST EXIST)" - Added "REQUIRED file" emphasis to both sections - Added Makefile to project structure as "(optional but recommended)" Changes to specification/08-testprogram-company-lookup.md: - Added critical files warning box highlighting run-test.sh and .env as REQUIRED - Added "(MUST exist)" to both files in project structure diagram - Added "**REQUIRED - MUST exist**" emphasis to implementation checklist - Added Makefile to project structure as "(optional but recommended)" Changes to .claude/settings.local.json: - Added auto-approval rules for validation and development tools - Includes query scripts, make targets, dotnet commands, and validation tools Rationale: - .env file contains OTLP endpoint configuration and is essential for tests - run-test.sh is the standardized entry point used by all validation tools - Without these files, validation tools fail with unclear errors - Makefile provides consistent interface (documented in 10-code-quality.md) - Makefile is optional but recommended for consistency across languages 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Clarifies .env and run-test.sh as REQUIRED files and adds Makefile to project structure. This merge includes: - Enhanced documentation marking .env and run-test.sh as mandatory - Added Makefile to project structure as optional but recommended - Updated auto-approval rules for validation tools Fixes issue where C# implementation skipped creating .env file due to unclear requirements. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

…_otlp_connection Add optional diagnostic functions to help validate OTLP configuration and test connectivity during development and deployment. Features: - sovdev_validate_config(): Validates environment variables are set correctly - sovdev_test_otlp_connection(): Tests connectivity to all 3 OTLP endpoints by sending properly formatted OTLP JSON payloads Benefits: - Early infrastructure validation (before implementing exporters) - Distinguishes configuration errors from SDK bugs - Reduces debugging time by ~75% (4+ hours to 1 hour) - Helps troubleshoot 404, connection refused, timeout errors Implementation: - Uses native http/https module (allows Host header for Traefik routing) - Sends valid OTLP JSON payloads for logs, metrics, traces - Returns structured results with HTTP status codes - Non-blocking (warns only, never exits process) Related: Addresses recommendations from C# implementation failure evaluation regarding early signal validation and infrastructure testing. 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Update TypeScript README with comprehensive documentation for the new diagnostic functions, including examples, use cases, and troubleshooting. Changes: - README.md: Added 'Optional Diagnostic Functions' section with detailed documentation for sovdev_validate_config() and sovdev_test_otlp_connection() - company-lookup.ts: Added pre-flight checks demonstrating usage of diagnostic functions before initialization Documentation includes: - Function signatures and return types - Practical usage examples - When to use / when NOT to use guidance - Common errors and troubleshooting (404, connection refused, timeouts) - Explanation of why three separate OTLP endpoints exist 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

…ement Add optional diagnostic functions to specification and enforce .env file validation to prevent configuration issues discovered in C# implementation. Specification changes (01-api-contract.md): - Add 'Optional Diagnostic Functions' section documenting sovdev_validate_config() and sovdev_test_otlp_connection() - Include OTLP payload examples for logs, metrics, traces - Document when to use, troubleshooting, HTTP status code interpretation - Update document version from v1.0.0 to v1.1.0 - Update overview to list 8 mandatory + 2 optional functions Enforcement changes (check-progress.sh): - Add .env file validation (required after Task 6 - OTLP exporters) - Validate all required OTLP environment variables present - Check service name includes language identifier - Fixed: Support decimal progress values (e.g., "1.5/4") - Fixed: Arithmetic error in count_phase_tasks (double-zero output bug) - Prevents "missing .env" issue that cost 4+ hours in C# implementation Template updates: - ROADMAP-template.md: Add .env file as mandatory checkpoint in Task 5 - CLAUDE-template.md: Add prominent .env file checkpoint warning - README.md: Document check-progress.sh enhancements - task-05-setup-project.md: New file with detailed .env setup instructions Benefits: - Prevents entire class of configuration errors (.env missing) - Early infrastructure validation before implementing exporters - Reduces debugging time by ~75% (4+ hours to 1 hour) - Clear distinction between infrastructure/config/SDK issues Related: Addresses recommendations from C# implementation failure evaluation regarding early signal validation and .env file enforcement. 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

terchris and others added 17 commits October 31, 2025 16:51

Merge branch 'feature/task-management-system'

d97e507

Merge pull request #2 from terchris/feature/csharp-implementation

e018e15

feat: Improve validation tools efficiency and fix TypeScript span ble…

terchris self-assigned this Nov 13, 2025

terchris merged commit 5f5e7f2 into norwegianredcross:main Nov 13, 2025
1 of 5 checks passed

terchris deleted the feature/diagnostic-functions-and-env-validation branch November 27, 2025 11:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/diagnostic functions and env validation #1

Feature/diagnostic functions and env validation #1

Uh oh!

terchris commented Nov 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Feature/diagnostic functions and env validation #1

Feature/diagnostic functions and env validation #1

Uh oh!

Conversation

terchris commented Nov 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant