feat: add dashboard observability viewer (logs + health) #1

Draft
Koan-Bot wants to merge 1 commit into main from koan.atoomic/implement-807

What

Adds a log viewer and health dashboard for real-time observability.

Why

Resolves Anantys-oss#807. Debugging requires manually grepping through run.log and awake.log. This adds structured log access and system health monitoring to the dashboard.

How

  • GET /api/logs — tails run.log/awake.log with source, limit, and substring filter. Deque-based read, 2000 char truncation.
  • GET /logs — viewer page with source selector, text filter, auto-scroll, error/warn highlighting.
  • GET /api/health — disk usage thresholds + process liveness via PID files.
  • Health card on main dashboard — polls /api/health every 60s with colored status dots.
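
The deque-based read described for GET /api/logs can be sketched as follows. This is a minimal illustration, not the PR's code: the function name `tail_log`, the default `limit`, the case-sensitive match, and applying the filter before taking the tail are all assumptions. Only the deque approach and the 2000-character truncation come from the description above.

```python
from collections import deque
from pathlib import Path

MAX_LINE_CHARS = 2000  # truncation length stated in the PR description


def tail_log(path, limit=200, q=None):
    """Return up to `limit` trailing lines of `path`, optionally filtered by substring `q`.

    Hypothetical helper: name, defaults, and filter order are assumptions.
    """
    try:
        with Path(path).open("r", encoding="utf-8", errors="replace") as fh:
            lines = (line.rstrip("\n") for line in fh)
            if q:
                # Case-sensitive substring match (assumption).
                lines = (line for line in lines if q in line)
            # deque(maxlen=...) streams the file and keeps only the newest
            # `limit` lines, so the whole file is never held in memory.
            tail = deque(lines, maxlen=limit)
    except FileNotFoundError:
        return []
    return [line[:MAX_LINE_CHARS] for line in tail]
```

Filtering before the tail means `limit` counts matching lines, which is one plausible reading of "tails ... with substring filter"; filtering after the tail would instead search only the last `limit` lines.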

Testing

12 new unit tests. Full dashboard suite passes (117 tests).

- Add GET /api/logs endpoint: tails run.log and awake.log with
  source, limit, and q (substring filter) params; deque-based read
  avoids loading full files; lines truncated at 2000 chars
- Add GET /logs page: source selector, text filter with debounce,
  line count badge, auto-scroll toggle; error/warn keyword highlight
- Add GET /api/health endpoint: disk usage (ok/warn/error thresholds
  at 85%/95%), run and awake process liveness via PID files
- Add health card to main dashboard: polls /api/health every 60s,
  shows colored dots for disk, run, and awake status
- Add Logs nav link to base.html
- Add 12 unit tests covering all new endpoints (117 total, all pass)

Co-Authored-By: Claude <noreply@anthropic.com>
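
A rough sketch of the two checks behind GET /api/health, under stated assumptions. The 85%/95% thresholds and the use of PID files come from the commit message; the function names, the inclusive (`>=`) boundaries, and the POSIX-only signal-0 probe are illustrative choices, not the PR's implementation.

```python
import os
import shutil
from pathlib import Path

WARN_PCT = 85.0   # thresholds as stated in the commit message
ERROR_PCT = 95.0


def disk_status(used_pct):
    """Map a disk-usage percentage to ok/warn/error (inclusive bounds are an assumption)."""
    if used_pct >= ERROR_PCT:
        return "error"
    if used_pct >= WARN_PCT:
        return "warn"
    return "ok"


def disk_used_pct(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def pid_alive(pid_file):
    """True if `pid_file` holds the PID of a process that currently exists (POSIX only)."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return False
    if pid <= 0:
        return False  # 0 and negatives address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A stale PID file whose number has been reused by an unrelated process would still report as alive here; a stricter check would also compare the process name or start time.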
Koan-Bot pushed a commit that referenced this pull request Apr 2, 2026
e-check on the next iteration when CI completes. This eliminates the risk of fixing something that's already being re-tested.
- **Simplified status guard logic**: Separated the 'pending', non-failure, and no-logs cases into distinct early returns with clear messages, replacing the combined `status not in ("failure",) and not ci_logs` condition.
- **Added test for pending early return**: New `test_ci_pending_returns_early` verifies the early-return behavior when CI is still running.
- **No changes to print statements** (Important #1): The 4 flagged `print(..., file=sys.stderr)` calls with `[ci_check]` prefixes are intentional structured logging, consistent with the codebase convention (e.g., `[ci_queue]` in `check_ci_status`). Not debug leftovers.
- **Test timeout** (Blocking #2): Investigated: all blocking calls (`time.sleep`, network) are properly mocked in the test file. The 120s timeout in the quality report is from running the full 11000+ test suite, not from these specific tests.
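
The separated status guard described above might look roughly like this. The function name `ci_fix_guard`, its return convention, and the message strings are hypothetical; only the three distinct early returns (pending, non-failure, no logs) come from the commit message.

```python
def ci_fix_guard(status, ci_logs):
    """Return a skip message, or None when a fix attempt should proceed.

    Hypothetical helper illustrating three separate early returns.
    """
    if status == "pending":
        return "CI is still running; re-check on the next iteration"
    if status != "failure":
        return f"CI status is {status!r}; nothing to fix"
    if not ci_logs:
        return "CI failed but no logs are available"
    return None
```

Compared with a single combined condition, each branch reports its own reason, so a log reader can tell a pending run apart from a passing one or a failure without logs.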
Koan-Bot added a commit that referenced this pull request Apr 2, 2026
The decompose prompt let Claude use #1, #2, etc. to cross-reference
sub-issues in their Dependencies section. After creation on GitHub,
these became links to unrelated issues. Now:

- Prompt uses SUB-N placeholders instead of #N syntax
- After all sub-issues are created, a post-creation pass replaces
  SUB-N with the real #<number> via gh issue edit
- New issue_edit() helper in github.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
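
The post-creation pass could be sketched as below, assuming a mapping from placeholder index to the issue number GitHub assigned. The name `resolve_placeholders` and the mapping shape are hypothetical, and writing the result back (which the commit does through `gh issue edit`) is left out.

```python
import re

# Matches SUB-1, SUB-12, ... as whole tokens.
SUB_REF = re.compile(r"\bSUB-(\d+)\b")


def resolve_placeholders(body, created):
    """Replace SUB-N with #<issue number>; `created` maps N to the real number."""
    def swap(match):
        n = int(match.group(1))
        # Leave unknown placeholders untouched rather than guess a number.
        return f"#{created[n]}" if n in created else match.group(0)

    return SUB_REF.sub(swap, body)
```

For example, `resolve_placeholders("Depends on SUB-1 and SUB-2", {1: 812, 2: 813})` yields `"Depends on #812 and #813"` (the issue numbers here are made up for illustration).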
Koan-Bot added a commit that referenced this pull request Apr 3, 2026
Koan-Bot added a commit that referenced this pull request Apr 5, 2026
 on `recover_missions()`** (`recover.py:171`): Changed `-> int` to `-> tuple` to match the new `(count, escalated_list)` return value, per reviewer's Important #1.
- **Eliminated TOCTOU race in `recover_missions()`** (`recover.py:190-192`): Replaced `if not missions_path.exists()` with `try/except FileNotFoundError` pattern, consistent with the same fix applied to `pick_mission.py` in this PR, per reviewer's Important #2.
- **Replaced `print(..., file=sys.stderr)` with `log.warning()` in `_reset_failure_count()`** (`pr_review_learning.py:343-344`): Uses the module's existing logger for consistency with the `log.warning()` calls added in `github_notifications.py` in this same PR, per reviewer's Suggestion #1. Also resolves the quality report's "debug print statement" flag.
- **Added return type annotation to `fallback_extract()`** (`pick_mission.py:21`): Changed bare `-> tuple` to `-> tuple[str | None, str | None]` for clarity on the nullable return values, per reviewer's Suggestion #2.
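
The TOCTOU fix mentioned above follows a common pattern: attempt the read and handle the missing-file case, instead of checking `exists()` first. A minimal sketch, with a hypothetical `read_missions` helper standing in for the relevant part of `recover_missions()`:

```python
from pathlib import Path


def read_missions(missions_path):
    """Return the file's text, or None if it does not exist.

    Hypothetical helper. An exists()-then-read sequence leaves a window in
    which the file can vanish between the check and the read; catching
    FileNotFoundError on the read itself closes that window.
    """
    try:
        return Path(missions_path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
```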
Koan-Bot added a commit that referenced this pull request Apr 6, 2026
nable change is adding the stale-CI caveat to the prompt.

Here's my summary of changes:

- **Added stale-CI caveat to `koan/skills/core/rebase/prompts/ci_fix.md`** per reviewer request: the pre-push CI check inspects results from *before* the rebase, so failures may already be resolved. Added an "Important Context" section warning Claude that the logs are from before the rebase, and added step 2 ("Cross-check against the current diff") and step 7 ("If all failures appear to be already resolved, make no changes") to prevent unnecessary or harmful fixes based on stale CI results.
- **No changes needed for missing mocks** (reviewer concerns #1 and #3): verified that all existing `run_rebase` integration tests already include `@patch('app.rebase_pr._fix_existing_ci_failures', return_value=False)`, and the 10 new tests in two classes (`TestCheckExistingCi` with 5 tests, `TestFixExistingCiFailures` with 5 tests) are complete and present in the test file; the truncation was only in the PR diff view.
Koan-Bot added a commit that referenced this pull request Apr 7, 2026
Koan-Bot added a commit that referenced this pull request Apr 7, 2026
Development

Successfully merging this pull request may close these issues.

Dashboard: Observability and structured logging viewer
