[observability] Observability Coverage Report - 2026-03-04 #160

2026-03-04T09:06:06Z

github-actions[bot]
Bot Mar 4, 2026

Executive Summary

This daily observability report covers the last 7 days of workflow activity in the norrietaylor/tt2 repository, analyzing 30 total runs across 85 registered agentic workflows. All 30 runs occurred on a single active day (2026-03-04). Of the 30 runs, 6 reached the agent execution stage (5 with the full firewall + MCP gateway stack, 1 with the gateway skipped), 3 encountered pre-agent startup failures, 15 were intentionally skipped (event-triggered workflows with no matching criteria), and the remaining 6 were maintenance or CI builds.

For all 6 runs that reached the agent execution stage, AWF Firewall observability achieved 100% coverage — every run produced the "Print firewall logs" step successfully and uploaded engine artifacts. MCP Gateway telemetry achieved 100% coverage for runs where the gateway was actually started (5 of 5 successful agentic runs). One agent job failure (Discussion Task Miner) caused the MCP gateway to be skipped, resulting in no MCP session telemetry for that run — classified as a warning.

Overall observability health is HEALTHY with no critical gaps in completed runs. The main concern for the week is 3 startup_failure runs that indicate pre-agent infrastructure failures, preventing any observability data from being collected for those workflows.

Key Alerts and Anomalies

🔴 Critical Issues:

None detected in runs that reached agent execution stage.

⚠️ Warnings:

Discussion Task Miner (§22662039389): the agent job failed before the MCP Gateway could start, so the Start MCP Gateway step was skipped and no MCP session telemetry was produced. Firewall logs were still collected. Suspected cause: an upstream failure in the agent pre-boot configuration steps (not yet confirmed; see Recommended Actions).
3 Startup Failures: the following workflows had startup_failure conclusions with 0 jobs executed, so no observability data is available for these runs:
- Architecture Diagram Generator (§22660095565)
- Agent Performance Analyzer - Meta-Orchestrator (§22659496484)
- Daily Secrets Analysis Agent (§22656576044)

ℹ️ Informational:

15 runs were intentionally skipped (PR/issue event-triggered workflows that did not match their trigger criteria) — these are N/A for observability.
2 Agentic Maintenance runs (close-expired-entities) are non-agentic housekeeping jobs; no firewall or MCP applies.
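
The categories used in the alerts above can be sketched as a small classifier. This is only an illustration, not the report's actual implementation: the `conclusion` values follow GitHub Actions naming, while the `is_agentic` field and the `classify_run` helper are hypothetical. The sample records are synthetic and only reproduce the counts stated in this report (the remaining 6 non-agentic runs are 30 minus the 24 accounted for, and their conclusions are placeholders).

```python
from collections import Counter
from typing import Dict


def classify_run(run: Dict) -> str:
    """Map one run record to an observability category (hypothetical helper)."""
    conclusion = run.get("conclusion")
    if conclusion == "startup_failure":
        return "startup_failure"   # 0 jobs executed, no telemetry possible
    if conclusion == "skipped":
        return "skipped"           # trigger criteria not met, N/A
    if run.get("is_agentic", False):
        return "agent_executed"    # firewall logs expected, MCP gateway usually too
    return "non_agentic"           # maintenance or CI, no firewall or MCP


# Synthetic records shaped to match the counts in this report.
runs = (
    [{"conclusion": "startup_failure", "is_agentic": True}] * 3
    + [{"conclusion": "skipped", "is_agentic": True}] * 15
    + [{"conclusion": "success", "is_agentic": True}] * 5
    + [{"conclusion": "failure", "is_agentic": True}] * 1   # Discussion Task Miner
    + [{"conclusion": "success", "is_agentic": False}] * 6  # placeholder conclusions
)
tally = Counter(classify_run(run) for run in runs)
```

Note that a run with conclusion `failure` still counts as agent-executed here, which matches how Discussion Task Miner is treated in this report.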

Coverage Summary

Component	Runs Executed	Logs Present	Coverage	Status
AWF Firewall (`access.log` / Print firewall logs)	6	6	100%	✅ Healthy
MCP Gateway (`gateway.jsonl` / `rpc-messages.jsonl`)	5 (gateway started)	5	100%	✅ Healthy
MCP Gateway — gateway skipped (agent failure)	1	0	0%	⚠️ Warning
Startup Failures (pre-agent)	3	0	N/A	⚠️ Warning

Note: Startup failure runs (0 jobs executed) are excluded from coverage percentages as no observability infrastructure could be initialized.
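
The percentages in the table, and the exclusion rule in the note, can be expressed as follows. This is a minimal sketch under the assumption that coverage is simply logs present divided by eligible runs; the function name `coverage_percent` is illustrative and not taken from the report generator.

```python
from typing import Optional


def coverage_percent(logs_present: int, eligible_runs: int) -> Optional[float]:
    """Return coverage as a percentage, or None when no run was eligible."""
    if eligible_runs == 0:
        return None  # reported as N/A rather than 0%
    return 100.0 * logs_present / eligible_runs


# Figures copied from the Coverage Summary table.
firewall = coverage_percent(logs_present=6, eligible_runs=6)
gateway_started = coverage_percent(logs_present=5, eligible_runs=5)
gateway_skipped = coverage_percent(logs_present=0, eligible_runs=1)
# Startup failures executed 0 jobs, so they contribute 0 eligible runs.
startup_failures = coverage_percent(logs_present=0, eligible_runs=0)
```

Returning `None` for the startup failure row keeps "no infrastructure was initialized" distinct from "infrastructure ran but produced no logs", which is the distinction the table draws between N/A and 0%.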

📋 Detailed Run Analysis

Firewall-Enabled Agentic Runs

Workflow	Run ID	Firewall Logs	MCP Gateway	agent-artifacts	Status
Daily Documentation Healer	§22661997681	✅	✅ Started	✅ Uploaded	✅ Healthy
Daily Team Evolution Insights	§22660417753	✅	✅ Started	✅ Uploaded	✅ Healthy
Semantic Function Refactoring	§22657939115	✅	✅ Started	✅ Uploaded	✅ Healthy
Static Analysis Report	§22657934149	✅	✅ Started	✅ Uploaded	✅ Healthy
Copilot Session Insights	§22657245187	✅	✅ Started	✅ Uploaded	✅ Healthy
Discussion Task Miner	§22662039389	✅	❌ Skipped	✅ Uploaded	⚠️ Warning

Startup Failure Runs (No Jobs Executed)

Workflow	Run ID	Date	Conclusion
Architecture Diagram Generator	§22660095565	2026-03-04 07:51	startup_failure
Agent Performance Analyzer - Meta-Orchestrator	§22659496484	2026-03-04 07:29	startup_failure
Daily Secrets Analysis Agent	§22656576044	2026-03-04 05:33	startup_failure

Skipped / N/A Runs (15 total)

The 15 skipped runs came from event-triggered workflows (Plan Command, Daily Test Improver, Documentation Unbloat, Grumpy Code Reviewer, Security Review Agent) across 3 separate trigger batches (runs 22656647878–22656691144). They were skipped because they did not match their activation criteria, which is expected and not an observability gap.

Artifacts Per Healthy Run

All 5 fully-successful agentic runs uploaded the following standard observability artifacts:

prompt — workflow prompt (expires 1 day)
agent-artifacts — contains mcp-logs/, sandbox/firewall/logs/ (expires 90 days)
agent_outputs — agent output files (expires 90 days)
agent-output — structured agent output JSON (expires 90 days)
safe-output — safe output manifest (expires 90 days)
threat-detection.log — threat detection scan result (expires 90 days)
safe-output-items — safe output items manifest (expires 90 days)
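
A check for this artifact set could look like the sketch below. The artifact names are the ones listed above; the `missing_artifacts` helper and the idea of comparing against a fixed set are assumptions for illustration, not a description of how this report was produced.

```python
from typing import Iterable, List

# Artifact names as listed in this report for a healthy agentic run.
EXPECTED_ARTIFACTS = frozenset({
    "prompt",
    "agent-artifacts",
    "agent_outputs",
    "agent-output",
    "safe-output",
    "threat-detection.log",
    "safe-output-items",
})


def missing_artifacts(uploaded: Iterable[str]) -> List[str]:
    """Return the expected artifact names that a run did not upload, sorted."""
    return sorted(EXPECTED_ARTIFACTS - set(uploaded))


complete = missing_artifacts(EXPECTED_ARTIFACTS)
partial = missing_artifacts(EXPECTED_ARTIFACTS - {"agent-artifacts"})
```

Since agent-artifacts carries both mcp-logs/ and sandbox/firewall/logs/, its absence is the case most worth flagging.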

🔍 Telemetry Quality Analysis

Firewall Log Quality

All 6 runs that executed the agent job had the "Print firewall logs" step complete successfully. Key indicators from the current run's environment:

AWF version: v0.23.0
Firewall type: Squid proxy (squid) on 172.30.0.10:3128
AWMG version: v0.1.5
Domains allowlisted: 50+ domains including api.github.com, api.githubcopilot.com, raw.githubusercontent.com, registry.npmjs.org
IPv4 DNAT rules redirect all TCP/80 and TCP/443 to Squid on port 3128
Access logs stored at: /tmp/gh-aw/sandbox/firewall/logs/access.log
API proxy logs at: /tmp/gh-aw/sandbox/firewall/api-proxy-logs/api-proxy.log

Direct reading of access.log is restricted (permission denied from agent container), but the "Print firewall logs" step in the agent job has host-level access and reports success for all 6 runs.
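
For anyone with host-level access to access.log, a per-domain summary could be sketched as below. This assumes the log uses Squid's default native format (timestamp, elapsed, client, result code/status, bytes, method, URL, ...); AWF may configure a custom logformat, in which case the field positions would differ. The sample lines, including the client address, are synthetic and not taken from any run.

```python
from collections import Counter
from typing import Iterable


def summarize_access_log(lines: Iterable[str]) -> Counter:
    """Count requests per (host, verdict), assuming Squid's native log format."""
    summary: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        result_code = fields[3].split("/")[0]  # e.g. TCP_TUNNEL or TCP_DENIED
        target = fields[6]                     # URL, or host:port for CONNECT
        host = target.split("://")[-1].split("/")[0].split(":")[0]
        verdict = "denied" if result_code.startswith("TCP_DENIED") else "allowed"
        summary[(host, verdict)] += 1
    return summary


# Synthetic sample lines for illustration only.
SAMPLE = [
    "1772615166.101     45 172.30.0.20 TCP_TUNNEL/200 5120 CONNECT "
    "api.github.com:443 - HIER_DIRECT/140.82.112.6 -",
    "1772615167.202      0 172.30.0.20 TCP_DENIED/403 3900 CONNECT "
    "example.com:443 - HIER_NONE/- text/html",
]
summary = summarize_access_log(SAMPLE)
```

A non-zero denied count for a domain that a workflow legitimately needs would point at a gap in the allowlist rather than at a telemetry problem.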

MCP Gateway Log Quality

All 5 runs that started the MCP Gateway had "Parse MCP Gateway logs for step summary" succeed:

MCP servers used across runs include: safeoutputs (Safe Outputs MCP HTTP Server), GitHub MCP Server (lockdown-mode evaluated per run)
Engine output files uploaded via "Upload engine output files" step — these contain mcp-logs/ directories
AWF MCP Gateway version: v0.1.5

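For Discussion Task Miner (failure), the "Parse MCP Gateway logs for step summary" step ran and succeeded even though the gateway was never started, most likely because it found empty or missing gateway logs. A successful parse step therefore does not by itself confirm that MCP telemetry exists.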
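
One way to tell "parse step succeeded" apart from "telemetry actually exists" is to count valid records in the gateway log files. The sketch below uses the file names from the Coverage Summary (`gateway.jsonl`, `rpc-messages.jsonl`); the directory layout, the helper names, and the sample record are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def count_jsonl_records(path: Path) -> int:
    """Count lines in a JSONL file that parse as JSON; a missing file counts as 0."""
    if not path.exists():
        return 0
    count = 0
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore malformed lines
            count += 1
    return count


def gateway_telemetry_present(log_dir: Path) -> bool:
    """True if either gateway log file holds at least one valid record."""
    names = ("gateway.jsonl", "rpc-messages.jsonl")
    return any(count_jsonl_records(log_dir / name) > 0 for name in names)


# Demonstration against a temporary directory with a hypothetical record.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    before = gateway_telemetry_present(log_dir)  # no log files yet
    (log_dir / "gateway.jsonl").write_text('{"event": "started"}\n', encoding="utf-8")
    after = gateway_telemetry_present(log_dir)
```

Under this check, a run like Discussion Task Miner would report no telemetry even though its parse step passed.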

Threat Detection Coverage

All 6 executed runs also ran the threat detection scan (secondary Copilot CLI invocation):

All 6 produced threat-detection.log artifacts
Results: {"prompt_injection":false,"secret_leak":false,"malicious_patch":false}
This provides an additional layer of post-execution audit capability
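
The result object above can be checked mechanically. The following is a sketch that assumes the scan result is available as a single JSON object with exactly the three keys shown; how that object is stored inside threat-detection.log is not stated in this report, and `threats_found` is a hypothetical helper.

```python
import json
from typing import List

EXPECTED_KEYS = ("prompt_injection", "secret_leak", "malicious_patch")


def threats_found(result_json: str) -> List[str]:
    """Return the names of the threat flags that are set to true.

    Raises KeyError if an expected key is missing, so an incomplete
    result is never silently treated as clean.
    """
    result = json.loads(result_json)
    return [key for key in EXPECTED_KEYS if result[key]]


clean = threats_found(
    '{"prompt_injection":false,"secret_leak":false,"malicious_patch":false}'
)
flagged = threats_found(
    '{"prompt_injection":false,"secret_leak":true,"malicious_patch":false}'
)
```

Failing loudly on a missing key is a deliberate choice here: for an audit signal, an unreadable result should not look the same as a negative one.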

Healthy Runs Summary

5 of 6 executed runs achieved full observability: AWF Firewall ✅ + MCP Gateway ✅ + threat detection ✅ + all artifacts uploaded ✅.

Recommended Actions

Investigate the 3 startup_failure runs: Architecture Diagram Generator, Agent Performance Analyzer, and Daily Secrets Analysis Agent all failed before any job executed (0 total jobs). This is likely a runner provisioning or workflow configuration issue. Review the run summary pages for these runs directly in the GitHub UI; with 0 jobs executed there may be no job logs to inspect. If these workflows are high-priority, check for quota limits, runner availability, or recent changes to .github/workflows/*.lock.yml files.
Investigate the Discussion Task Miner failure: the agent job failed before the MCP Gateway could start, so the core agent steps were skipped. Review the full agent job log for run §22662039389 to identify the root cause. Once that is fixed, MCP telemetry should be restored for this workflow.
Maintain current artifact retention policy: the current 90-day retention for agent-artifacts (which contains firewall and MCP logs) is appropriate for debugging. Consider adding a dedicated firewall-logs artifact with extended retention if post-incident forensics regularly require logs older than 90 days.
Consider alerting on startup_failure: add a monitoring rule or discussion/issue auto-creation for startup_failure conclusions so that these silent failures do not go unnoticed in future reports.
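
The startup_failure alerting idea could start from something as small as the sketch below. It assumes run records shaped like GitHub workflow run objects with `name`, `id`, and `conclusion` fields; the `startup_failure_alerts` helper and the alert title format are hypothetical, and fetching the runs or creating the issue is left out.

```python
from typing import Dict, List


def startup_failure_alerts(runs: List[Dict]) -> List[str]:
    """Build one alert title per run whose conclusion is startup_failure."""
    return [
        f"[observability] startup_failure: {run['name']} (run {run['id']})"
        for run in runs
        if run.get("conclusion") == "startup_failure"
    ]


# Run names and IDs taken from this report.
runs = [
    {"name": "Architecture Diagram Generator", "id": 22660095565,
     "conclusion": "startup_failure"},
    {"name": "Daily Secrets Analysis Agent", "id": 22656576044,
     "conclusion": "startup_failure"},
    {"name": "Daily Documentation Healer", "id": 22661997681,
     "conclusion": "success"},
]
alerts = startup_failure_alerts(runs)
```

Each returned title could then be used for an auto-created issue or discussion, so a startup failure surfaces even on days when no report is read.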

📊 Historical Context

This report covers only runs from 2026-03-04 (the single active day in the 7-day window). Because all 30 runs occurred on the same date, this window cannot yet confirm a daily cadence or show trends across the 85 registered workflows. Historical trend data will be available once this report has run on multiple consecutive days.

The AWF framework version v0.23.0 and AWMG version v0.1.5 are consistently deployed across all runs, indicating a stable infrastructure baseline.

References:

§22662039389 — Discussion Task Miner (failure, ⚠️ MCP gateway skipped)
§22660095565 — Architecture Diagram Generator (startup_failure)
§22662142196 — This observability report run

Analysis window: Last 7 days | Total runs analyzed: 30 | Agent-executed runs: 6 | Date: 2026-03-04

AI generated by Daily Observability Report for AWF Firewall and MCP Gateway

expires on Mar 5, 2026, 9:06 AM UTC

2026-03-05T09:24:23Z

github-actions[bot]
Bot Mar 5, 2026
Author

This discussion was automatically closed because it expired on 2026-03-05T09:06:05.747Z.

Closed by Workflow
