doc(test): document per-test log paths and flaky-test debugging recipe#5609
doc(test): document per-test log paths and flaky-test debugging recipe#5609renecannao merged 2 commits intov3.0from
Conversation
Two gaps in test/README.md surfaced while investigating flaky TAP test failures on PR #5596: 1. The README said "Check logs in ci_infra_logs/${INFRA_ID}/tests/" but did not document the actual layout below that — per-test TAP output lives three directories deep at: ci_infra_logs/${INFRA_ID}/tests/proxysql-tester.py/tests/<test>-t.log.gz The files are gzipped, which makes `cat` and most editors do the wrong thing (binary garbage). A maintainer opening one with vim without knowing to use zless burns several minutes before figuring it out. 2. There was no guidance on how to reproduce a flaky test locally for stress-testing. The typical question "this test passed twice locally and fails on CI - is it really flaky, or is it my environment?" has no canonical answer anywhere in the repo. The natural recipe (reuse one infra, loop N runs, snapshot per-run logs, diff) is described in enough detail for a maintainer to copy/paste it without thinking. This commit adds: - A new "Where logs actually live after a run" section showing the full ci_infra_logs/ directory tree with per-backend, per-ProxySQL, and per-test subdirs. - Concrete zless/zcat/zgrep/zdiff one-liners for reading the gzipped per-test logs (including the cross-test grep idiom). - A new "Debugging a flaky test" section with a copy-pasteable loop that runs the same test 20 times against one infra, snapshotting per-attempt per-test logs for later diffing. - An expanded Troubleshooting bullet list: how to recover from lost INFRA_ID (via `docker network ls`), where each flavor of log lives, and how to clean up stale docker state. No code or workflow changes - doc-only update.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughUpdated test README to document exact on-disk log locations after runs, include per-backend and per-test gzipped log layout, provide zless/zcat/zgrep/zdiff inspection commands, and expand flaky-test local reproduction and Docker troubleshooting steps (specific paths and targeted cleanup). Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request significantly improves the testing documentation in test/README.md by adding a detailed guide on log locations, a workflow for debugging flaky tests locally, and enhanced troubleshooting instructions. The review feedback suggests refining the grep pattern used to identify test failures to ensure consistency with TAP output and recommending the removal of specific Docker networks instead of a global prune to avoid unintended side effects on other projects.
|
|
||
| ```bash | ||
| # Which attempts had any FAIL? | ||
| grep -l 'FAIL [1-9]' /tmp/flake-*.log |
There was a problem hiding this comment.
The grep pattern 'FAIL [1-9]' might not capture all failure scenarios, especially since TAP output (the standard for these tests) uses not ok to indicate a failed test. For consistency with the zgrep example provided in the "Where logs actually live" section, consider using a pattern that includes both not ok and FAIL.
| grep -l 'FAIL [1-9]' /tmp/flake-*.log | |
| grep -lE 'not ok|FAIL' /tmp/flake-*.log |
| - **"Directory Not Empty"**: Run `./test/infra/control/stop-proxysql-isolated.bash` with the same `INFRA_ID` that was used when you started the infra. If you lost the ID, `docker network ls` will show you active `*_backend` networks — each one is a stuck infra; the name before `_backend` is the `INFRA_ID`. | ||
| - **Container issues**: Check logs in `ci_infra_logs/${INFRA_ID}/infra-*/` (per-backend) and `ci_infra_logs/${INFRA_ID}/proxysql/` (ProxySQL side). | ||
| - **Test failures**: Read the per-test `.log.gz` files under `ci_infra_logs/${INFRA_ID}/tests/proxysql-tester.py/tests/` with `zless` or `zcat` — see the "Where logs actually live" section above for the full layout. | ||
| - **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network prune` after stopping infras cleans up dangling networks. |
There was a problem hiding this comment.
docker network prune is a global operation that removes all unused networks on the system, which might affect other Docker projects running on the same host. It is safer to suggest removing the specific network associated with the INFRA_ID first, while keeping prune as a secondary option for general cleanup.
| - **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network prune` after stopping infras cleans up dangling networks. | |
| - **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network rm ${INFRA_ID}_backend` cleans up the specific network, or use `docker network prune` to clean all dangling networks. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/README.md`:
- Around line 74-92: Add the missing language tag to the fenced code block that
begins with the directory tree starting "ci_infra_logs/${INFRA_ID}/": change the
opening triple-backtick fence from ``` to ```text so the tree block is marked as
plain text and markdownlint MD040 is satisfied; update the fence that wraps the
block containing lines like "infra-mysql57/" and "proxysql-tester.py/"
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ec7f8c18-757f-4b92-880d-072c166d2a76
📒 Files selected for processing (1)
test/README.md
📜 Review details
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure
Applied to files:
test/README.md
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
Applied to files:
test/README.md
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: See `doc/agents/project-conventions.md` for ProxySQL-specific rules including directories, build, test harness, and git workflow
Applied to files:
test/README.md
🪛 markdownlint-cli2 (0.22.0)
test/README.md
[warning] 74-74: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (1)
test/README.md (1)
156-156: The README wording at line 156 is accurate and requires no changes. The Docker network name is constructed as${INFRA_ID}_backend(not as a composite with INFRA prefix). Therefore, the instruction to extract INFRA_ID by taking the text before_backendis correct.> Likely an incorrect or invalid review comment.
…ction Three minor corrections from the PR #5609 review: 1. (coderabbit, MD040) The directory-tree fenced block at L74 had no language tag. Added `text` so markdownlint and GitHub's renderer treat it as plain text rather than guessing a language that would then syntax-highlight the tree characters incorrectly. 2. (gemini, L142) The flake-debug loop's "which attempts failed" grep was too narrow: `grep 'FAIL [1-9]'` only matches the proxysql-tester.py summary line "SUMMARY: 'tests' PASS N/M : FAIL K/M" when K is non-zero - but misses tests that fail at the TAP level via "not ok" without incrementing the FAIL counter (e.g. when the binary crashes or times out). Broadened to `grep -lE 'not ok|FAIL [1-9]'` so both paths are caught, matching the idiom already used in the "Where logs actually live" section above. 3. (gemini, L159) The Troubleshooting bullet suggested `docker network prune` as the cleanup idiom for dangling networks. `network prune` is a GLOBAL operation - it would also wipe networks belonging to unrelated Docker projects on the same host, which is user-hostile if a maintainer has other test infras running in parallel. Changed to prefer the targeted form `docker network rm "${INFRA_ID}_backend"` and explicitly call out why `prune` is the wrong default. No change to the "Where logs actually live" section, the "Debugging a flaky test" loop itself, or any of the other content. Doc-only update on top of the earlier commit.
|



Summary
Fixes two documentation gaps in
test/README.mdthat surfaced while investigating the flaky TAP test failures reported on PR #5596.Gap 1: per-test log paths were undocumented
The README said "Check logs in
ci_infra_logs/\${INFRA_ID}/tests/" but did not document what's actually under that directory. Per-test TAP output lives three directories deep at:and the files are gzipped. A maintainer opening one with
catorvimwithout knowing to usezlessburns minutes figuring it out.Fix: new section "Where logs actually live after a run" showing the full directory tree (per-backend, per-ProxySQL, per-test subdirs) and the concrete
zless/zcat/zgrep/zdiffone-liners for reading the gzipped output.Gap 2: no guidance for reproducing flaky tests locally
Until today there was no canonical answer to "this test passed locally and failed on CI — is it really flaky, or is it my environment?". The natural recipe (reuse one infra, loop N runs against the same running ProxySQL, snapshot per-run logs, diff) was nowhere in the repo.
Fix: new section "Debugging a flaky test" with a copy-pasteable loop that runs the same test 20 times against one infra lifecycle, stashes per-attempt per-test logs under
/tmp/flake-runs/\$i/, and shows thezdiffidiom for comparing a failing attempt's TAP output against a passing one.Other small improvements
INFRA_ID(viadocker network ls— each*_backendnetwork is a stuck infra).infra-*/, per-ProxySQL underproxysql/, per-test undertests/proxysql-tester.py/tests/).docker ps -a | grep,docker network prune).Test plan
test/README.mdparses as valid markdown, line count went from 73 → 159 (+89 net)No code or workflow changes. Doc-only.
Related
.log.gzin CI-uploaded artifacts (PR ci: fix upload-artifact EACCES in all downstream test reusables #5608 fix), or (b) reproduce locally following this doc's new recipe.Summary by CodeRabbit