Skip to content

doc(test): document per-test log paths and flaky-test debugging recipe#5609

Merged
renecannao merged 2 commits intov3.0from
v3.0-doc-test-readme
Apr 11, 2026
Merged

doc(test): document per-test log paths and flaky-test debugging recipe#5609
renecannao merged 2 commits intov3.0from
v3.0-doc-test-readme

Conversation

@renecannao
Copy link
Copy Markdown
Contributor

@renecannao renecannao commented Apr 11, 2026

Summary

Fixes two documentation gaps in test/README.md that surfaced while investigating the flaky TAP test failures reported on PR #5596.

Gap 1: per-test log paths were undocumented

The README said "Check logs in ci_infra_logs/\${INFRA_ID}/tests/" but did not document what's actually under that directory. Per-test TAP output lives three directories deep at:

ci_infra_logs/${INFRA_ID}/tests/proxysql-tester.py/tests/<test>-t.log.gz

and the files are gzipped. A maintainer opening one with cat or vim without knowing to use zless burns minutes figuring it out.

Fix: new section "Where logs actually live after a run" showing the full directory tree (per-backend, per-ProxySQL, per-test subdirs) and the concrete zless / zcat / zgrep / zdiff one-liners for reading the gzipped output.

Gap 2: no guidance for reproducing flaky tests locally

Until today there was no canonical answer to "this test passed locally and failed on CI — is it really flaky, or is it my environment?". The natural recipe (reuse one infra, loop N runs against the same running ProxySQL, snapshot per-run logs, diff) was nowhere in the repo.

Fix: new section "Debugging a flaky test" with a copy-pasteable loop that runs the same test 20 times against one infra lifecycle, stashes per-attempt per-test logs under /tmp/flake-runs/\$i/, and shows the zdiff idiom for comparing a failing attempt's TAP output against a passing one.

Other small improvements

  • Troubleshooting bullet list expanded with:
    • How to recover from a lost INFRA_ID (via docker network ls — each *_backend network is a stuck infra).
    • Where each flavor of log lives (per-backend under infra-*/, per-ProxySQL under proxysql/, per-test under tests/proxysql-tester.py/tests/).
    • How to clean up dangling docker state (docker ps -a | grep, docker network prune).

Test plan

  • test/README.md parses as valid markdown, line count went from 73 → 159 (+89 net)
  • Visual review in GitHub's rendered markdown once the PR is open

No code or workflow changes. Doc-only.

Related

Summary by CodeRabbit

  • Documentation
    • Specifies exact on-disk log locations and layout (per-backend, ProxySQL, per-test gzipped artifacts).
    • Adds concrete inspection commands for gzipped logs (zless, zcat, zgrep, zdiff).
    • Expands flaky-test troubleshooting with a reuseable infra lifecycle, per-attempt log capture/stash, and comparison workflow.
    • Clarifies detection and cleanup of stale docker networks using docker network ls and docker network rm.

Two gaps in test/README.md surfaced while investigating flaky TAP test
failures on PR #5596:

1. The README said "Check logs in ci_infra_logs/${INFRA_ID}/tests/" but
   did not document the actual layout below that — per-test TAP output
   lives three directories deep at:

     ci_infra_logs/${INFRA_ID}/tests/proxysql-tester.py/tests/<test>-t.log.gz

   The files are gzipped, which makes `cat` and most editors do the
   wrong thing (binary garbage). A maintainer opening one with vim
   without knowing to use zless burns several minutes before figuring
   it out.

2. There was no guidance on how to reproduce a flaky test locally
   for stress-testing. The typical question "this test passed twice
   locally and fails on CI - is it really flaky, or is it my
   environment?" has no canonical answer anywhere in the repo. The
   natural recipe (reuse one infra, loop N runs, snapshot per-run
   logs, diff) is described in enough detail for a maintainer to
   copy/paste it without thinking.

This commit adds:

- A new "Where logs actually live after a run" section showing the
  full ci_infra_logs/ directory tree with per-backend, per-ProxySQL,
  and per-test subdirs.
- Concrete zless/zcat/zgrep/zdiff one-liners for reading the gzipped
  per-test logs (including the cross-test grep idiom).
- A new "Debugging a flaky test" section with a copy-pasteable loop
  that runs the same test 20 times against one infra, snapshotting
  per-attempt per-test logs for later diffing.
- An expanded Troubleshooting bullet list: how to recover from lost
  INFRA_ID (via `docker network ls`), where each flavor of log lives,
  and how to clean up stale docker state.

No code or workflow changes - doc-only update.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e58ef264-02b7-4122-9569-610ca4b2d3e7

📥 Commits

Reviewing files that changed from the base of the PR and between f4313be and 0617626.

📒 Files selected for processing (1)
  • test/README.md
✅ Files skipped from review due to trivial changes (1)
  • test/README.md

📝 Walkthrough

Walkthrough

Updated test README to document exact on-disk log locations after runs, include per-backend and per-test gzipped log layout, provide zless/zcat/zgrep/zdiff inspection commands, and expand flaky-test local reproduction and Docker troubleshooting steps (specific paths and targeted cleanup).

Changes

Cohort / File(s) Summary
Documentation
test/README.md
Added exact post-run log storage location (ci_infra_logs/${INFRA_ID}/), directory tree for infra/proxysql/tests logs, concrete gzipped-log inspection commands (zless, zcat, zgrep, zdiff), step-by-step flaky-test local reproduction workflow reusing infra and stashing per-attempt logs, and updated Docker troubleshooting guidance (identify stale infra via docker network ls, remove with docker network rm).

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related issues

Poem

🐰 Hop, hop, I dug the trail of logs,

gzipped crumbs in tidy cogs.
Reproduce, compare, then mend —
one infra, many tries, no end.
Cheers from a rabbit, debugging friend!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately describes the main change: documentation updates to explain per-test log paths and add a flaky-test debugging workflow.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch v3.0-doc-test-readme

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly improves the testing documentation in test/README.md by adding a detailed guide on log locations, a workflow for debugging flaky tests locally, and enhanced troubleshooting instructions. The review feedback suggests refining the grep pattern used to identify test failures to ensure consistency with TAP output and recommending the removal of specific Docker networks instead of a global prune to avoid unintended side effects on other projects.

Comment thread test/README.md Outdated

```bash
# Which attempts had any FAIL?
grep -l 'FAIL [1-9]' /tmp/flake-*.log
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The grep pattern 'FAIL [1-9]' might not capture all failure scenarios, especially since TAP output (the standard for these tests) uses not ok to indicate a failed test. For consistency with the zgrep example provided in the "Where logs actually live" section, consider using a pattern that includes both not ok and FAIL.

Suggested change
grep -l 'FAIL [1-9]' /tmp/flake-*.log
grep -lE 'not ok|FAIL' /tmp/flake-*.log

Comment thread test/README.md Outdated
- **"Directory Not Empty"**: Run `./test/infra/control/stop-proxysql-isolated.bash` with the same `INFRA_ID` that was used when you started the infra. If you lost the ID, `docker network ls` will show you active `*_backend` networks — each one is a stuck infra; the name before `_backend` is the `INFRA_ID`.
- **Container issues**: Check logs in `ci_infra_logs/${INFRA_ID}/infra-*/` (per-backend) and `ci_infra_logs/${INFRA_ID}/proxysql/` (ProxySQL side).
- **Test failures**: Read the per-test `.log.gz` files under `ci_infra_logs/${INFRA_ID}/tests/proxysql-tester.py/tests/` with `zless` or `zcat` — see the "Where logs actually live" section above for the full layout.
- **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network prune` after stopping infras cleans up dangling networks.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

docker network prune is a global operation that removes all unused networks on the system, which might affect other Docker projects running on the same host. It is safer to suggest removing the specific network associated with the INFRA_ID first, while keeping prune as a secondary option for general cleanup.

Suggested change
- **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network prune` after stopping infras cleans up dangling networks.
- **Stale docker state**: `docker ps -a | grep "${INFRA_ID}"` shows any leftover containers; `docker network rm ${INFRA_ID}_backend` cleans up the specific network, or use `docker network prune` to clean all dangling networks.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/README.md`:
- Around line 74-92: Add the missing language tag to the fenced code block that
begins with the directory tree starting "ci_infra_logs/${INFRA_ID}/": change the
opening triple-backtick fence from ``` to ```text so the tree block is marked as
plain text and markdownlint MD040 is satisfied; update the fence that wraps the
block containing lines like "infra-mysql57/" and "proxysql-tester.py/"
accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ec7f8c18-757f-4b92-880d-072c166d2a76

📥 Commits

Reviewing files that changed from the base of the PR and between e3e769f and f4313be.

📒 Files selected for processing (1)
  • test/README.md
📜 Review details
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure

Applied to files:

  • test/README.md
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.

Applied to files:

  • test/README.md
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: See `doc/agents/project-conventions.md` for ProxySQL-specific rules including directories, build, test harness, and git workflow

Applied to files:

  • test/README.md
🪛 markdownlint-cli2 (0.22.0)
test/README.md

[warning] 74-74: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
test/README.md (1)

156-156: The README wording at line 156 is accurate and requires no changes. The Docker network name is constructed as ${INFRA_ID}_backend (not as a composite with INFRA prefix). Therefore, the instruction to extract INFRA_ID by taking the text before _backend is correct.

			> Likely an incorrect or invalid review comment.

Comment thread test/README.md Outdated
…ction

Three minor corrections from the PR #5609 review:

1. (coderabbit, MD040) The directory-tree fenced block at L74 had no
   language tag. Added `text` so markdownlint and GitHub's renderer
   treat it as plain text rather than guessing a language that would
   then syntax-highlight the tree characters incorrectly.

2. (gemini, L142) The flake-debug loop's "which attempts failed" grep
   was too narrow: `grep 'FAIL [1-9]'` only matches the
   proxysql-tester.py summary line "SUMMARY: 'tests' PASS N/M : FAIL K/M"
   when K is non-zero - but misses tests that fail at the TAP level via
   "not ok" without incrementing the FAIL counter (e.g. when the binary
   crashes or times out). Broadened to `grep -lE 'not ok|FAIL [1-9]'`
   so both paths are caught, matching the idiom already used in the
   "Where logs actually live" section above.

3. (gemini, L159) The Troubleshooting bullet suggested
   `docker network prune` as the cleanup idiom for dangling networks.
   `network prune` is a GLOBAL operation - it would also wipe networks
   belonging to unrelated Docker projects on the same host, which is
   user-hostile if a maintainer has other test infras running in
   parallel. Changed to prefer the targeted form
   `docker network rm "${INFRA_ID}_backend"` and explicitly call out
   why `prune` is the wrong default.

No change to the "Where logs actually live" section, the
"Debugging a flaky test" loop itself, or any of the other content.
Doc-only update on top of the earlier commit.
@sonarqubecloud
Copy link
Copy Markdown

@renecannao renecannao merged commit ca1a7a4 into v3.0 Apr 11, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant