ci: rewire legacy-g{1,3,5} and mysql84-g{1..5} callers to dedicated reusables, add CI-unittests, document CI architecture#5598
Conversation
…cument CI architecture Rewiring -------- Following the landing of #5597 on the GH-Actions branch (which adds the nine dedicated reusable workflows), point each of the eight existing group callers at its dedicated reusable instead of the broken generic ci-taptests-groups.yml: CI-legacy-g1.yml → ci-legacy-g1.yml@GH-Actions CI-legacy-g3.yml → ci-legacy-g3.yml@GH-Actions CI-legacy-g5.yml → ci-legacy-g5.yml@GH-Actions CI-mysql84-g1.yml → ci-mysql84-g1.yml@GH-Actions CI-mysql84-g2.yml → ci-mysql84-g2.yml@GH-Actions CI-mysql84-g3.yml → ci-mysql84-g3.yml@GH-Actions CI-mysql84-g4.yml → ci-mysql84-g4.yml@GH-Actions CI-mysql84-g5.yml → ci-mysql84-g5.yml@GH-Actions Each edit also drops the now-useless `testgroup: '["<group>"]'` input that the new dedicated reusables do not accept (and the old generic reusable silently ignored). This is the other half of the fix started in #5597. Before this change, the eight group workflows were routed to ci-taptests-groups.yml, whose tests job strategy evaluates from a cached proxysql/tap-matrix.json that CI-builds has been populating as "[ ]". With an empty matrix vector GitHub Actions rejects the strategy with "Matrix vector 'testgroup' does not contain any values" and the tests job never instantiates -- visible for example in run 24240099115. With this PR merged (after #5597), these eight workflows will route to their own dedicated reusables and start running real tests on the next push. CI-unittests caller ------------------- Adds a new v3.0 caller CI-unittests.yml targeting ci-unittests.yml@GH-Actions (also added in #5597). This runs the 42 unit-tests-g1 binaries listed in test/tap/groups/groups.json directly on the runner, via the SKIP_PROXYSQL=1 host-only branch in run-tests-isolated.bash -- no Docker, no backends. Documentation ------------- Replaces doc/GH-Actions/README.md with an expanded, current CI architecture reference (~800 lines). Covers: * The two-branch caller/reusable split and the workflow_run default-branch rule that forces it * The CI-trigger / CI-builds babysitter pattern and why test workflows chain on CI-trigger completion (and not on CI-builds directly -- see 78b8f5a for the incident that taught us why) * The dedicated-reusable pattern, with ci-legacy-g4.yml walked through step by step as the canonical template * Build cache layout (_bin / _src / _test / _matrix) * The TAP groups system: groups.json, BASE_GROUP stripping, infras.lst, env.sh, SKIP_PROXYSQL mode for unit tests * Full workflow catalogue as of 2026-04-11 with status per file * End-to-end walkthrough for adding a new test group * Common pitfalls: merge ordering, workflow_run head_sha propagation, the empty-matrix wedge, the LouisBrunner/checks-action false-positive permissions hotspot * Debugging guide: how to trace a failing run through CI-trigger → CI-builds → test workflow, and how to reproduce locally * Glossary Adds a Mermaid sequence diagram for the execution flow and inline ASCII diagrams throughout. Also updates CLAUDE.md to point agents at doc/GH-Actions/README.md before they touch anything under .github/workflows/.
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughMultiple CI caller workflows were updated to invoke dedicated per-group reusable workflows on the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request significantly expands the documentation for the ProxySQL CI/CD architecture, detailing the two-branch workflow split between caller and reusable files. It provides a comprehensive guide on execution flow, build matrices, and the TAP groups system. The review feedback suggests refining the instructions for creating new test workflows to ensure that the database version used in the matrix is updated alongside the group name, preventing misleading labels in the GitHub Check Run UI.
| * `name:` → `CI-<group>` | ||
| * `INFRA_ID="ci-<group>"` (in steps 7, 8, 9) | ||
| * `TAP_GROUP="<group>"` (in steps 7, 8, 9) | ||
| * Step name `Run <group> tests` | ||
| * Cleanup `docker logs proxysql.ci-<group>` | ||
| * `TAP_USE_NOISE=1` → keep it or drop it (g4 has it for race testing; other | ||
| groups leave it off unless the group has known flakiness that noise | ||
| injection helps surface) |
There was a problem hiding this comment.
The instructions for creating a sibling workflow from the ci-legacy-g4.yml template should include updating the matrix.infradb value. Although described as "cosmetic", this value is used to define the GitHub Check Run name (via env.MATRIX). If left as mysql57 while the TAP_GROUP is set to a different version (e.g., mysql84), the CI results will be reported under a misleading check name in the PR UI, which can be confusing for reviewers.
| * `name:` → `CI-<group>` | |
| * `INFRA_ID="ci-<group>"` (in steps 7, 8, 9) | |
| * `TAP_GROUP="<group>"` (in steps 7, 8, 9) | |
| * Step name `Run <group> tests` | |
| * Cleanup `docker logs proxysql.ci-<group>` | |
| * `TAP_USE_NOISE=1` → keep it or drop it (g4 has it for race testing; other | |
| groups leave it off unless the group has known flakiness that noise | |
| injection helps surface) | |
| * `name:` → `CI-<group>` | |
| * `matrix.infradb` → the primary database version (e.g., `mysql84`) | |
| * `INFRA_ID="ci-<group>"` (in steps 7, 8, 9) | |
| * `TAP_GROUP="<group>"` (in steps 7, 8, 9) | |
| * Step name `Run <group> tests` | |
| * Cleanup `docker logs proxysql.ci-<group>` | |
| * `TAP_USE_NOISE=1` → keep it or drop it (g4 has it for race testing; other | |
| groups leave it off unless the group has known flakiness that noise | |
| injection helps surface) |
| ```bash | ||
| # on the GH-Actions branch: | ||
| cd .github/workflows | ||
| sed "s/legacy-g4/mysql90-g1/g" ci-legacy-g4.yml > ci-mysql90-g1.yml |
There was a problem hiding this comment.
The sed command provided here only updates the group name but leaves the infradb matrix value as mysql57. To ensure the GitHub Check Run is named correctly for the new group, the sed command should also replace the database version string.
| sed "s/legacy-g4/mysql90-g1/g" ci-legacy-g4.yml > ci-mysql90-g1.yml | |
| sed "s/legacy-g4/mysql90-g1/g; s/mysql57/mysql90/g" ci-legacy-g4.yml > ci-mysql90-g1.yml |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/GH-Actions/README.md`:
- Around line 44-52: The README contains multiple untyped fenced code blocks
that trigger markdownlint MD040; update each untyped fence in the
doc/GH-Actions/README.md README to include a language identifier (use text) so
fences become ```text ... ```; ensure you update every occurrence mentioned in
the review (including the examples around the branch/CI diagram and the other
untyped blocks elsewhere) so all untyped triple-backtick fences are converted to
triple-backtick with text to satisfy the linter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b49492b4-6ed7-4a39-87e5-d0dd9ea14522
📒 Files selected for processing (11)
.github/workflows/CI-legacy-g1.yml.github/workflows/CI-legacy-g3.yml.github/workflows/CI-legacy-g5.yml.github/workflows/CI-mysql84-g1.yml.github/workflows/CI-mysql84-g2.yml.github/workflows/CI-mysql84-g3.yml.github/workflows/CI-mysql84-g4.yml.github/workflows/CI-mysql84-g5.yml.github/workflows/CI-unittests.ymlCLAUDE.mddoc/GH-Actions/README.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: claude-review
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-11T05:18:01.993Z
Learning: Use RAII (Resource Acquisition Is Initialization) for resource management
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-11T05:18:01.993Z
Learning: Use jemalloc for memory allocation
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-11T05:18:01.993Z
Learning: Consider implications of changes to performance-critical code paths
Learnt from: CR
URL:
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-11T05:18:01.993Z
Learning: Build system is GNU Make-based with three-stage pipeline: `deps` → `lib` → `src`
📚 Learning: 2026-04-01T21:27:03.216Z
Learnt from: wazir-ahmed
Repo: sysown/proxysql PR: 5557
File: test/tap/tests/unit/gtid_set_unit-t.cpp:14-17
Timestamp: 2026-04-01T21:27:03.216Z
Learning: In ProxySQL's unit test directory (test/tap/tests/unit/), test_globals.h and test_init.h are only required for tests that depend on the ProxySQL runtime globals/initialization (i.e., tests that exercise components linked against libproxysql.a). Pure data-structure or utility tests (e.g., ezoption_parser_unit-t.cpp, gtid_set_unit-t.cpp, gtid_trxid_interval_unit-t.cpp) only need tap.h and the relevant project header — omitting test_globals.h and test_init.h is correct and intentional in these cases.
Applied to files:
CLAUDE.md
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-22T14:38:16.093Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-22T14:38:16.093Z
Learning: Applies to test/tap/tests/unit/**/*.cpp : Unit tests in test/tap/tests/unit/ must use test_globals.h and test_init.h and link against libproxysql.a via the custom test harness
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-22T14:38:16.093Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-22T14:38:16.093Z
Learning: Feature tiers are controlled by build flags: PROXYSQL31=1 for v3.1.x, PROXYSQLGENAI=1 for v4.0.x; PROXYSQLGENAI=1 implies PROXYSQL31=1
Applied to files:
doc/GH-Actions/README.md
🪛 LanguageTool
doc/GH-Actions/README.md
[uncategorized] ~10-~10: The official name of this software platform is spelled with a capital “H”.
Context: ... pitfalls. If you touch anything under .github/workflows/ on either v3.0 or the `GH...
(GITHUB)
[uncategorized] ~130-~130: The official name of this software platform is spelled with a capital “H”.
Context: ...son(github) }}— serialises the entire github` context as JSON and hands it to the re...
(GITHUB)
[uncategorized] ~251-~251: The official name of this software platform is spelled with a capital “H”.
Context: ...rkflow_run-triggered workflow receives github.event.workflow_run.head_sha` equal to t...
(GITHUB)
[uncategorized] ~289-~289: The official name of this software platform is spelled with a capital “H”.
Context: ...ctions | | | |---|---| | **Caller** |v3.0:.github/workflows/CI-trigger.yml` | | **Reusabl...
(GITHUB)
[uncategorized] ~290-~290: The official name of this software platform is spelled with a capital “H”.
Context: ...lows/CI-trigger.yml| | **Reusable** |GH-Actions:.github/workflows/ci-trigger.yml` | | **Trigger...
(GITHUB)
[uncategorized] ~292-~292: The official name of this software platform is spelled with a capital “H”.
Context: ...kflow_dispatch| | **Paths ignored** |.github/, .md` (so doc-only PRs don't bur...
(GITHUB)
[uncategorized] ~311-~311: The official name of this software platform is spelled with a capital “H”.
Context: ...ctions | | | |---|---| | **Caller** |v3.0:.github/workflows/CI-builds.yml` | | **Reusable...
(GITHUB)
[uncategorized] ~312-~312: The official name of this software platform is spelled with a capital “H”.
Context: ...flows/CI-builds.yml| | **Reusable** |GH-Actions:.github/workflows/ci-builds.yml` | | **Triggers...
(GITHUB)
[style] ~562-~562: Replacing this phrase with a shorter alternative might make your text sound more refined.
Context: ... etc.) and a list of test binaries that belong to it. One test binary may belong to multiple...
(BELONG_TO_PRP)
[style] ~651-~651: Consider using a different adverb to strengthen your wording.
Context: ...roup env.sh sourcing above), it takes a completely different code path: it reads `groups.j...
(COMPLETELY_ENTIRELY)
[uncategorized] ~826-~826: The official name of this software platform is spelled with a capital “H”.
Context: ...on: workflow_run, workflow B receives github.event.workflow_run.head_sha equal to *...
(GITHUB)
[uncategorized] ~863-~863: The official name of this software platform is spelled with a capital “H”.
Context: ...n's raw logs: the first job step prints `Uses: sysown/proxysql/.github/workflows/ci-legacy-g1.yml@refs/heads/G...
(GITHUB)
[uncategorized] ~984-~984: The official name of this software platform is spelled with a capital “H”.
Context: ...gger that lets a workflow be invoked by uses: owner/repo/.github/workflows/x.yml@ref. This is what make...
(GITHUB)
🪛 markdownlint-cli2 (0.22.0)
doc/GH-Actions/README.md
[warning] 44-44: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 211-211: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 547-547: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 588-588: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 761-761: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 840-840: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 893-893: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (11)
.github/workflows/CI-mysql84-g2.yml (1)
17-17: Dedicated reusable mapping is correct.Line 17 correctly rewires this caller to the
mysql84-g2reusable, consistent with the new CI architecture..github/workflows/CI-legacy-g5.yml (1)
17-17: Looks good.Line 17 correctly points
CI-legacy-g5to its dedicated reusable workflow..github/workflows/CI-mysql84-g3.yml (1)
17-17: Correct workflow target.Line 17 now references the expected
mysql84-g3reusable and aligns with the refactor intent..github/workflows/CI-mysql84-g4.yml (1)
17-17: Approved.Line 17 correctly rewires this caller to
ci-mysql84-g4.yml@GH-Actions..github/workflows/CI-mysql84-g5.yml (1)
17-17: LGTM.Line 17 uses the correct dedicated
mysql84-g5reusable workflow..github/workflows/CI-legacy-g3.yml (1)
17-17: Good change.Line 17 correctly targets
ci-legacy-g3.yml@GH-Actions..github/workflows/CI-mysql84-g1.yml (1)
17-17: Approved.Line 17 correctly maps
CI-mysql84-g1to its dedicated reusable workflow..github/workflows/CI-legacy-g1.yml (1)
17-17: Looks correct.Line 17 now points to the dedicated
legacy-g1reusable as expected..github/workflows/CI-unittests.yml (2)
1-20: Looks good: caller workflow follows the established delegation pattern.Triggering, concurrency scoping, success gating, and
triggerpassthrough are consistent with the existing CI caller architecture.
17-17: Verify reusable workflow exists onGH-Actionsbranch before merge.This caller hard-depends on
.github/workflows/ci-unittests.yml@GH-Actions; if absent, runs fail withUnable to resolve action. Ensure the file exists on theGH-Actionsbranch.CLAUDE.md (1)
135-135: Great addition: this makes CI workflow changes safer and more consistent.Linking contributors directly to the authoritative CI architecture doc is the right guardrail.
The five ci-mysql84-g*.yml reusables introduced in #5597 inherited `infradb: [ 'mysql57' ]` from ci-legacy-g4.yml as the sed template. The `infradb` matrix value is interpolated into `env.MATRIX` and then into the check-action step's `name` field, so the GitHub PR UI would display check runs as `CI-mysql84-g3 / tests (mysql57)` even though the tests themselves are correctly running against infra-mysql84. The infrastructure resolution is unaffected -- `ensure-infras.bash` strips the -gN suffix from TAP_GROUP=mysql84-g3 to get BASE_GROUP=mysql84, which resolves to test/tap/groups/mysql84/infras.lst (infra-mysql84). This is purely a display-name fix. Set infradb to 'mysql84' in all five files so the check run name matches the backend actually under test. Caught by gemini-code-assist review on #5598.
Addresses review feedback on #5598: 1. Adds `matrix.infradb` to the "What a sibling differs in" checklist. Gemini-code-assist flagged it -- and it was also a real bug in the merged ci-mysql84-g{1..5}.yml files, fixed in the separate #5600. The doc now explains that `infradb` is cosmetic for routing but displayed in the PR UI via env.MATRIX, so leaving it at the template default creates a misleading "CI-mysql84-g3 / tests (mysql57)" label. 2. Updates the sed example in "Adding a new test group" to substitute the infradb string alongside the group name, so copy-pasting from the doc produces a correct file on the first try. 3. Adds `text` language tag to seven previously untyped fenced code blocks (ASCII diagrams, directory listings, plain-text output) to satisfy markdownlint MD040. CodeRabbit flagged these. The repo doesn't enforce markdownlint, but the tag is harmless and silences the bot for future contributors editing the doc.
Validating that the full set of test workflows now fires on PR branches after the PR #5598 caller rewiring landed. No code change.
Summary
Second half of the CI repair started in #5597. With #5597 merged into `GH-Actions` (landing the nine dedicated reusable workflows), this PR:
Why the rewiring is needed
Before this PR, all 8 group callers routed to `ci-taptests-groups.yml@GH-Actions`, passing `testgroup: '[""]'` as a hint. That reusable's design is:
In recent `CI-builds` runs the `proxysql/tap-matrix.json` in the cache is `[ ]` — `find test/tap/ -name '*-t'` in the build step returns nothing and you can see `dirname: missing operand` in the CI-builds log. With an empty matrix vector, GitHub Actions rejects strategy evaluation with:
…and the `tests` job never instantiates. The run wedges on "Waiting for pending jobs" for ~48 minutes and then fails with only the `select` job visible. Example run.
This PR routes each caller to its own dedicated reusable (cut from `ci-legacy-g4.yml`, which has been working in production for weeks), which drives tests via `TAP_GROUP` + `test/infra/control/*` scripts and never touches `tap-matrix.json`.
Per-file changes
8 caller rewires
Each of these drops the `testgroup:` input line and swaps the `uses:` target from `ci-taptests-groups.yml@GH-Actions` to `ci-.yml@GH-Actions`. Diff per file is exactly 2 lines removed + 1 line changed.
`.github/workflows/CI-unittests.yml` (new)
Standard 20-line caller targeting `ci-unittests.yml@GH-Actions`. Triggered on `workflow_run[completed]` on `CI-trigger` or on manual `workflow_dispatch`.
`doc/GH-Actions/README.md` (rewritten)
The previous version was ~380 lines and substantially outdated — most of the currently-working test workflows weren't listed, and it still marked entire groups of workflows as "broken (#5521)" when they've since been migrated.
The new version is ~800 lines, structured as 12 top-level sections:
`CLAUDE.md`
Adds a one-line pointer to `doc/GH-Actions/README.md` so AI agents find the doc before touching `.github/workflows/`.
Order of operations
#5597 must be merged first. It is already merged at the time of this PR being opened, so this should be safe.
If it isn't: do not merge this. These 9 callers will start resolving `ci-.yml@GH-Actions` immediately on the next push; if those files don't exist yet on `GH-Actions`, every run errors out with `Unable to resolve action`.
Test plan
Verify ci: add missing per-group reusable workflows (legacy-g{1,3,5}, mysql84-g{1..5}, unit-tests) #5597 is merged and the nine `ci-*.yml` files exist on `origin/GH-Actions`.
Merge this PR into `v3.0`.
On the next push to `v3.0` or a PR branch, verify that each of the 9 workflows:
actually runs its `tests (mysql57)` / `tests (mysql84)` / host-only unit-tests job, not just a lone `select` job.
Spot-check one legacy and one mysql84 run: confirm `ensure-infras.bash` picks up the right backend from `test/tap/groups/{legacy,mysql84}/infras.lst`.
Confirm `CI-unittests` invokes the SKIP_PROXYSQL host-only path and prints a "N/42 passed" summary.
Follow-ups (out of scope)
Summary by CodeRabbit
Chores
New Features
Documentation