Skip to content

test(read_only_offline_hard): set default_hostgroup to WHG, not hardcoded 0#5613

Merged
renecannao merged 1 commit intov3.0from
v3.0-fix-read-only-actions-hostgroup
Apr 11, 2026
Merged

test(read_only_offline_hard): set default_hostgroup to WHG, not hardcoded 0#5613
renecannao merged 1 commit intov3.0from
v3.0-fix-read-only-actions-hostgroup

Conversation

@renecannao
Copy link
Copy Markdown
Contributor

@renecannao renecannao commented Apr 11, 2026

Part of #5610 — third of four tests from the flake tracking issue.

Correction to #5610's premise: like pgsql-ssl_keylog-t in PR #5612, this turned out to be a deterministic failure miscategorized as a flake. Reproduced 0/3 failing locally before the fix, 3/3 passing after. Not a timing race.

Root cause — an incomplete hostgroup migration

Two commits on v3.0 main set up the stage:

  1. 26c5a2572 ("Fix read_only test assuming default_hostgroup=0"): fixed the ORIGINAL bug where the test's writer hostgroup was 0 but the CI infra set user.default_hostgroup to 1300. The fix ran UPDATE mysql_users SET default_hostgroup=0 at the start of each scenario to point the user at the test's servers.

  2. 4f9ab49e7 ("Fix hardcoded hostgroups in TAP tests for mysql84"): introduced TAP_MYSQL8_BACKEND_HG env var so the test's writer/reader hostgroups could be overridden — WHG=2900, RHG=2901 in the mysql84-g4 group. init_hostgroups() at file scope reads this env var into the file-scope WHG/RHG static symbols.

The second commit updated the INSERT INTO mysql_servers calls to use WHG/RHG — but missed the UPDATE mysql_users SET default_hostgroup=0 added by the first commit. So on mysql84-g4:

  • insert_mysql_servers_records() puts servers in hostgroup 2900 (WHG)
  • The legacy UPDATE mysql_users SET default_hostgroup=0 still runs, pointing the user at hostgroup 0, which has no servers
  • The test opens a proxy connection as that user, runs BEGIN, and ProxySQL routes BEGIN to hostgroup 0 — empty
  • 10 seconds later: Max connect timeout reached while reaching hostgroup 0 after 10000ms
  • Test aborts at subtest 4 of 18

Verified in the running proxysql during a failing run:

SELECT username, default_hostgroup FROM mysql_users;
-- root     | 0       <-- still 0 after the setup step tried to match
-- user     | 2900
-- testuser | 2900
-- ...

Fix

Change the two string_format("UPDATE mysql_users SET default_hostgroup=0 ...") calls (in test_scenario_1 and test_scenario_2) to use the WHG static symbol — already populated from TAP_MYSQL8_BACKEND_HG by init_hostgroups() earlier in the same file. Zero behavior change on the legacy infra (where WHG defaults to 0, so the literal value is unchanged); correct behavior on the mysql84 infra (where WHG=2900).

Diff is ~10 lines of real change plus a comment explaining the history so the next maintainer doesn't hit this again.

Local verification

3 consecutive iterations in mysql84-g4 + TEST_PY_TAP_INCL=test_read_only_actions_offline_hard_servers-t:

attempt 1: PASS
attempt 2: PASS
attempt 3: PASS
=== read_only_actions: 3/3 ===

Pre-fix: 0/3 — every run hit Max connect timeout reached while reaching hostgroup 0 after 10000ms and aborted at subtest 4 of 18.

Post-fix: all 18 subtests run and pass on every iteration. Test log confirms ok 18 is reached.

Test plan

  • c++ -c compiles cleanly (no string_format template issues with the new WHG int argument)
  • 3/3 local passes in mysql84-g4, all 18 subtests executed per run
  • Not regressed on legacy-g4 (where WHG defaults to 0 and the produced SQL is identical to the pre-fix literal)
  • CI green on CI-mysql84-g4 against this PR's HEAD
  • CI still green on CI-legacy-g4 against this PR's HEAD (defensive check)

Related

Summary by CodeRabbit

  • Tests
    • Updated test scenarios to properly validate writer hostgroup routing behavior in offline server conditions.

…oded 0

Part of #5610 — third of four tests from the flake
tracking issue. Like pgsql-ssl_keylog-t, this turned out to be a
deterministic failure miscategorized as a flake.

## Background: the incomplete hostgroup migration

Two commits on v3.0 main set up the stage:

1. 26c5a25 ("Fix read_only test assuming default_hostgroup=0"):
   fixed the ORIGINAL bug where the test's writer hostgroup was 0
   but the CI infra set user.default_hostgroup to 1300. The fix set
   default_hostgroup=0 at the start of each scenario so the user
   routing matched the test's servers.

2. 4f9ab49 ("Fix hardcoded hostgroups in TAP tests for mysql84"):
   introduced TAP_MYSQL8_BACKEND_HG env var so the test's writer/
   reader hostgroups could be overridden for mysql84 (WHG=2900, RHG=
   2901 in the mysql84-g4 group). init_hostgroups() at file scope
   reads this env var into the static `WHG` and `RHG` symbols.

The second commit updated the `INSERT INTO mysql_servers` calls to
use WHG/RHG, but missed the `UPDATE mysql_users SET default_hostgroup=0`
added by the first commit. So on mysql84-g4:

 - insert_mysql_servers_records() puts servers in hostgroup 2900 (WHG)
 - UPDATE mysql_users SET default_hostgroup=0 points the user at
   hostgroup 0, which has no servers
 - the test opens a proxy connection as that user, runs BEGIN, and
   ProxySQL routes BEGIN to hostgroup 0 — which is empty
 - 10 s later: "Max connect timeout reached while reaching hostgroup 0"
 - test aborts at subtest 4 of 18, not a flake

Verified by querying the running proxysql during a failing run:

  SELECT username, default_hostgroup FROM mysql_users;
  -> root     | 0
     user     | 2900
     testuser | 2900
     ...

 `root` is the user the test connects as (cl.root_username). Still
 at 0 after the setup step tried to make it match. In mysql84-g4
 that's wrong.

## Fix

Change the two `string_format("UPDATE mysql_users SET default_hostgroup=0 ...")`
calls in test_scenario_1 and test_scenario_2 to use the `WHG` static
symbol (already populated from TAP_MYSQL8_BACKEND_HG by
init_hostgroups() earlier in the same file), so the user's
default_hostgroup always matches whatever writer hostgroup the test
configured its servers into. Zero behavior change on the legacy infra
(where WHG defaults to 0 so the literal value is unchanged); correct
behavior on the mysql84 infra (where WHG=2900).

Same two-line pattern in both scenario functions.

## Local verification (dogfood of test/README.md §"Debugging a flaky test")

3 consecutive local iterations in mysql84-g4 +
TEST_PY_TAP_INCL=test_read_only_actions_offline_hard_servers-t:

  attempt 1: PASS
  attempt 2: PASS
  attempt 3: PASS
  === read_only_actions: 3/3 ===

Pre-fix: 0/3 (all hit the "Max connect timeout reached while reaching
hostgroup 0 after 10000ms" error and aborted at subtest 4 of 18).
Post-fix: 18/18 subtests run and pass on every iteration.

## Not touched here

- `pgsql-servers_ssl_params-t` — still stuck on the pgsql monitor
  not making SSL connections under `use_ssl=1`. Same open question
  as subtest 7 of pgsql-ssl_keylog-t (skipped in PR #5612). Needs
  pgsql monitor source analysis, not a test-side fix.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 11, 2026

📝 Walkthrough

Walkthrough

The test file updates mysql_users.default_hostgroup configuration in test scenarios by replacing hardcoded value 0 with the runtime writer hostgroup parameter WHG. This ensures the test's routing behavior aligns with the dynamically configured writer hostgroup instead of using a fixed value.

Changes

Cohort / File(s) Summary
Test Hostgroup Parameterization
test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
Replace hardcoded default_hostgroup value of 0 with runtime writer hostgroup variable WHG in test scenario setup for both test_scenario_1 and test_scenario_2 to align with dynamically configured hostgroups.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 ✨ No more zeros carved in stone,
WHG's the hostgroup we have known!
Runtime values flow through the test,
The routing setup stands its best! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main change: replacing hardcoded hostgroup value 0 with the WHG parameter in the test's UPDATE mysql_users statements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch v3.0-fix-read-only-actions-hostgroup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the test suite to dynamically set the default hostgroup using the WHG variable instead of a hardcoded value, which prevents query timeouts in certain environments. The reviewer suggested refactoring the duplicated update logic in test_scenario_1 and test_scenario_2 into a shared helper function to improve maintainability and reduce code duplication.

Comment on lines +226 to 241
// Set default_hostgroup to match writer_hostgroup (WHG) used in this
// test. WHG defaults to 0 for the legacy infra but is overridden to
// 2900 by the mysql84-g4 group via TAP_MYSQL8_BACKEND_HG — see
// init_hostgroups() above. If we hardcode 0 here, the mysql84 group
// routes the test's BEGIN query to an empty hostgroup 0 and times
// out at 10 s. A previous fix (26c5a2572) hardcoded 0 because it
// predated the mysql84 hostgroup migration in 4f9ab49e7 — this
// commit finishes that fix by making the default_hostgroup follow
// WHG at runtime.
{
std::string update_user;
string_format("UPDATE mysql_users SET default_hostgroup=0 WHERE username='%s'", update_user, cl.root_username);
string_format("UPDATE mysql_users SET default_hostgroup=%d WHERE username='%s'",
update_user, WHG, cl.root_username);
MYSQL_QUERY__(proxy_admin, update_user.c_str());
MYSQL_QUERY__(proxy_admin, "LOAD MYSQL USERS TO RUNTIME");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of code, including the detailed comment, is duplicated in test_scenario_2 (lines 396-411). To improve maintainability and reduce code duplication, consider extracting this logic into a helper function. This would also centralize the logic for updating the default hostgroup, making future changes easier.

For example, you could create a function like this:

int update_default_hostgroup(MYSQL* proxy_admin, const CommandLine& cl) {
    // Set default_hostgroup to match writer_hostgroup (WHG) used in this
    // test. WHG defaults to 0 for the legacy infra but is overridden to
    // 2900 by the mysql84-g4 group via TAP_MYSQL8_BACKEND_HG — see
    // init_hostgroups() above. If we hardcode 0 here, the mysql84 group
    // routes the test's BEGIN query to an empty hostgroup 0 and times
    // out at 10 s. A previous fix (26c5a2572) hardcoded 0 because it
    // predated the mysql84 hostgroup migration in 4f9ab49e7 — this
    // commit finishes that fix by making the default_hostgroup follow
    // WHG at runtime.
    std::string update_user;
    string_format("UPDATE mysql_users SET default_hostgroup=%d WHERE username='%s'",
                  update_user, WHG, cl.root_username);
    if (mysql_query(proxy_admin, update_user.c_str())) {
        fprintf(stderr, "File %s, line %d, Error: %s\n", __FILE__, __LINE__, mysql_error(proxy_admin));
        return EXIT_FAILURE;
    }
    if (mysql_query(proxy_admin, "LOAD MYSQL USERS TO RUNTIME")) {
        fprintf(stderr, "File %s, line %d, Error: %s\n", __FILE__, __LINE__, mysql_error(proxy_admin));
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

Then, you can replace the duplicated blocks in test_scenario_1 and test_scenario_2 with:

if (update_default_hostgroup(proxy_admin, cl) != EXIT_SUCCESS) {
    goto cleanup;
}

This avoids using the MYSQL_QUERY__ macro which relies on a goto cleanup statement, making the helper function more self-contained.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp (1)

226-234: Optional: reduce duplicated migration rationale/setup comments.

The long explanatory block is duplicated in both scenarios. Consider extracting a small helper (or a shared comment near a helper) to keep this test easier to maintain.

Also applies to: 396-404

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp` around
lines 226 - 234, The large duplicated migration/setup comment about
default_hostgroup and WHG should be centralized: remove the repeated block in
the test and replace it with a short one-line pointer, then add the full
explanation once near the shared helper or the init_hostgroups() helper where
TAP_MYSQL8_BACKEND_HG and writer_hostgroup behavior is defined (or next to the
mysql84-g4/WHG explanation). Update occurrences referencing default_hostgroup
and writer_hostgroup to refer to that shared comment/helper so the rationale is
maintained without duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp`:
- Around line 226-234: The large duplicated migration/setup comment about
default_hostgroup and WHG should be centralized: remove the repeated block in
the test and replace it with a short one-line pointer, then add the full
explanation once near the shared helper or the init_hostgroups() helper where
TAP_MYSQL8_BACKEND_HG and writer_hostgroup behavior is defined (or next to the
mysql84-g4/WHG explanation). Update occurrences referencing default_hostgroup
and writer_hostgroup to refer to that shared comment/helper so the rationale is
maintained without duplication.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c3250360-11cf-4bba-98f6-dbb247d0219c

📥 Commits

Reviewing files that changed from the base of the PR and between 04466b3 and 5d35a21.

📒 Files selected for processing (1)
  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{cpp,h,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

Use UPPER_SNAKE_CASE for constants and macros

Files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
test/tap/tests/**/*(test_*.cpp|*-t.cpp)

📄 CodeRabbit inference engine (CLAUDE.md)

Test files follow the naming pattern test_*.cpp or *-t.cpp

Files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
test/tap/**/*.cpp

📄 CodeRabbit inference engine (CLAUDE.md)

Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure

Files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
🧠 Learnings (3)
📚 Learning: 2026-04-01T21:27:03.216Z
Learnt from: wazir-ahmed
Repo: sysown/proxysql PR: 5557
File: test/tap/tests/unit/gtid_set_unit-t.cpp:14-17
Timestamp: 2026-04-01T21:27:03.216Z
Learning: In ProxySQL's unit test directory (test/tap/tests/unit/), test_globals.h and test_init.h are only required for tests that depend on the ProxySQL runtime globals/initialization (i.e., tests that exercise components linked against libproxysql.a). Pure data-structure or utility tests (e.g., ezoption_parser_unit-t.cpp, gtid_set_unit-t.cpp, gtid_trxid_interval_unit-t.cpp) only need tap.h and the relevant project header — omitting test_globals.h and test_init.h is correct and intentional in these cases.

Applied to files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
📚 Learning: 2026-01-20T07:40:34.938Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:24-28
Timestamp: 2026-01-20T07:40:34.938Z
Learning: In ProxySQL test files, calling `mysql_error(NULL)` after `mysql_init()` failure is safe because the MariaDB client library implementation returns an empty string for NULL handles (not undefined behavior).

Applied to files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
📚 Learning: 2026-01-20T09:34:19.124Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:19.124Z
Learning: In ProxySQL's TAP test suite, resource leaks (e.g., not calling mysql_close() on early return paths) are commonly tolerated because test processes are short-lived and OS frees resources on exit. This pattern applies to all C++ test files under test/tap/tests. When reviewing, recognize this as a project-wide test convention and focus on test correctness and isolation rather than insisting on fixing such leaks in these test files.

Applied to files:

  • test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp
🔇 Additional comments (2)
test/tap/tests/test_read_only_actions_offline_hard_servers-t.cpp (2)

226-238: Good fix: default_hostgroup now follows runtime WHG in scenario 1.

This correctly aligns user routing with the writer hostgroup configured for the test environment and preserves legacy behavior when WHG=0.


396-408: Good fix: scenario 2 now consistently uses WHG for default_hostgroup.

Nice consistency with scenario 1; this removes the hardcoded hostgroup mismatch that was causing deterministic failures on mysql84-g4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant