Summary
When writer_is_also_reader=1 is configured in mysql_aws_aurora_hostgroups, the writer instance is not added to the reader_hostgroup during normal steady-state operation. The writer only appears in the reader_hostgroup after a failover event triggers update_aws_aurora_set_writer().
This means that if the only dedicated reader goes down, reader_hostgroup becomes empty and read queries fail — defeating the purpose of writer_is_also_reader=1 as a read fallback.
Environment
- ProxySQL version: 3.0.7
- Aurora MySQL: Serverless v2 (8.0.44 compatible)
- Deployment: Kubernetes StatefulSet (3 replicas) with
mysql_aws_aurora_hostgroups autodiscovery
- Aurora topology: 1 writer + 1 reader (tested), also reproducible with 1 writer + 2 readers
Configuration
-- mysql_aws_aurora_hostgroups
INSERT INTO mysql_aws_aurora_hostgroups (
writer_hostgroup, reader_hostgroup, active, aurora_port,
domain_name, max_lag_ms, check_interval_ms, check_timeout_ms,
writer_is_also_reader, new_reader_weight
) VALUES (
10, 20, 1, 3306,
'.abc123def456.us-east-1.rds.amazonaws.com',
30000, 1000, 3000,
1, 1
);
-- mysql_servers (seeds in writer hostgroup only)
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
(10, 'myapp-writer.abc123def456.us-east-1.rds.amazonaws.com', 3306),
(10, 'myapp-reader.abc123def456.us-east-1.rds.amazonaws.com', 3306);
LOAD MYSQL SERVERS TO RUNTIME;
Expected Behavior
After autodiscovery runs and identifies the writer and reader:
runtime_mysql_servers:
HG10 (writer): myapp-writer ONLINE
HG20 (reader): myapp-reader ONLINE
HG20 (reader): myapp-writer ONLINE ← writer as fallback reader
The writer should appear in both writer_hostgroup (10) and reader_hostgroup (20), as documented:
"If mysql_aws_aurora_hostgroups.writer_is_also_reader is enabled, then the writer is also configured in the reader hostgroup."
Actual Behavior
runtime_mysql_servers:
HG10 (writer): myapp-writer ONLINE
HG20 (reader): myapp-reader ONLINE
The writer is only in HG10. It does not appear in HG20 despite writer_is_also_reader=1.
Root Cause Analysis
We traced the issue through the source code (MySQL_HostGroups_Manager.cpp):
-
aws_aurora_replication_lag_action() (line ~6591): During steady-state monitoring, this function only shuns/enables servers based on lag. It does not add or remove servers from hostgroups. It returns a boolean indicating whether the server is in the correct hostgroup.
-
update_aws_aurora_set_writer() (line ~6775): This function contains the writer_is_also_reader logic that adds the writer to the reader hostgroup. However, it has a short-circuit condition (lines ~6836-6845):
if (found_writer) {
if (
(writer_is_also_reader==0 && found_reader==false)
||
(writer_is_also_reader > 0 && found_reader==true)
) {
// already correct, no action needed
delete resultset;
resultset=NULL;
}
}
During initial boot, the writer is found_writer=true (it's a seed in HG10) but found_reader=false (no seed in HG20). With writer_is_also_reader=1, the condition (writer_is_also_reader > 0 && found_reader==true) is false, so the function should proceed to the rebuild phase. However, this function is only called when aws_aurora_replication_lag_action() returns false — i.e., when it detects the server is in the wrong hostgroup. Since the writer IS in the writer hostgroup (HG10), aws_aurora_replication_lag_action() returns true, and update_aws_aurora_set_writer() is never called.
-
create_new_server_in_hg() (line ~6946): This path correctly handles writer_is_also_reader:
int wr_res = create_new_server_in_hg(_writer_hostgroup, srv_info, wr_srv_opts);
if (writer_is_also_reader && read_HG >= 0) {
rd_res = create_new_server_in_hg(read_HG, srv_info, rd_srv_opts);
}
But this path is only taken when the server is completely new (not found in any hostgroup). Since the writer is a seed in HG10, it's already known, so this path is never reached.
The Gap
The writer_is_also_reader check exists in two code paths:
update_aws_aurora_set_writer(): Has the logic but is only called during failover (topology change)
create_new_server_in_hg(): Has the logic but is only called for new servers (not seeds)
Neither path is reached during steady-state operation when the writer is already a seed in the writer hostgroup. The monitoring loop (aws_aurora_replication_lag_action()) sees the writer in HG10, considers it correct, and never triggers either function.
Workaround
Pre-seed the writer in both hostgroups:
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
(10, 'myapp-writer.abc123def456.us-east-1.rds.amazonaws.com', 3306),
(20, 'myapp-writer.abc123def456.us-east-1.rds.amazonaws.com', 3306);
This is fragile because the writer identity can change after a failover, requiring the seed configuration to be updated.
Suggested Fix
After the initial autodiscovery cycle identifies the writer, check if writer_is_also_reader > 0 and the writer is missing from the reader_hostgroup. If so, call the insertion logic from update_aws_aurora_set_writer() or create_new_server_in_hg() to add the writer to the reader hostgroup.
This could be done in evaluate_aws_aurora_results() after the topology is determined, or as an additional check in aws_aurora_replication_lag_action().
Impact
- Read queries fail when the only dedicated reader goes down, even though
writer_is_also_reader=1 was specifically configured to prevent this
- The documented behavior does not match the actual behavior
- Users must use fragile workarounds (pre-seeding writer in reader hostgroup) that break after failover
Steps to Reproduce
- Configure
mysql_aws_aurora_hostgroups with writer_is_also_reader=1
- Seed only the writer hostgroup in
mysql_servers
LOAD MYSQL SERVERS TO RUNTIME
- Wait for autodiscovery to complete (10+ seconds)
- Check
runtime_mysql_servers — writer will only be in the writer hostgroup
Summary
When
writer_is_also_reader=1is configured inmysql_aws_aurora_hostgroups, the writer instance is not added to thereader_hostgroupduring normal steady-state operation. The writer only appears in thereader_hostgroupafter a failover event triggersupdate_aws_aurora_set_writer().This means that if the only dedicated reader goes down,
reader_hostgroupbecomes empty and read queries fail — defeating the purpose ofwriter_is_also_reader=1as a read fallback.Environment
mysql_aws_aurora_hostgroupsautodiscoveryConfiguration
Expected Behavior
After autodiscovery runs and identifies the writer and reader:
The writer should appear in both
writer_hostgroup(10) andreader_hostgroup(20), as documented:Actual Behavior
The writer is only in HG10. It does not appear in HG20 despite
writer_is_also_reader=1.Root Cause Analysis
We traced the issue through the source code (
MySQL_HostGroups_Manager.cpp):aws_aurora_replication_lag_action()(line ~6591): During steady-state monitoring, this function only shuns/enables servers based on lag. It does not add or remove servers from hostgroups. It returns a boolean indicating whether the server is in the correct hostgroup.update_aws_aurora_set_writer()(line ~6775): This function contains thewriter_is_also_readerlogic that adds the writer to the reader hostgroup. However, it has a short-circuit condition (lines ~6836-6845):During initial boot, the writer is
found_writer=true(it's a seed in HG10) butfound_reader=false(no seed in HG20). Withwriter_is_also_reader=1, the condition(writer_is_also_reader > 0 && found_reader==true)is false, so the function should proceed to the rebuild phase. However, this function is only called whenaws_aurora_replication_lag_action()returnsfalse— i.e., when it detects the server is in the wrong hostgroup. Since the writer IS in the writer hostgroup (HG10),aws_aurora_replication_lag_action()returnstrue, andupdate_aws_aurora_set_writer()is never called.create_new_server_in_hg()(line ~6946): This path correctly handleswriter_is_also_reader:But this path is only taken when the server is completely new (not found in any hostgroup). Since the writer is a seed in HG10, it's already known, so this path is never reached.
The Gap
The
writer_is_also_readercheck exists in two code paths:update_aws_aurora_set_writer(): Has the logic but is only called during failover (topology change)create_new_server_in_hg(): Has the logic but is only called for new servers (not seeds)Neither path is reached during steady-state operation when the writer is already a seed in the writer hostgroup. The monitoring loop (
aws_aurora_replication_lag_action()) sees the writer in HG10, considers it correct, and never triggers either function.Workaround
Pre-seed the writer in both hostgroups:
This is fragile because the writer identity can change after a failover, requiring the seed configuration to be updated.
Suggested Fix
After the initial autodiscovery cycle identifies the writer, check if
writer_is_also_reader > 0and the writer is missing from thereader_hostgroup. If so, call the insertion logic fromupdate_aws_aurora_set_writer()orcreate_new_server_in_hg()to add the writer to the reader hostgroup.This could be done in
evaluate_aws_aurora_results()after the topology is determined, or as an additional check inaws_aurora_replication_lag_action().Impact
writer_is_also_reader=1was specifically configured to prevent thisSteps to Reproduce
mysql_aws_aurora_hostgroupswithwriter_is_also_reader=1mysql_serversLOAD MYSQL SERVERS TO RUNTIMEruntime_mysql_servers— writer will only be in the writer hostgroup