
mysql-k8s r400: logging-relation-broken hook fails with KeyError 'logs_synced' on scale-down #202

@dmvdm

Description

Note: This issue was generated with AI assistance (GitHub Copilot) based on automated log analysis and triage.
Filed by @canonical/solutions-qa

Summary

mysql-k8s charm revision 400 fails scale-down operations due to an unhandled KeyError in the logging-relation-broken hook handler.

Root Cause

The _cos_relation_broken() handler in src/log_rotation_setup.py attempts, at line 116, to delete a dictionary key that doesn't always exist:

del self.charm.unit_peer_data["logs_synced"]  # ← KeyError when key doesn't exist
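The failure mode can be illustrated with a plain dict standing in for unit_peer_data (an assumption for illustration; the real object is a Juju peer-relation data bag, but it supports the same mapping operations):

```python
# Plain dict standing in for self.charm.unit_peer_data.
peer_data = {}  # "logs_synced" was never written in this scenario

try:
    del peer_data["logs_synced"]  # mirrors line 116 of log_rotation_setup.py
except KeyError as exc:
    print(f"KeyError: {exc}")  # the same uncaught exception seen in the hook
```

Unconditional `del` on a mapping raises KeyError whenever the key is absent, which is exactly what the traceback below shows.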

Exception Traceback:

File "/var/lib/juju/agents/unit-target-0/charm/src/log_rotation_setup.py", line 116, in _cos_relation_broken
  del self.charm.unit_peer_data["logs_synced"]
KeyError: 'logs_synced'

Impact

  • Scale-down operations fail - units cannot be removed
  • Unit enters error state: target/0* error idle
  • Juju continuously retries the failing hook (10+ retries observed)
  • Integration tests timeout after 10 minutes waiting for unit removal
  • Blocks deployment and testing of mysql-k8s charm

Test Failure Details

  • Failed Test: test_scale_in_and_scale_out_charm
  • Execution IDs: 443474, 443472 (consistent failure)
  • Charm: mysql-k8s
  • Revision: 400
  • Channel: 8.0/candidate
  • Failure Rate: 100% (regression in this revision)
  • Error: JujuWaitTimeoutError: Timed out while waiting for unit removal

Evidence from Juju Debug Logs

Hook Execution Failure (repeats 10+ times):

unit-target-0: 2026-03-27 11:41:49 ERROR juju-log logging:5: Uncaught exception while in charm code:
unit-target-0: 2026-03-27 11:41:49 Traceback (most recent call last):
unit-target-0:   File "/var/lib/juju/agents/unit-target-0/charm/venv/lib/python3.10/site-packages/ops/framework.py", line 1030, in _reemit
unit-target-0:     custom_handler(event)
unit-target-0:   File "/var/lib/juju/agents/unit-target-0/charm/src/log_rotation_setup.py", line 116, in _cos_relation_broken
unit-target-0:     del self.charm.unit_peer_data["logs_synced"]
unit-target-0: KeyError: 'logs_synced'
unit-target-0: 2026-03-27 11:41:50 ERROR juju.worker.uniter.operation hook "logging-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1
unit-target-0: 2026-03-27 11:41:50 INFO juju.worker.uniter resolver.go:180 awaiting error resolution for "relation-broken" hook

Unit Status at Failure:

App: target (mysql-k8s)
Unit: target/0
Status: error idle
Message: hook failed: "logging-relation-broken" for neighbor:logging

Regression Analysis

  • Revision 400 (current): ✗ FAILS (100% failure rate)
  • Revision 385 (previous): ✓ PASSES (test history shows passing)
  • Conclusion: Bug introduced in revision 400

Reproduction Steps

  1. Deploy the mysql-k8s charm (revision 400, channel 8.0/candidate)
  2. Deploy a charm that provides the logging interface (e.g., loki-k8s)
  3. Relate mysql-k8s:logging to loki-k8s:logging
  4. Scale mysql-k8s down from 1 to 0 units
  5. Expected: the unit is removed cleanly
  6. Actual: the logging-relation-broken hook raises KeyError and the unit enters an error state
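The steps above roughly correspond to the following Juju CLI session (a sketch using Juju 3.x command names; exact flags and model setup may differ in your environment):

```shell
juju deploy mysql-k8s --channel 8.0/candidate --revision 400 --trust
juju deploy loki-k8s --trust
juju integrate mysql-k8s:logging loki-k8s:logging
# Scaling the application to zero removes the unit and fires
# logging-relation-broken, which hits the unguarded del.
juju scale-application mysql-k8s 0
```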

Recommended Fix

Use defensive dictionary access instead of unconditional delete:

Option 1 (Recommended - cleaner):

# In _cos_relation_broken() handler
self.charm.unit_peer_data.pop("logs_synced", None)

Option 2 (Guard clause):

# In _cos_relation_broken() handler
if "logs_synced" in self.charm.unit_peer_data:
    del self.charm.unit_peer_data["logs_synced"]

The root issue is that the code assumes the logs_synced key exists, when it may never have been written in some scenarios (e.g., if the logging relation is broken before logs_synced is set).
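A minimal regression check for either fix, using a plain dict as a stand-in for unit_peer_data (a hypothetical harness for illustration, not the charm's actual test suite):

```python
def cos_relation_broken(unit_peer_data: dict) -> None:
    """Stand-in for _cos_relation_broken(): drop the flag if present."""
    unit_peer_data.pop("logs_synced", None)  # no KeyError when the key is absent


# Key present: removed cleanly.
data = {"logs_synced": "true"}
cos_relation_broken(data)
assert "logs_synced" not in data

# Key absent (relation broken before logs_synced was ever written):
# still succeeds, where the old unconditional `del` raised KeyError.
cos_relation_broken({})
```

Both options are idempotent, so the hook also survives Juju's retries after a partial run.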

Test Observer Link

View the failure with complete juju logs:
https://test-observer.canonical.com/#/charms/406078?testExecutionId=443474&testResultId=10110772

Related Files

  • Source: src/log_rotation_setup.py (line 116)
  • Test: charm-integration-testing/test_scale_in_and_scale_out_charm
  • Charm: canonical/mysql-operators (mysql-k8s package)

Metadata

Labels: bug