feat(dovecot-charm): add HA support with SSH key exchange and force-sync action#15
feat(dovecot-charm): add HA support with SSH key exchange and force-sync action#15alithethird wants to merge 41 commits intomainfrom
Conversation
- Updated unit tests for DovecotCharm to use new setup methods for Dovecot and Procmail. - Replaced calls to _systemctl with _setup_dovecot and _setup_procmail in certificate tests. - Ensured service reload is properly mocked and verified in tests. - Added new dependency on charmlibs-interfaces-tls-certificates version 1.8.1 in uv.lock.
When no certificates relation is active the TLS cert files do not exist yet. Unconditionally setting ssl=required in the dovecot config caused dovecot to fail to start, putting the unit in error on every upgrade-charm or install event before a cert is provisioned. Template now checks tls_enabled (passed from charm, true iff the .pem file exists in /etc/dovecot/private/) and falls back to ssl=yes when no cert is present. ssl=required is restored automatically on the next _reconcile after certificate_available writes the cert to disk. Also register --use-existing pytest option in tests/conftest.py so it can be passed on the command line without an 'unrecognised arguments' error.
- Replace _on_certificate_available with _setup_tls called from _reconcile - Wire certificate_available event to _reconcile (not separate handler) - Always ssl=required in dovecot.conf (no conditional) - Add TLS_CERT_DIR constant, use get_assigned_certificate() API - Charm blocks with distinct messages for missing relation vs missing cert - New tests/unit/test_tls.py with 6 tests following SKILL.md principles - Fix integration tests: copyright, sleeps, stat quoting, ssl assertion - Update state diagrams for TLS states and events
…nt deployment logic
There was a problem hiding this comment.
license-eye has checked 117 files.
| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 41 | 1 | 75 | 0 |
Click to see the invalid file list
- docs/release-notes/artifacts/pr-4-ha.yaml
Use this command to fix any missing license headers
```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
</details>
…redundant imports
- Move all HA setup (SSH keys, authorized_keys, sync script, cronjob) into _reconcile so they're re-evaluated on every event, not just install - Replace os.system calls with subprocess.run and systemd.service_reload - Replace sed-based sshd_config mutation with pure Python - Guard _sync_authorized_keys against missing peer relation - Skip sync script/cronjob when secondary hostname is not yet known - Remove dead code (sync_smtp_aliases_target) - Remove _on_replicas_changed handler — folded into _reconcile - Fix _setup_ssh_keys to handle keygen failure gracefully - Rewrite unit tests per SKILL.md principles: assert on observable state (unit_status, opened_ports), comment every patch - Add HA patches to storage and TLS tests broken by holistic reconcile - Fix integration test copyright year and force-sync empty-maildir bug
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…on template logging
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ate cron schedule
…allowed characters
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…and update related tests
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| authorized_keys = [] | ||
| peer_ips: list[str] = [] | ||
| for unit in relation.units: | ||
| pk = relation.data[unit].get("public_key") | ||
| if pk: | ||
| authorized_keys.append(pk) | ||
| ip = relation.data[unit].get("ip_address") | ||
| if ip: | ||
| peer_ips.append(ip) | ||
|
|
||
| our_pk = relation.data[charm.unit].get("public_key") | ||
| if our_pk: | ||
| authorized_keys.append(our_pk) | ||
| our_ip = relation.data[charm.unit].get("ip_address") | ||
| if our_ip: | ||
| peer_ips.append(our_ip) | ||
|
|
||
| if not authorized_keys: | ||
| return | ||
|
|
||
| auth_file = SSH_DIR / "authorized_keys" | ||
| auth_file.write_text("\n".join(authorized_keys) + "\n") | ||
| auth_file.chmod(0o600) | ||
|
|
||
| ensure_root_ssh_login(peer_ips) | ||
|
|
There was a problem hiding this comment.
sync_authorized_keys appends this unit’s own ip_address/public_key to the lists that are later used to restrict root SSH (ensure_root_ssh_login(peer_ips)). This means the sshd drop-in will be written even when there are no remote peers yet, and it will allow root SSH from the unit’s own address (which isn’t a peer). Consider passing only remote peer IPs (e.g., from relation.units) to ensure_root_ssh_login, and only enabling the drop-in once at least one remote peer IP is present.
| def _secondary_hostname(self) -> typing.Optional[str]: | ||
| """Return the hostname of the first remote peer unit, or None.""" | ||
| relation = self.model.get_relation(PEER_RELATION_NAME) | ||
| if not relation: | ||
| return None | ||
| for unit in relation.units: | ||
| hostname = relation.data[unit].get("hostname") | ||
| if hostname: | ||
| return hostname | ||
| return None |
There was a problem hiding this comment.
_secondary_hostname reads the peer's hostname field (populated via socket.gethostname() in ha.setup_ssh_keys). That hostname is not guaranteed to be resolvable from the other unit, while you already publish ip_address on the peer relation. Consider switching this property (and the sync script template context) to use the peer’s published ip_address (or another Juju-resolvable address) and keep sync_known_hosts in sync with whatever identifier SSH will actually connect to.
Summary
force-syncaction to trigger immediate mail pool synchronisation from primary to secondary on demandpr/3-tlsbranch (resolved conflicts from TLS refactor)Changes
charm.py: adds_setup_ssh_keys,_on_replicas_changed,_ensure_root_ssh_configs,_install_mail_sync_script,_setup_mail_sync_cronjob,_on_force_sync,_secondary_hostname, and_is_primarycharmcraft.yaml: addsforce-syncactiontemplates/: addssync-to-secondary.sh.tmplandsync-to-secondary_cron.tmpltests/unit/test_charm.py: adds unit tests for HA functionalitytests/integration/test_ha.py: adds integration test for HA failoverdocs/: adds release notes for pr/4-ha