Implement libvirt HA/DR/FT engine with full failover/failback validation #20

Merged
dhslove merged 81 commits into ablecloud-team:main from dhslove:feature/vm-ha-dr-ft
Apr 17, 2026

@dhslove dhslove commented Apr 17, 2026

Summary

This PR introduces the libvirt-based ftctl engine for HA/DR/FT orchestration and carries it through real-environment validation.

Key outcomes:

  • HA full failover and full failback implemented and validated
  • DR full failover and full failback implemented and validated
  • FT file-based full failover and full failback implemented and validated
  • FT block-backed full failover and full failback implemented and validated
  • Packaging, install, spec, completion, and build workflows aligned for release
  • Validation logs and operational docs updated to match final implementation state

Major implementation areas

HA / DR

  • Added remote-NBD-based protect paths
  • Implemented full failback for HA/DR with reverse sync and cutback completion
  • Added IPMI fencing support and validation
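As an illustration of what an IPMI fencing call can look like, here is a minimal sketch that only builds an `ipmitool` command line. The PR does not state which tool or transport ftctl uses, so the use of `ipmitool` with the `lanplus` interface is an assumption, and `build_ipmi_fence_argv` is a hypothetical helper, not a function from this repository.

```python
from typing import List


def build_ipmi_fence_argv(bmc_host: str, user: str, action: str = "off") -> List[str]:
    """Build an ipmitool argv that queries or changes chassis power on a BMC.

    Hypothetical helper for illustration; not taken from ftctl.
    """
    allowed = {"off", "on", "cycle", "status"}
    if action not in allowed:
        raise ValueError(f"unsupported fence action: {action!r}")
    # -I lanplus selects IPMI v2.0 over LAN.
    # -E makes ipmitool read the password from the IPMI_PASSWORD
    # environment variable, so it never appears in the process list.
    return [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host, "-U", user, "-E",
        "chassis", "power", action,
    ]
```

A caller would typically pass the list to `subprocess.run(..., check=True)` and then issue a `status` query to confirm that the node is really powered off before promoting the standby.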

FT file-based

  • Implemented full failback
  • Added preflight size validation for prebuilt file-based FT pairs
  • Tightened x-colo runtime verification to avoid false-positive colo_running
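The preflight size validation for prebuilt pairs can be pictured roughly as below. This is a sketch under assumptions, not the ftctl implementation: `preflight_pair_sizes` is a hypothetical name, and it compares on-disk byte sizes, which equals the guest-visible size only for raw images. For qcow2 images one would compare the virtual size reported by `qemu-img info` instead.

```python
import os


def preflight_pair_sizes(primary_path: str, secondary_path: str) -> int:
    """Return the common size in bytes, or raise if the two images differ.

    Hypothetical helper; assumes raw image files so that file size
    equals the disk size presented to the guest.
    """
    primary = os.stat(primary_path).st_size
    secondary = os.stat(secondary_path).st_size
    if primary != secondary:
        raise ValueError(
            f"FT pair size mismatch: primary={primary} bytes, "
            f"secondary={secondary} bytes"
        )
    return primary
```

Failing before protection starts keeps a mismatched pair from surfacing later as a replication error in the middle of a failover.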

FT block-backed

  • Implemented cold conversion protect flow
  • Implemented cold-cutback failback flow
  • Added explicit block-backed size mismatch handling
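For block-backed disks, `stat` typically reports a size of 0 for the device node on Linux, so a size check has to measure the device differently. The sketch below seeks to the end of the device and raises a dedicated error on mismatch. `SizeMismatchError`, `device_size_bytes`, and `ensure_same_size` are hypothetical names, and requiring exact equality (rather than "target at least as large") is an assumption about what the PR means by mismatch.

```python
import os


class SizeMismatchError(Exception):
    """Raised when source and target devices differ in size (hypothetical)."""

    def __init__(self, source_bytes: int, target_bytes: int) -> None:
        super().__init__(
            f"source is {source_bytes} bytes, target is {target_bytes} bytes"
        )
        self.source_bytes = source_bytes
        self.target_bytes = target_bytes


def device_size_bytes(path: str) -> int:
    """Measure a block device (or regular file) by seeking to its end."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def ensure_same_size(source_path: str, target_path: str) -> int:
    """Return the shared size, or raise SizeMismatchError with both sizes."""
    src = device_size_bytes(source_path)
    dst = device_size_bytes(target_path)
    if src != dst:
        raise SizeMismatchError(src, dst)
    return src
```

Carrying both sizes on the exception lets the caller log or report the exact discrepancy instead of a generic conversion failure.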

Validation and docs

  • Updated failover/failback documentation and runbook
  • Updated profile schema to reflect actual supported modes
  • Updated test execution log and test-id breakdown
  • Reclassified OP-ST-01 under shared-outage handling criteria after final agreement

Validation summary

Validated in real environments:

  • HA failover/failback: PASS
  • DR failover/failback: PASS
  • FT file-based failover/failback: PASS
  • FT block-backed failover/failback: PASS
  • ST/LV operational scenarios documented with final outcomes

GitHub Actions:

  • release build: PASS
  • WinPE release build: PASS

Notes

  • README.md and INSTALL.md were intentionally left out of this PR as non-blocking follow-up documentation work.
  • OP-ST-01 is evaluated as handling of a total shared-storage outage, not of host-local path degradation.

dhslove added 30 commits March 27, 2026 18:48
dhslove added 29 commits April 11, 2026 19:29
@dhslove dhslove merged commit ebe1c2a into ablecloud-team:main Apr 17, 2026
1 check passed