feat(intergral/alerting): HA partitioning #57

Draft
jhawksley-intergral wants to merge 10 commits into 12.3.x-intergral from feat/alerting-partitioner
@jhawksley-intergral

What is this feature?

[Add a brief description of what the feature or update does.]

Why do we need this feature?

[Add a description of the problem the feature is trying to solve.]

Who is this feature for?

[Add information on what kind of user the feature is for.]

Which issue(s) does this PR fix?:

Fixes #

Special notes for your reviewer:

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

- Add dynamicMockPeer with mutable membership for topology change simulation
- Add unit tests: member joins, member leaves, healthy↔unhealthy transitions, rapid changes
- Add fetcher integration tests: topology change triggers re-fetch, stable topology skips
- Add withPartitioner option to setupScheduler for injecting partitioner in tests
- Fix docker_build.yml: sanitize ref_name for valid Docker tags (replace / with -)
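
For reviewers who want a picture of the "mutable membership" idea in the first bullet, here is a minimal Go sketch of a peer test double whose topology can change mid-test. It is an illustration only: the method names `Join`, `Leave`, `SetHealthy` and `HealthyMembers`, and the constructor `newDynamicMockPeer`, are assumptions and may not match the `dynamicMockPeer` in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// dynamicMockPeer is a hypothetical test double whose membership can change
// while a test runs. The type in the PR may expose a different API.
type dynamicMockPeer struct {
	mu      sync.RWMutex
	members map[string]bool // member name -> healthy
}

// newDynamicMockPeer starts with the given members, all healthy.
func newDynamicMockPeer(names ...string) *dynamicMockPeer {
	p := &dynamicMockPeer{members: make(map[string]bool)}
	for _, n := range names {
		p.members[n] = true
	}
	return p
}

// Join adds a member and marks it healthy.
func (p *dynamicMockPeer) Join(name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.members[name] = true
}

// Leave removes a member entirely.
func (p *dynamicMockPeer) Leave(name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.members, name)
}

// SetHealthy flips the health of a known member; unknown names are ignored.
func (p *dynamicMockPeer) SetHealthy(name string, healthy bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, ok := p.members[name]; ok {
		p.members[name] = healthy
	}
}

// HealthyMembers returns the healthy member names in sorted order, so two
// calls against the same topology always agree.
func (p *dynamicMockPeer) HealthyMembers() []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	out := make([]string, 0, len(p.members))
	for name, healthy := range p.members {
		if healthy {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	peer := newDynamicMockPeer("a", "b")
	peer.Join("c")              // member joins
	peer.SetHealthy("b", false) // healthy -> unhealthy
	fmt.Println(peer.HealthyMembers())
	peer.SetHealthy("b", true) // unhealthy -> healthy
	peer.Leave("a")            // member leaves
	fmt.Println(peer.HealthyMembers())
}
```

Returning a sorted slice keeps the view deterministic, which makes "stable topology skips re-fetch" style assertions easier to write.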

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove extra field alignment spaces in SchedulerCfg (goimports)
- Remove ineffectual assignment to peers in topology-change test (ineffassign)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comment to docker_build.yml to trigger a fresh CI build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ed rules

- Add RuleFilter interface to state package, satisfied by schedule.RulePartitioner
- Add refreshRemoteStates() to periodically load DB state for non-local rules into cache
- Start background sync loop in Manager.Run() when partitioning is enabled
- Add ha_scheduler_remote_state_sync_interval config option (default 30s)
- Wire partitioner as RuleFilter in ngalert.go
- Add tests for remote refresh, stale cleanup, topology change, and context cancellation
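
The remote state sync described in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `IsLocal` method on `RuleFilter`, the `stateStore` interface, the `remoteCache` type and the helper `runRemoteStateSync` are all assumptions made for the example, and states are reduced to plain strings.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// RuleFilter is named in the PR; the IsLocal method is an assumption.
type RuleFilter interface {
	IsLocal(ruleUID string) bool
}

// localSet is a toy RuleFilter: the rule UIDs owned by this instance.
type localSet map[string]bool

func (s localSet) IsLocal(ruleUID string) bool { return s[ruleUID] }

// stateStore is a hypothetical stand-in for the persisted state in the DB.
type stateStore interface {
	LoadStates() map[string]string // rule UID -> state name
}

type mapStore map[string]string

func (m mapStore) LoadStates() map[string]string { return m }

// remoteCache holds the last known state of rules owned by other instances.
type remoteCache struct {
	mu     sync.RWMutex
	states map[string]string
}

func newRemoteCache() *remoteCache {
	return &remoteCache{states: map[string]string{}}
}

func (c *remoteCache) Get(ruleUID string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.states[ruleUID]
	return s, ok
}

func (c *remoteCache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.states)
}

// refreshRemoteStates rebuilds the cache from the store, keeping only rules
// that are not local. Swapping in a fresh map also drops stale entries, for
// example rules that became local after a topology change.
func refreshRemoteStates(filter RuleFilter, store stateStore, cache *remoteCache) {
	fresh := make(map[string]string)
	for uid, state := range store.LoadStates() {
		if !filter.IsLocal(uid) {
			fresh[uid] = state
		}
	}
	cache.mu.Lock()
	cache.states = fresh
	cache.mu.Unlock()
}

// runRemoteStateSync calls refresh on every tick until ctx is cancelled.
func runRemoteStateSync(ctx context.Context, interval time.Duration, refresh func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			refresh()
		}
	}
}

func main() {
	cache := newRemoteCache()
	store := mapStore{"r1": "Normal", "r2": "Alerting"}
	filter := localSet{"r1": true}

	// Short interval for the demo; the PR's config option defaults to 30s.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	runRemoteStateSync(ctx, 10*time.Millisecond, func() {
		refreshRemoteStates(filter, store, cache)
	})
	state, ok := cache.Get("r2")
	fmt.Println(state, ok)
}
```

Rebuilding the map on each pass trades a little allocation for simple stale cleanup; an incremental update would need explicit deletion of rules that moved to this instance.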

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add package doc comment to state manager

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Skip orgs with no alert rules in refreshRemoteStates loop
- Downgrade partitioner "HA partitioning applied" log from Info to Debug

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…te rules

When HA partitioning is enabled, the scheduler only knows about locally
assigned rules. API requests hitting a non-owning instance returned blank
LastEvaluation because the scheduler had no status for remote rules.

Now falls back to StatesToRuleStatus() using cached state from the remote
state sync, so the UI shows correct "Next Evaluation" regardless of which
instance serves the request.
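
The fallback described above can be pictured with the following Go sketch. It is a simplified stand-in under stated assumptions: `RuleStatus`, `cachedState`, `statesToRuleStatus`, `ruleStatus` and `sampleCachedStates` are hypothetical names for this example, and the real `StatesToRuleStatus()` may compute more than the latest evaluation time.

```go
package main

import (
	"fmt"
	"time"
)

// RuleStatus is a reduced, hypothetical version of what the API reports.
type RuleStatus struct {
	LastEvaluation time.Time
}

// cachedState is a hypothetical cached alert instance for a rule.
type cachedState struct {
	LastEvaluatedAt time.Time
}

// statesToRuleStatus is a simplified stand-in for StatesToRuleStatus():
// it reports the most recent evaluation time across the cached states.
func statesToRuleStatus(states []cachedState) RuleStatus {
	var status RuleStatus
	for _, s := range states {
		if s.LastEvaluatedAt.After(status.LastEvaluation) {
			status.LastEvaluation = s.LastEvaluatedAt
		}
	}
	return status
}

// ruleStatus prefers the scheduler's own status (locally owned rules) and
// falls back to the cached state for rules owned by another instance.
func ruleStatus(uid string, scheduler map[string]RuleStatus, cache map[string][]cachedState) RuleStatus {
	if status, ok := scheduler[uid]; ok {
		return status
	}
	return statesToRuleStatus(cache[uid])
}

// sampleCachedStates returns fixed demo data: two states evaluated 30s apart.
func sampleCachedStates() []cachedState {
	base := time.Date(2024, time.January, 1, 12, 0, 0, 0, time.UTC)
	return []cachedState{
		{LastEvaluatedAt: base},
		{LastEvaluatedAt: base.Add(30 * time.Second)},
	}
}

func main() {
	scheduler := map[string]RuleStatus{} // this instance owns no rules
	cache := map[string][]cachedState{"remote-rule": sampleCachedStates()}

	fmt.Println(ruleStatus("remote-rule", scheduler, cache).LastEvaluation)
	fmt.Println(ruleStatus("unknown-rule", scheduler, cache).LastEvaluation.IsZero())
}
```

In this sketch a rule that is neither scheduled locally nor present in the cache still yields a zero `LastEvaluation`, which matches the blank value seen before the remote state sync has run.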

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>