hmon: add heartbeat monitor #67

Open
arkjedrz wants to merge 1 commit into eclipse-score:main from qorix-group:arkjedrz_heartbeat-monitor

Conversation

@arkjedrz
Contributor

@arkjedrz arkjedrz commented Feb 9, 2026

Add heartbeat monitor HMON.

Resolves #68

@arkjedrz arkjedrz requested a review from pawelrutkaq February 9, 2026 15:43
@arkjedrz arkjedrz requested a review from Copilot February 9, 2026 15:43
@github-actions
github-actions bot commented Feb 9, 2026

License Check Results

🚀 The license check job ran with the Bazel command:

bazel run //:license-check

Status: ⚠️ Needs Review

[License Check Output]
Extracting Bazel installation...
Starting local Bazel server (8.4.2) and connecting to it...
INFO: Invocation ID: 4e4f3c79-4cd8-4b5a-945e-796f786d15b2
WARNING: For repository 'score_rust_policies', the root module requires module version score_rust_policies@0.0.3, but got score_rust_policies@0.0.5 in the resolved dependency graph. Please update the version in your MODULE.bazel or set --check_direct_dependencies=off
Computing main repo mapping: 
Loading: 
Loading: 0 packages loaded
Analyzing: target //:license-check (1 packages loaded, 0 targets configured)
Analyzing: target //:license-check (44 packages loaded, 9 targets configured)
Analyzing: target //:license-check (109 packages loaded, 47 targets configured)
Analyzing: target //:license-check (148 packages loaded, 5209 targets configured)
Analyzing: target //:license-check (148 packages loaded, 5805 targets configured)
Analyzing: target //:license-check (159 packages loaded, 7809 targets configured)
Analyzing: target //:license-check (160 packages loaded, 7817 targets configured)
Analyzing: target //:license-check (163 packages loaded, 9705 targets configured)
Analyzing: target //:license-check (164 packages loaded, 9829 targets configured)
INFO: Analyzed target //:license-check (165 packages loaded, 9955 targets configured).
[14 / 16] [Prepa] JavaToolchainCompileBootClasspath external/rules_java+/toolchains/platformclasspath.jar
[15 / 16] Building license.check.license_check.jar (); 0s disk-cache, multiplex-worker
INFO: Found 1 target...
Target //:license.check.license_check up-to-date:
  bazel-bin/license.check.license_check
  bazel-bin/license.check.license_check.jar
INFO: Elapsed time: 25.236s, Critical Path: 2.82s
INFO: 16 processes: 12 internal, 3 processwrapper-sandbox, 1 worker.
INFO: Build completed successfully, 16 total actions
INFO: Running command line: bazel-bin/license.check.license_check ./formatted.txt <args omitted>
usage: org.eclipse.dash.licenses.cli.Main [-batch <int>] [-cd <url>]
       [-confidence <int>] [-ef <url>] [-excludeSources <sources>] [-help] [-lic
       <url>] [-project <shortname>] [-repo <url>] [-review] [-summary <file>]
       [-timeout <seconds>] [-token <token>]

@github-actions
github-actions bot commented Feb 9, 2026

The created documentation from the pull request is available at: docu-html

Copilot AI left a comment

Pull request overview

Adds a new heartbeat monitor (HMON) to the Rust health monitoring library and integrates it into the existing monitoring worker/supervisor notification flow.

Changes:

  • Introduces heartbeat module (monitor + atomic state) and integrates heartbeat monitors into HealthMonitorBuilder/HealthMonitor.
  • Updates the monitor evaluation interface to accept a shared hmon_starting_point and wires it through the monitoring worker thread.
  • Refactors SupervisorAPIClient into a dedicated module with selectable implementations via Cargo features.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/health_monitoring_lib/rust/worker.rs | Passes a shared HMON start instant into monitor evaluations; moves supervisor client trait out. |
| src/health_monitoring_lib/rust/supervisor_api_client/mod.rs | Adds feature-selected SupervisorAPIClient + implementation alias. |
| src/health_monitoring_lib/rust/supervisor_api_client/stub_supervisor_api_client.rs | New stub client implementation. |
| src/health_monitoring_lib/rust/supervisor_api_client/score_supervisor_api_client.rs | New SCORE client implementation. |
| src/health_monitoring_lib/rust/lib.rs | Adds heartbeat monitors to builder + start flow; uses new supervisor client impl selector. |
| src/health_monitoring_lib/rust/heartbeat/mod.rs | Exposes heartbeat monitor API. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_state.rs | Adds atomic packed heartbeat state and tests. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_monitor.rs | Implements heartbeat monitor logic + tests (incl. loom). |
| src/health_monitoring_lib/rust/deadline/deadline_monitor.rs | Adapts deadline evaluation to new evaluator signature and shared start instant. |
| src/health_monitoring_lib/rust/common.rs | Extends evaluation error types; adds duration_to_u32; updates evaluator trait signature. |
| src/health_monitoring_lib/Cargo.toml | Adds optional monitor_rs, loom target dep, and feature defaults. |
| src/health_monitoring_lib/BUILD | Enables score_supervisor_api_client feature in Bazel builds. |
| Cargo.toml | Updates workspace defaults and adds cfg(loom) lint configuration. |

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from f07c0db to e65f6a6 Compare February 10, 2026 11:43
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 11:43 — with GitHub Actions Inactive
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 11:44
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.


@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from e65f6a6 to bd438c3 Compare February 10, 2026 12:57
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 12:57 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from bd438c3 to c5f4b84 Compare February 10, 2026 13:14
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 13:15
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.


@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from c5f4b84 to b012f37 Compare February 10, 2026 13:19
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 13:19 — with GitHub Actions Inactive
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 13:19
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 4 comments.


@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from b012f37 to 862da21 Compare February 11, 2026 12:14
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 11, 2026 12:14 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 862da21 to 59c92ee Compare February 13, 2026 14:48
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 13, 2026 14:48 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 13, 2026 14:48 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 59c92ee to cf14efb Compare February 17, 2026 12:12
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 17, 2026 12:12 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 17, 2026 12:12 — with GitHub Actions Inactive
@arkjedrz arkjedrz self-assigned this Feb 17, 2026
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from cf14efb to 67fe6cc Compare February 25, 2026 09:30
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 25, 2026 13:08 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 806a239 to 50a33ae Compare February 26, 2026 07:13
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 50a33ae to 3365dc5 Compare February 26, 2026 10:42
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 10:42 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 10:42 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 3365dc5 to a6a4166 Compare February 26, 2026 12:39
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 12:39 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 12:39 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from a6a4166 to a1e6154 Compare February 26, 2026 13:57
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 13:57 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 26, 2026 13:57 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from a1e6154 to 656c5fe Compare March 2, 2026 12:21
@arkjedrz arkjedrz had a problem deploying to workflow-approval March 2, 2026 12:21 — with GitHub Actions Failure
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 656c5fe to 0014124 Compare March 2, 2026 14:12
@arkjedrz arkjedrz marked this pull request as ready for review March 2, 2026 14:13
@arkjedrz arkjedrz had a problem deploying to workflow-approval March 2, 2026 14:13 — with GitHub Actions Failure
@arkjedrz arkjedrz requested a review from Copilot March 2, 2026 14:13
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 18 changed files in this pull request and generated 7 comments.



/// Create range with values offset by timestamp.
fn offset(&self, timestamp: u32) -> Self {
    Self::new(self.min + timestamp, self.max + timestamp)

Copilot AI Mar 2, 2026

InternalRange::offset adds timestamp to min/max with unchecked u32 arithmetic. This can wrap silently in release builds near the u32 limit, causing wrong range comparisons. Use checked_add (and handle overflow explicitly) rather than +.

Suggested change
-        Self::new(self.min + timestamp, self.max + timestamp)
+        let min = self
+            .min
+            .checked_add(timestamp)
+            .expect("overflow when computing min offset in InternalRange::offset");
+        let max = self
+            .max
+            .checked_add(timestamp)
+            .expect("overflow when computing max offset in InternalRange::offset");
+        Self::new(min, max)

Comment on lines +71 to +72
assert!(value < 1 << 29, "provided heartbeat offset is out of range");
self.0 = ((value as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK);

Copilot AI Mar 2, 2026

set_heartbeat_timestamp_offset uses an unconditional assert! on the 29-bit limit. Since this can be reached from normal runtime behavior (e.g., long time between heartbeats), consider returning a Result/saturating instead of panicking, or ensure all callers validate inputs so this cannot trigger in production.

Suggested change
-        assert!(value < 1 << 29, "provided heartbeat offset is out of range");
-        self.0 = ((value as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK);
+        let clamped = min(value, (1 << 29) - 1);
+        self.0 = ((clamped as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK);
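The other option named in the comment, returning a `Result`, might look roughly like the sketch below. The bit layout is an assumption: the excerpt does not show the real values of `BEAT_OFFSET` and `BEAT_MASK`, so the offset is placed in the low 29 bits here purely for illustration. `BEAT_BITS`, `OffsetOutOfRange`, and `try_set_heartbeat_timestamp_offset` are hypothetical names.

```rust
// Assumed layout: the 29-bit heartbeat offset occupies the low bits of a u64.
// The real BEAT_OFFSET / BEAT_MASK values are not shown in the review excerpt.
const BEAT_BITS: u32 = 29;
const BEAT_OFFSET: u32 = 0;
const BEAT_MASK: u64 = ((1u64 << BEAT_BITS) - 1) << BEAT_OFFSET;

/// Hypothetical error type carrying the rejected value.
#[derive(Debug, PartialEq, Eq)]
struct OffsetOutOfRange(u32);

/// Stand-in for the PR's `HeartbeatStateSnapshot` (packed state in a u64).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct HeartbeatStateSnapshot(u64);

impl HeartbeatStateSnapshot {
    /// Read the heartbeat offset back out of the packed word.
    fn heartbeat_timestamp_offset(&self) -> u32 {
        ((self.0 & BEAT_MASK) >> BEAT_OFFSET) as u32
    }

    /// Fallible variant of the setter: rejects values that need more than
    /// 29 bits and leaves the stored state untouched in that case.
    fn try_set_heartbeat_timestamp_offset(&mut self, value: u32) -> Result<(), OffsetOutOfRange> {
        if value >= 1 << BEAT_BITS {
            return Err(OffsetOutOfRange(value));
        }
        self.0 = ((value as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK);
        Ok(())
    }
}
```

Compared with clamping, this keeps an out-of-range heartbeat visible to the caller rather than silently recording the maximum offset.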

Comment on lines +664 to +668
let hmon_starting_point = Instant::now();

// Wait until in range.
sleep_until(Duration::from_millis(50), hmon_starting_point);


Copilot AI Mar 2, 2026

These loom tests rely on wall-clock time (Instant::now(), elapsed()) and real sleeping (sleep_until -> std::thread::sleep) inside loom::model. This is typically non-deterministic and can make the model exploration extremely slow/flaky. Prefer loom-friendly tests that avoid real time (e.g., drive state transitions directly or gate out time-based tests from the loom configuration).

Suggested change
-        let hmon_starting_point = Instant::now();
-        // Wait until in range.
-        sleep_until(Duration::from_millis(50), hmon_starting_point);
+        // Simulate that 50 ms have already elapsed since the monitor starting point.
+        let hmon_starting_point = Instant::now() - Duration::from_millis(50);

Comment on lines +124 to +128
#[should_panic(expected = "HMON starting point is earlier than monitor starting point")]
fn hmon_time_offset_wrong_order() {
    let hmon_starting_point = Instant::now();
    let monitor_starting_point = Instant::now();
    let _offset = hmon_time_offset(hmon_starting_point, monitor_starting_point);

Copilot AI Mar 2, 2026

hmon_time_offset_wrong_order relies on two consecutive Instant::now() calls producing strictly increasing values. On platforms with coarse timer resolution the instants can be equal, so checked_duration_since returns Some(0) and the test won’t panic (flaky test). Make the ordering deterministic by deriving one instant from the other (e.g., monitor_starting_point = hmon_starting_point.checked_add(...) or vice versa).
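A minimal sketch of the deterministic variant follows. The body of `hmon_time_offset` is an assumption reconstructed from the comment (a `checked_duration_since` call plus the panic message quoted in the test); the real function may differ. `wrong_order_instants` is a hypothetical helper.

```rust
use std::convert::TryFrom;
use std::time::{Duration, Instant};

/// Assumed shape of the PR's `hmon_time_offset`: milliseconds from the
/// monitor starting point to the HMON starting point, panicking if HMON
/// started first.
fn hmon_time_offset(hmon_starting_point: Instant, monitor_starting_point: Instant) -> u32 {
    let offset = hmon_starting_point
        .checked_duration_since(monitor_starting_point)
        .expect("HMON starting point is earlier than monitor starting point");
    u32::try_from(offset.as_millis()).expect("offset does not fit in u32 milliseconds")
}

/// Hypothetical helper: returns (hmon, monitor) instants in the wrong order.
/// The second instant is derived from the first, so the ordering does not
/// depend on timer resolution.
fn wrong_order_instants() -> (Instant, Instant) {
    let hmon_starting_point = Instant::now();
    let monitor_starting_point = hmon_starting_point + Duration::from_millis(5);
    (hmon_starting_point, monitor_starting_point)
}
```

With instants built this way, the `#[should_panic]` test always sees an HMON starting point strictly earlier than the monitor starting point, regardless of platform clock granularity.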

Comment on lines +64 to +73
// Check range is valid.
let range_min_ms = self.range.min.as_millis() as u64;
let internal_processing_cycle_ms = internal_processing_cycle.as_millis() as u64;
if range_min_ms * 2 <= internal_processing_cycle_ms {
    error!(
        "Internal processing cycle duration ({} ms) must be longer than two shortest allowed ranges ({} ms).",
        internal_processing_cycle_ms, range_min_ms
    );
    return Err(HealthMonitorError::InvalidArgument);
}

Copilot AI Mar 2, 2026

HeartbeatStateSnapshot can only store a 29-bit heartbeat offset (max ~6.2 days in ms) and will panic if a heartbeat arrives after that (set_heartbeat_timestamp_offset asserts). HeartbeatMonitorBuilder::build should validate that the configured range (and any expected max time between heartbeats) fits this representation, and return InvalidArgument instead of allowing a runtime panic later.
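For reference, 2^29 ms is 536,870,912 ms, or roughly 6.2 days. A build-time check along the lines the comment asks for could be sketched as below. `validate_range_fits_offset` and `MAX_HEARTBEAT_OFFSET_MS` are hypothetical names, and `HealthMonitorError` is reduced to the single variant quoted in the excerpt.

```rust
use std::time::Duration;

/// Largest offset, in milliseconds, that fits in 29 bits (about 6.2 days).
const MAX_HEARTBEAT_OFFSET_MS: u128 = (1 << 29) - 1;

/// Reduced stand-in for the PR's error type; only the variant that appears
/// in the excerpt is modelled.
#[derive(Debug, PartialEq, Eq)]
enum HealthMonitorError {
    InvalidArgument,
}

/// Hypothetical build-time check: reject a configured upper bound that the
/// 29-bit offset field cannot represent.
fn validate_range_fits_offset(range_max: Duration) -> Result<(), HealthMonitorError> {
    if range_max.as_millis() > MAX_HEARTBEAT_OFFSET_MS {
        return Err(HealthMonitorError::InvalidArgument);
    }
    Ok(())
}
```

This only bounds the configured range. A heartbeat that arrives later than the representable limit at runtime would still need separate handling in the setter.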

fn evaluate(&self, hmon_starting_point: Instant, on_error: &mut dyn FnMut(&MonitorTag, MonitorEvaluationError)) {
    // Get current timestamp, with offset to HMON time.
    let offset = hmon_time_offset(hmon_starting_point, self.monitor_starting_point);
    let now = offset + duration_to_u32(hmon_starting_point.elapsed());

Copilot AI Mar 2, 2026

now is computed as offset + duration_to_u32(hmon_starting_point.elapsed()) using u32 addition. In release builds this can wrap silently once elapsed > u32::MAX - offset (i.e., earlier than the 49-day limit if offset != 0), leading to incorrect evaluations. Use checked_add (and fail/return an error) to avoid wraparound.

Suggested change
-        let now = offset + duration_to_u32(hmon_starting_point.elapsed());
+        let elapsed = duration_to_u32(hmon_starting_point.elapsed());
+        let now = match offset.checked_add(elapsed) {
+            Some(value) => value,
+            None => {
+                error!("Overflow while computing current timestamp in HeartbeatMonitorInner::evaluate");
+                return;
+            }
+        };

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 0014124 to a8b11fc Compare March 3, 2026 10:16
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 10:16 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 10:16 — with GitHub Actions Inactive
Add heartbeat monitor HMON.
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from a8b11fc to ef78463 Compare March 3, 2026 12:39
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 12:39 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 12:39 — with GitHub Actions Inactive
Development

Successfully merging this pull request may close these issues.

[HmLib] Rust Heartbeat Monitor API

2 participants