hmon: add C++ interface to logic monitor#98
hmon: add C++ interface to logic monitor#98arkjedrz wants to merge 4 commits intoeclipse-score:mainfrom
Conversation
License Check Results🚀 The license check job ran with the Bazel command: bazel run //:license-checkStatus: Click to expand output |
|
The created documentation from the pull request is available at: docu-html |
There was a problem hiding this comment.
Pull request overview
This PR extends the health monitoring library with new monitor types (logic + heartbeat) and exposes them through the C++ API/FFI, while refactoring the Rust worker/evaluation plumbing and supervisor client selection to support multiple monitor kinds.
Changes:
- Add Rust
LogicMonitor+HeartbeatMonitorimplementations and integrate them intoHealthMonitorBuilder/HealthMonitor(including evaluation + error typing). - Add Rust FFI bindings and C++ wrappers/headers for heartbeat + logic monitors, and update the C++ health monitor API/tests accordingly.
- Refactor supervisor API client into a dedicated Rust module with feature-gated implementations and update the worker loop to pass a shared starting point into monitor evaluation.
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| src/health_monitoring_lib/rust/worker.rs | Refactors evaluation loop for new error variants and passes hmon_starting_point into monitor evaluation; moves supervisor client trait out. |
| src/health_monitoring_lib/rust/tag.rs | Adds StateTag newtype for logic monitor state tagging and updates tag tests. |
| src/health_monitoring_lib/rust/supervisor_api_client/mod.rs | Introduces shared SupervisorAPIClient trait and feature-gated implementation modules. |
| src/health_monitoring_lib/rust/supervisor_api_client/stub_supervisor_api_client.rs | Adds stub supervisor client implementation. |
| src/health_monitoring_lib/rust/supervisor_api_client/score_supervisor_api_client.rs | Adds score supervisor client implementation (monitor_rs-based). |
| src/health_monitoring_lib/rust/logic/mod.rs | Adds logic monitor module exports and FFI module. |
| src/health_monitoring_lib/rust/logic/logic_monitor.rs | Implements LogicMonitor, builder, evaluation, and unit tests. |
| src/health_monitoring_lib/rust/logic/ffi.rs | Adds FFI API + tests for logic monitor builder/monitor operations. |
| src/health_monitoring_lib/rust/lib.rs | Integrates deadline/heartbeat/logic monitors into builder/health monitor, adds HealthMonitorError, and changes build()/start() to return Result. |
| src/health_monitoring_lib/rust/heartbeat/mod.rs | Adds heartbeat monitor module exports and FFI module. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_state.rs | Adds atomic packed heartbeat state representation + tests (incl. loom gating). |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_monitor.rs | Implements HeartbeatMonitor, builder validation, evaluation logic, and extensive tests (incl. loom). |
| src/health_monitoring_lib/rust/heartbeat/ffi.rs | Adds FFI API + tests for heartbeat monitor builder and heartbeat calls. |
| src/health_monitoring_lib/rust/ffi.rs | Extends core FFI surface to add heartbeat/logic monitor support and map HealthMonitorError to FFICode. |
| src/health_monitoring_lib/rust/deadline/mod.rs | Re-exports DeadlineEvaluationError for typed evaluation errors. |
| src/health_monitoring_lib/rust/deadline/deadline_monitor.rs | Refactors deadline monitor to use HasEvalHandle, typed evaluation errors, and updated evaluator signature. |
| src/health_monitoring_lib/cpp/tests/health_monitor_test.cpp | Extends C++ integration test to construct/get/use heartbeat and logic monitors. |
| src/health_monitoring_lib/cpp/logic_monitor.cpp | Adds C++ wrapper implementation for logic monitor builder/monitor. |
| src/health_monitoring_lib/cpp/include/score/hm/tag.h | Adds default Tag ctor and introduces StateTag. |
| src/health_monitoring_lib/cpp/include/score/hm/logic/logic_monitor.h | Adds C++ public API for logic monitor builder/monitor. |
| src/health_monitoring_lib/cpp/include/score/hm/heartbeat/heartbeat_monitor.h | Adds C++ public API for heartbeat monitor builder/monitor. |
| src/health_monitoring_lib/cpp/include/score/hm/health_monitor.h | Extends C++ health monitor API to add/get heartbeat and logic monitors. |
| src/health_monitoring_lib/cpp/heartbeat_monitor.cpp | Adds C++ wrapper implementation for heartbeat monitor builder/monitor. |
| src/health_monitoring_lib/cpp/health_monitor.cpp | Extends C++ health monitor wrapper to add builder/get APIs for heartbeat + logic monitors. |
| src/health_monitoring_lib/Cargo.toml | Makes monitor_rs optional, adds loom target dep, and updates feature defaults. |
| src/health_monitoring_lib/BUILD | Adds new C++ sources/headers; adjusts Rust crate feature flags for Bazel targets/tests. |
| examples/rust_supervised_app/src/main.rs | Updates example to handle new Result returns for build() and start(). |
| Cargo.toml | Sets workspace default-members and configures unexpected_cfgs for cfg(loom). |
| Cargo.lock | Updates lockfile with new dependencies (loom + transitive crates). |
Comments suppressed due to low confidence (1)
src/health_monitoring_lib/cpp/include/score/hm/tag.h:39
Tagstoresdata_asconst char* const, which makes the pointer itself immutable after construction. Any FFI API that writes aStateTag/Taginto an out-parameter (e.g., aStateTag*passed to Rust) would end up mutating aconstsubobject, which is undefined behavior in C++. If tags are intended to be written/returned via out-params, change the field toconst char* data_(pointer-to-const data, but not const pointer) so the struct remains trivially writable.
/// Common string-based tag.
class Tag
{
public:
/// Create an empty tag.
Tag() : data_{nullptr}, length_{0} {}
/// Create a new tag from a C-style string.
template <size_t N>
explicit Tag(const char (&tag)[N]) : data_(tag), length_(N - 1)
{
}
private:
/// SAFETY: This has to be FFI compatible with the Rust side representation.
const char* const data_;
size_t length_;
};
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #[no_mangle] | ||
| pub extern "C" fn heartbeat_monitor_builder_create( | ||
| range_min_ms: u32, | ||
| range_max_ms: u32, | ||
| heartbeat_monitor_builder_handle_out: *mut FFIHandle, | ||
| ) -> FFICode { | ||
| if heartbeat_monitor_builder_handle_out.is_null() { | ||
| return FFICode::NullParameter; | ||
| } | ||
|
|
||
| let range_min = Duration::from_millis(range_min_ms as u64); | ||
| let range_max = Duration::from_millis(range_max_ms as u64); | ||
| let range = TimeRange::new(range_min, range_max); | ||
|
|
There was a problem hiding this comment.
heartbeat_monitor_builder_create constructs TimeRange::new(range_min, range_max), which asserts min <= max and will panic across the FFI boundary if the caller passes an invalid range. For consistency with deadline_monitor_builder_add_deadline, validate range_min_ms <= range_max_ms here and return FFICode::InvalidArgument instead of panicking.
| FFICode logic_monitor_destroy(FFIHandle logic_monitor_handle); | ||
| FFICode logic_monitor_transition(FFIHandle logic_monitor_handle, const StateTag* to_state); | ||
| FFICode logic_monitor_state(FFIHandle logic_monitor_handle, StateTag* state_out); |
There was a problem hiding this comment.
The declared FFI signature for logic_monitor_state uses StateTag* state_out, but the Rust implementation currently expects a pointer-to-pointer (*mut *const StateTag) and writes a pointer value. This mismatch will lead to memory corruption at runtime. Align the C++ declaration with the Rust API (or, preferably, update Rust to take StateTag* and write the value in-place).
| // Check cycle values. | ||
| // `supervisor_api_cycle` must be a multiple of `internal_processing_cycle`. | ||
| let supervisor_api_cycle_ms = self.supervisor_api_cycle.as_millis() as u64; | ||
| let internal_processing_cycle_ms = self.internal_processing_cycle.as_millis() as u64; |
There was a problem hiding this comment.
internal_processing_cycle_ms can become 0 for sub-millisecond durations (or if explicitly set to 0). Calling is_multiple_of(0) will panic (division by zero), which defeats the goal of returning HealthMonitorError::InvalidArgument. Add an explicit check that internal_processing_cycle_ms != 0 (and likely supervisor_api_cycle_ms != 0) before the multiple-of test, or perform the check using a non-truncating unit (e.g., nanos) to avoid 0ms truncation.
| let internal_processing_cycle_ms = self.internal_processing_cycle.as_millis() as u64; | |
| let internal_processing_cycle_ms = self.internal_processing_cycle.as_millis() as u64; | |
| // Guard against zero or sub-millisecond durations that truncate to 0 ms. | |
| if internal_processing_cycle_ms == 0 || supervisor_api_cycle_ms == 0 { | |
| error!( | |
| "Supervisor API cycle duration ({} ms) and internal processing cycle interval ({} ms) must both be non-zero.", | |
| supervisor_api_cycle_ms, internal_processing_cycle_ms | |
| ); | |
| return Err(HealthMonitorError::InvalidArgument); | |
| } |
| let internal_processing_cycle_ms = internal_processing_cycle.as_millis() as u64; | ||
| if range_min_ms * 2 <= internal_processing_cycle_ms { | ||
| error!( | ||
| "Internal processing cycle duration ({} ms) must be longer than two shortest allowed ranges ({} ms).", |
There was a problem hiding this comment.
The validation condition checks range_min_ms * 2 <= internal_processing_cycle_ms (i.e., rejects when the internal cycle is too large), but the error message says the internal processing cycle "must be longer" than the shortest ranges. The message should describe the actual constraint being enforced (e.g., that the internal cycle must be sufficiently short relative to the minimum allowed heartbeat interval).
| "Internal processing cycle duration ({} ms) must be longer than two shortest allowed ranges ({} ms).", | |
| "Internal processing cycle duration ({} ms) must be shorter than twice the shortest allowed heartbeat interval (shortest: {} ms).", |
| fn collect_given_monitors<Monitor>( | ||
| monitors_to_collect: &mut HashMap<MonitorTag, MonitorContainer<Monitor>>, | ||
| collected_monitors: &mut FixedCapacityVec<MonitorEvalHandle>, | ||
| ) -> Result<(), HealthMonitorError> { | ||
| for (tag, monitor) in monitors_to_collect.iter_mut() { | ||
| match monitor.take() { | ||
| Some(DeadlineMonitorState::Taken(handle)) => { | ||
| if monitors.push(handle).is_err() { | ||
| // Should not fail since we preallocated enough capacity | ||
| return Err("Failed to push monitor handle".to_string()); | ||
| Some(MonitorState::Taken(handle)) => { | ||
| if collected_monitors.push(handle).is_err() { | ||
| // Should not fail - capacity was preallocated. | ||
| error!("Failed to push monitor handle."); | ||
| return Err(HealthMonitorError::WrongState); | ||
| } | ||
| }, | ||
| Some(DeadlineMonitorState::Available(_)) => { | ||
| return Err(format!( | ||
| Some(MonitorState::Available(_)) => { | ||
| error!( | ||
| "All monitors must be taken before starting HealthMonitor but {:?} is not taken.", | ||
| tag | ||
| )); | ||
| ); | ||
| return Err(HealthMonitorError::WrongState); | ||
| }, |
There was a problem hiding this comment.
collect_given_monitors calls monitor.take() and does not restore the entry on error paths (e.g., when it encounters an Available(_) monitor). This mutates the HealthMonitor into a broken state after a failed start() attempt: monitors that were Available or Taken can be replaced by None, preventing a caller from taking the missing monitor and retrying start(). Consider validating without consuming (e.g., inspect as_ref() first) or collecting into a temporary list and restoring the original map state before returning an error.
| pub extern "C" fn logic_monitor_state(logic_monitor_handle: FFIHandle, state_out: *mut *const StateTag) -> FFICode { | ||
| if logic_monitor_handle.is_null() || state_out.is_null() { | ||
| return FFICode::NullParameter; | ||
| } | ||
|
|
||
| // SAFETY: | ||
| // Validity of this pointer is ensured. | ||
| // It is assumed that the pointer was created by a call to `health_monitor_get_logic_monitor`. | ||
| // It is assumed that the pointer was not consumed by a call to `logic_monitor_destroy`. | ||
| let monitor = FFIBorrowed::new(unsafe { Box::from_raw(logic_monitor_handle as *mut LogicMonitor) }); | ||
|
|
||
| match monitor.state() { | ||
| Ok(state) => { | ||
| unsafe { | ||
| *state_out = Box::into_raw(Box::new(state)); | ||
| } | ||
| FFICode::Success |
There was a problem hiding this comment.
logic_monitor_state uses an out-parameter type *mut *const StateTag and writes a heap-allocated StateTag pointer into it. This does not match the C++ wrapper, which passes StateTag* (a struct) and expects the state to be written in-place; with the current ABI this will corrupt the StateTag object. Prefer state_out: *mut StateTag and write the returned state directly into *state_out (no heap allocation), or alternatively keep the pointer-to-pointer API but then update the C++ declaration and add a corresponding destroy function for the allocated StateTag.
| unsafe impl Send for ScoreSupervisorAPIClient {} // Just assuming it's safe to send across threads, this is a temporary solution | ||
|
|
There was a problem hiding this comment.
unsafe impl Send for ScoreSupervisorAPIClient is justified only with a comment (“Just assuming it's safe…”). If monitor_rs::Monitor<Checks> is not actually Send, this is undefined behavior when the client is moved to the monitoring thread. Please remove the unsafe impl Send and instead require/provide a Send-safe supervisor link type, or wrap the non-Send parts behind a thread-safe mechanism proven sound.
| unsafe impl Send for ScoreSupervisorAPIClient {} // Just assuming it's safe to send across threads, this is a temporary solution |
| impl ScoreSupervisorAPIClient { | ||
| pub fn new() -> Self { | ||
| let value = std::env::var("IDENTIFIER").expect("IDENTIFIER env not set"); | ||
| debug!("ScoreSupervisorAPIClient: Creating with IDENTIFIER={}", value); | ||
| // This is only temporary usage so unwrap is fine here. | ||
| let supervisor_link = monitor_rs::Monitor::<Checks>::new(&value).expect("Failed to create supervisor_link"); | ||
| Self { supervisor_link } |
There was a problem hiding this comment.
ScoreSupervisorAPIClient::new() panics on missing IDENTIFIER (via expect) and on monitor_rs::Monitor::new failure (via expect). Since HealthMonitor::start() calls this constructor directly in non-test builds, this can crash production processes instead of returning an error. Consider returning Result<ScoreSupervisorAPIClient, HealthMonitorError> (or mapping to FFICode in FFI) and propagating the failure through HealthMonitor::start() / HealthMonitorBuilder::build().
| impl ScoreSupervisorAPIClient { | |
| pub fn new() -> Self { | |
| let value = std::env::var("IDENTIFIER").expect("IDENTIFIER env not set"); | |
| debug!("ScoreSupervisorAPIClient: Creating with IDENTIFIER={}", value); | |
| // This is only temporary usage so unwrap is fine here. | |
| let supervisor_link = monitor_rs::Monitor::<Checks>::new(&value).expect("Failed to create supervisor_link"); | |
| Self { supervisor_link } | |
| #[derive(Debug)] | |
| pub enum ScoreSupervisorAPIClientError { | |
| MissingIdentifier(std::env::VarError), | |
| MonitorInitFailed, | |
| } | |
| impl ScoreSupervisorAPIClient { | |
| pub fn new() -> Result<Self, ScoreSupervisorAPIClientError> { | |
| let value = std::env::var("IDENTIFIER") | |
| .map_err(ScoreSupervisorAPIClientError::MissingIdentifier)?; | |
| debug!("ScoreSupervisorAPIClient: Creating with IDENTIFIER={}", value); | |
| let supervisor_link = monitor_rs::Monitor::<Checks>::new(&value) | |
| .map_err(|_| ScoreSupervisorAPIClientError::MonitorInitFailed)?; | |
| Ok(Self { supervisor_link }) |
Add heartbeat monitor HMON.
Add C++ interface to heartbeat monitor.
Add logic monitor to HMON.
Add C++ interface to logic monitor.
7d6edc7 to
1df795e
Compare
Add C++ interface to logic monitor.
Resolves #16