Conversation
- Fix Cargo for `health_monitoring_lib`. - Fix leftover test changes from previous PR. - Small code changes.
- Less restrictive feature flags. - Additional small code changes.
- Remove unnecessary abstraction from monitors. - Improve docs. - Move `FFIBorrowed` and `FFIHandle` to main `ffi` module. - Small esthetics fixes.
- Restore some parts. - Separate errors into groups.
- Add `HealthMonitorError`. - Not fully utilized in this change, required for future monitors. - Simplify implementation of `start` and `build`. - No longer must be reimplemented for FFI. - New unit tests for `lib.rs`.
Add heartbeat monitor HMON.
Add C++ interface to heartbeat monitor.
Add logic monitor to HMON.
License Check Results🚀 The license check job ran with the Bazel command: bazel run //:license-checkStatus: Click to expand output |
There was a problem hiding this comment.
Pull request overview
This PR extends the health monitoring library with new monitor types (heartbeat + logic) and refactors the worker/supervisor client plumbing so the worker can evaluate multiple monitor kinds and report typed evaluation errors across Rust + C++/FFI.
Changes:
- Add new
HeartbeatMonitor(state + evaluator + FFI + C++ wrapper) andLogicMonitor(state-transition validator). - Refactor monitor evaluation to pass an HMON starting
Instantand to report typedMonitorEvaluationErrorvariants (deadline/heartbeat/logic). - Move supervisor API client implementations into a feature-gated
supervisor_api_client/module and adjust build config defaults/features.
Reviewed changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/health_monitoring_lib/rust/worker.rs | Updates monitoring loop to pass a shared starting Instant and log typed monitor errors; removes in-file supervisor client impls. |
| src/health_monitoring_lib/rust/tag.rs | Adds StateTag newtype for logic monitor states; updates tag tests. |
| src/health_monitoring_lib/rust/supervisor_api_client/stub_supervisor_api_client.rs | Adds stub supervisor client implementation behind feature gate. |
| src/health_monitoring_lib/rust/supervisor_api_client/score_supervisor_api_client.rs | Adds SCORE supervisor client implementation (monitor_rs-backed). |
| src/health_monitoring_lib/rust/supervisor_api_client/mod.rs | Introduces SupervisorAPIClient trait and feature-gated implementations. |
| src/health_monitoring_lib/rust/logic/mod.rs | Adds logic module exports. |
| src/health_monitoring_lib/rust/logic/logic_monitor.rs | Implements logic monitor state machine + evaluator + tests. |
| src/health_monitoring_lib/rust/lib.rs | Integrates deadline/heartbeat/logic monitors into HealthMonitorBuilder + HealthMonitor; changes build/start to return Result. |
| src/health_monitoring_lib/rust/heartbeat/mod.rs | Adds heartbeat module wiring and FFI submodule. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_state.rs | Adds compact atomic heartbeat state representation (+ loom-aware atomics). |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_monitor.rs | Adds heartbeat monitor implementation + builder validation + tests. |
| src/health_monitoring_lib/rust/heartbeat/ffi.rs | Adds FFI for heartbeat monitor builder/monitor and tests. |
| src/health_monitoring_lib/rust/ffi.rs | Extends core FFI to build/start using Result, adds heartbeat builder/get FFI, and maps HealthMonitorError to FFICode. |
| src/health_monitoring_lib/rust/deadline/mod.rs | Re-exports DeadlineEvaluationError for typed evaluation errors. |
| src/health_monitoring_lib/rust/deadline/deadline_monitor.rs | Refactors deadline monitor to use typed DeadlineEvaluationError and updated evaluator signature. |
| src/health_monitoring_lib/rust/common.rs | Introduces HasEvalHandle, typed MonitorEvaluationError, shared evaluation signature, and time conversion helpers. |
| src/health_monitoring_lib/cpp/tests/health_monitor_test.cpp | Extends C++ test to build/get/use heartbeat monitor. |
| src/health_monitoring_lib/cpp/include/score/hm/heartbeat/heartbeat_monitor.h | Adds C++ API surface for heartbeat monitor + builder. |
| src/health_monitoring_lib/cpp/include/score/hm/health_monitor.h | Adds heartbeat monitor APIs to C++ HealthMonitor(Builder). |
| src/health_monitoring_lib/cpp/heartbeat_monitor.cpp | Implements C++ heartbeat monitor wrapper calling Rust FFI. |
| src/health_monitoring_lib/cpp/health_monitor.cpp | Wires heartbeat builder/get calls through Rust FFI. |
| src/health_monitoring_lib/Cargo.toml | Adds features (default stub; optional score uses monitor_rs) and loom dependency config. |
| src/health_monitoring_lib/BUILD | Adds heartbeat C++ sources/headers and adjusts crate feature flags for Bazel targets. |
| examples/rust_supervised_app/src/main.rs | Updates example to handle build()/start() returning Result. |
| Cargo.toml | Sets workspace default-members and adds cfg(loom) lint configuration. |
| Cargo.lock | Adds new dependency entries (loom + transitive deps). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let internal_processing_cycle_ms = internal_processing_cycle.as_millis() as u64; | ||
| if range_min_ms * 2 <= internal_processing_cycle_ms { | ||
| error!( | ||
| "Internal processing cycle duration ({} ms) must be longer than two shortest allowed ranges ({} ms).", |
There was a problem hiding this comment.
The validation condition rejects large internal processing cycles, but the logged error message says the internal processing cycle "must be longer" than the allowed ranges. This is contradictory to the check and will confuse users when diagnosing invalid configurations. Please update the message (or the check) so they describe the same rule (e.g., that the internal processing cycle must be sufficiently short relative to the minimum heartbeat range).
| "Internal processing cycle duration ({} ms) must be longer than two shortest allowed ranges ({} ms).", | |
| "Internal processing cycle duration ({} ms) must be shorter than twice the shortest allowed range (2 × {} ms).", |
| impl InternalRange { | ||
| /// Create range using provided values. | ||
| fn new(min: u32, max: u32) -> Self { | ||
| assert!(min <= max, "provided min is greater than provided max"); |
There was a problem hiding this comment.
InternalRange::new uses an assert!(min <= max, ...), which can panic from user-provided TimeRange values (including via FFI). Since the public APIs in this crate are moving toward returning Result/error codes instead of panicking, consider validating the range and returning HealthMonitorError::InvalidArgument rather than asserting.
| assert!(min <= max, "provided min is greater than provided max"); | |
| if min > max { | |
| error!( | |
| "InternalRange::new called with invalid range: min ({}) > max ({})", | |
| min, max | |
| ); | |
| // Fall back to a zero-length range at `min` to avoid panicking on invalid input. | |
| return Self { min, max: min }; | |
| } |
| match value { | ||
| value if value == LogicEvaluationError::InvalidState as u8 => LogicEvaluationError::InvalidState, | ||
| value if value == LogicEvaluationError::InvalidTransition as u8 => LogicEvaluationError::InvalidTransition, | ||
| _ => panic!("Invalid underlying value of logic evaluation error."), |
There was a problem hiding this comment.
impl From<u8> for LogicEvaluationError panics on unknown values. Since this conversion is used when interpreting AtomicU8 state during evaluate, an unexpected value would crash the monitoring thread/process instead of reporting an error. Please make this conversion non-panicking (e.g., map unknown values to InvalidState or return Option/Result) and handle that path in evaluate.
| _ => panic!("Invalid underlying value of logic evaluation error."), | |
| _ => LogicEvaluationError::InvalidState, |
| unsafe impl Send for ScoreSupervisorAPIClient {} // Just assuming it's safe to send across threads, this is a temporary solution | ||
|
|
There was a problem hiding this comment.
unsafe impl Send for ScoreSupervisorAPIClient is currently justified only by a comment (“just assuming it's safe”). If monitor_rs::Monitor<Checks> is not actually thread-safe, this is unsound and can lead to undefined behavior when the client is moved to the worker thread. Please either remove the unsafe impl Send (and adjust the design) or document and enforce the invariants that make it sound (e.g., prove monitor_rs::Monitor<Checks> is Send/thread-safe or wrap access behind synchronization).
| unsafe impl Send for ScoreSupervisorAPIClient {} // Just assuming it's safe to send across threads, this is a temporary solution |
| // Should never occur if created by `HealthMonitorBuilder`! | ||
| let num_monitors = self.deadline_monitors.len() + self.heartbeat_monitors.len() + self.logic_monitors.len(); | ||
| if num_monitors == 0 { | ||
| error!("No monitors have been added. HealthMonitor cannot be created."); |
There was a problem hiding this comment.
This log message is emitted from HealthMonitor::start, but says the monitor "cannot be created". Consider changing it to "cannot be started" (or similar) so error logs reflect the operation that failed.
| error!("No monitors have been added. HealthMonitor cannot be created."); | |
| error!("No monitors have been added. HealthMonitor cannot be started."); |
| assert!(value < 1 << 29, "provided heartbeat offset is out of range"); | ||
| self.0 = ((value as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK); |
There was a problem hiding this comment.
set_heartbeat_timestamp_offset uses an assert! for the 29-bit limit, which will panic if a heartbeat arrives after a long delay (e.g., if now - start_ts exceeds ~6 days in ms). Since this can be triggered by runtime conditions, consider handling overflow gracefully (e.g., clamp/saturate, reset cycle state, or return/report an error) rather than panicking.
| assert!(value < 1 << 29, "provided heartbeat offset is out of range"); | |
| self.0 = ((value as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK); | |
| let clamped = min(value, (1 << 29) - 1); | |
| self.0 = ((clamped as u64) << BEAT_OFFSET) | (self.0 & !BEAT_MASK); |
|
The created documentation from the pull request is available at: docu-html |
Add logic monitor to HMON.
Resolved #96