hmon: approach to monitors rework#81
Conversation
License Check Results🚀 The license check job ran with the Bazel command: bazel run //:license-checkStatus: Click to expand output |
|
The created documentation from the pull request is available at: docu-html |
fda115c to
9e958f1
Compare
9e958f1 to
88e93ee
Compare
88e93ee to
f08a948
Compare
There was a problem hiding this comment.
Pull request overview
This pull request refactors the health monitoring library to remove unnecessary abstractions, improve type safety, and better organize the codebase. The changes introduce separate MonitorTag and DeadlineTag types to replace the generic IdentTag, eliminate the MonitorEvaluator trait in favor of direct usage of Arc<DeadlineMonitorInner>, and reorganize the supervisor API client code into a dedicated module with proper feature flags.
Changes:
- Introduced
MonitorTagandDeadlineTagtypes for better type safety and clarity - Removed
MonitorEvaluatortrait abstraction andMonitorEvalHandlewrapper - Moved supervisor API client implementations to dedicated module with mutually exclusive feature flags
- Improved documentation throughout the codebase
- Fixed Cargo build configuration with proper feature flag management
Reviewed changes
Copilot reviewed 26 out of 28 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/health_monitoring_lib/rust/tag.rs | New module implementing MonitorTag and DeadlineTag with FFI-safe memory management |
| src/health_monitoring_lib/rust/worker.rs | Updated to use DeadlineMonitorInner directly and import SupervisorAPIClient from new module |
| src/health_monitoring_lib/rust/supervisor_api_client/*.rs | New module structure with feature-gated implementations |
| src/health_monitoring_lib/rust/lib.rs | Refactored monitor state management using MonitorState enum |
| src/health_monitoring_lib/rust/ffi.rs | Updated FFI functions to use new tag types |
| src/health_monitoring_lib/rust/deadline/*.rs | Updated to use new tag types and expose DeadlineMonitorInner |
| src/health_monitoring_lib/rust/common.rs | Removed IdentTag, MonitorEvaluator, and related types |
| src/health_monitoring_lib/cpp/include/score/hm/tag.h | New C++ tag types for FFI compatibility |
| src/health_monitoring_lib/cpp/include/score/hm/*.h | Updated to use new tag types |
| src/health_monitoring_lib/cpp/*.cpp | Updated implementations to use new tag types |
| src/health_monitoring_lib/Cargo.toml | Added feature flags for supervisor API client selection |
| Cargo.toml | Set default-members and removed global monitor_rs dependency |
| src/health_monitoring_lib/BUILD | Added crate_features configuration |
| .github/workflows/lint_clippy.yml | Updated to test both feature flag configurations |
| examples/* | Updated to use new tag types |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f08a948 to
df8135d
Compare
df8135d to
08e6e6d
Compare
There was a problem hiding this comment.
This change was meant to be placed in other PR.
08e6e6d to
6afc701
Compare
- Fix Cargo for `health_monitoring_lib`. - Fix leftover test changes from previous PR. - Small code changes.
6afc701 to
94920a7
Compare
94920a7 to
a41b786
Compare
- Less restrictive feature flags. - Additional small code changes.
- Remove unnecessary abstraction from monitors. - Improve docs. - Move `FFIBorrowed` and `FFIHandle` to main `ffi` module. - Small esthetics fixes.
- Restore some parts. - Separate errors into groups.
a41b786 to
c85301c
Compare
- Add `HealthMonitorError`. - Not fully utilized in this change, required for future monitors. - Simplify implementation of `start` and `build`. - No longer must be reimplemented for FFI. - New unit tests for `lib.rs`.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let monitoring_logic = worker::MonitoringLogic::new( | ||
| monitors, | ||
| collected_monitors, | ||
| self.supervisor_api_cycle, | ||
| // Currently only `ScoreSupervisorAPIClient` and `StubSupervisorAPIClient` are supported. | ||
| // The later is meant to be used for testing purposes. | ||
| #[cfg(not(any(test, feature = "stub_supervisor_api_client")))] | ||
| worker::ScoreSupervisorAPIClient::new(), | ||
| #[cfg(any(test, feature = "stub_supervisor_api_client"))] | ||
| worker::StubSupervisorAPIClient {}, | ||
| #[cfg(all(not(test), feature = "score_supervisor_api_client"))] | ||
| supervisor_api_client::score_supervisor_api_client::ScoreSupervisorAPIClient::new(), | ||
| #[cfg(any( | ||
| test, | ||
| all(feature = "stub_supervisor_api_client", not(feature = "score_supervisor_api_client")) | ||
| ))] | ||
| supervisor_api_client::stub_supervisor_api_client::StubSupervisorAPIClient::new(), | ||
| ); |
There was a problem hiding this comment.
The conditional compilation logic has several edge cases that would cause compilation failures. When only score_supervisor_api_client feature is enabled (without stub), test builds would fail because neither cfg condition would match (line 259 requires not(test), line 262-263 requires test OR stub without score). Similarly, when neither feature is enabled, all builds would fail. Consider adding a compile_error! to explicitly handle unsupported feature combinations, or ensure the cfg conditions cover all valid scenarios, particularly test builds when only score_supervisor_api_client is enabled.
| // Should never occur if created by `HealthMonitorBuilder`! | ||
| let num_monitors = self.deadline_monitors.len(); | ||
| if num_monitors == 0 { | ||
| error!("No monitors have been added. HealthMonitor cannot be created."); |
There was a problem hiding this comment.
The error message says "HealthMonitor cannot be created" but this error occurs in the start() method, not during creation. The HealthMonitor was already created by the builder. Consider changing the message to "HealthMonitor cannot be started without monitors" or similar to accurately reflect the operation that is failing.
| error!("No monitors have been added. HealthMonitor cannot be created."); | |
| error!("No monitors have been added. HealthMonitor cannot be started without monitors."); |
| /// This method shall be called before `Lifecycle.running()`. | ||
| /// Otherwise the supervisor might consider the process not alive. | ||
| /// | ||
| /// Health monitoring logic stop when the [`HealthMonitor`] is dropped. |
There was a problem hiding this comment.
Documentation says "Health monitoring logic stop when the [HealthMonitor] is dropped" but this is grammatically incorrect. Consider changing to "Health monitoring logic stops when the [HealthMonitor] is dropped" to fix the verb conjugation.
| /// Health monitoring logic stop when the [`HealthMonitor`] is dropped. | |
| /// Health monitoring logic stops when the [`HealthMonitor`] is dropped. |
| MonitorEvaluationError::Heartbeat => unimplemented!(), | ||
| MonitorEvaluationError::Logic => unimplemented!(), |
There was a problem hiding this comment.
The MonitorEvaluationError::Heartbeat and MonitorEvaluationError::Logic variants call unimplemented!() which will panic if ever reached. While these appear to be placeholders for future monitor types, consider using compile-time feature flags or removing these variants until they are actually implemented to avoid unexpected runtime panics. Alternatively, add clear documentation indicating these are not yet supported and should not be used.
| MonitorEvaluationError::Heartbeat => unimplemented!(), | |
| MonitorEvaluationError::Logic => unimplemented!(), | |
| MonitorEvaluationError::Heartbeat => { | |
| warn!( | |
| "Heartbeat monitor with tag {:?} reported an error.", | |
| monitor_tag | |
| ) | |
| }, | |
| MonitorEvaluationError::Logic => { | |
| warn!( | |
| "Logic monitor with tag {:?} reported an error.", | |
| monitor_tag | |
| ) | |
| }, |
| fn notify_alive(&self); | ||
| } | ||
|
|
||
| // NOTE: various implementations are not mutually exclusive. |
There was a problem hiding this comment.
The comment states that various implementations are not mutually exclusive, but the feature flag configuration in lib.rs lines 261-265 suggests they should be. When both features are enabled, the code uses score_supervisor_api_client in production and stub_supervisor_api_client in tests, which means they are used in different contexts. Consider clarifying this comment to explain that while both features can be enabled, only one implementation is used at runtime based on conditional compilation.
| // NOTE: various implementations are not mutually exclusive. | |
| // NOTE: multiple implementations can be enabled via features at the same time, | |
| // but conditional compilation ensures that only one is used in a given build | |
| // context (e.g., production vs tests), as configured in lib.rs. |
Uh oh!
There was an error while loading. Please reload this page.