Closed
Labels: verified (All test cases were verified successfully)
Description
The current node-monitor alarm logic is based on the SwapFree/SwapTotal ratio. However, after a high memory usage peak the ratio remains high and the alarm is not reset.
Improve the monitoring tool's ability to detect high Linux swap usage by adding a secondary factor to the existing SwapFree/SwapTotal ratio check. The enhancement should provide better insight into memory pressure during and after swap peaks.
Requirements:
- Current Metric:
  - Continue using the /proc/meminfo SwapFree/SwapTotal ratio to monitor overall swap utilization.
- New Metric:
  - Introduce monitoring of /proc/vmstat for:
    - pswpin: Number of pages swapped into memory.
    - pswpout: Number of pages swapped out of memory.
  - Implement tracking of the rate of change in these counters over a configurable interval (e.g., every 10 seconds).
- Thresholds:
  - Add configurable thresholds for both pswpin and pswpout rates to trigger alerts when they exceed a certain limit (indicating sustained memory pressure).
- Alerting Logic:
  - Trigger an alert if:
    - The SwapFree/SwapTotal ratio rises above the defined threshold and
    - A high rate of pswpin or pswpout events is detected over a defined period (indicating active swap use despite acceptable swap levels).
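
The requirements above could be sketched roughly as follows. This is a minimal illustration in Python, not the actual node-monitor implementation. All function names and the default thresholds (`ratio_max`, `in_max`, `out_max`) are hypothetical. It also assumes that "ratio above the threshold" means the used fraction of swap, computed as 1 - SwapFree/SwapTotal.

```python
import time

def parse_kv(text):
    """Parse 'Key: 123 kB' (/proc/meminfo) or 'key 123' (/proc/vmstat) lines."""
    out = {}
    for line in text.splitlines():
        parts = line.replace(":", " ").split()
        if len(parts) >= 2 and parts[1].isdigit():
            out[parts[0]] = int(parts[1])
    return out

def read_proc(path):
    with open(path) as f:
        return parse_kv(f.read())

def swap_used_ratio(meminfo):
    """Used fraction of swap (assumption: 1 - SwapFree/SwapTotal)."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 1.0 - meminfo.get("SwapFree", total) / total

def page_rates(prev, curr, seconds):
    """Pages per second swapped in and out between two vmstat samples."""
    # Clamp at zero in case a counter went backwards between samples.
    return tuple(
        max(0, curr[k] - prev[k]) / seconds for k in ("pswpin", "pswpout")
    )

def should_alert(used_ratio, rate_in, rate_out,
                 ratio_max=0.8, in_max=100.0, out_max=100.0):
    """Alert only when swap is mostly used AND pages are actively moving."""
    # Thresholds are hypothetical defaults; the issue asks for them
    # to be configurable.
    return used_ratio > ratio_max and (rate_in > in_max or rate_out > out_max)

def check_once(interval=10.0):
    """Take two vmstat samples `interval` seconds apart and evaluate the alert."""
    prev = read_proc("/proc/vmstat")
    time.sleep(interval)
    curr = read_proc("/proc/vmstat")
    rate_in, rate_out = page_rates(prev, curr, interval)
    ratio = swap_used_ratio(read_proc("/proc/meminfo"))
    return should_alert(ratio, rate_in, rate_out)
```

With this combination, a node whose swap stays full after a past peak but shows no pswpin/pswpout activity would not keep the alarm raised, which is the behavior the description asks for.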
Discussion @nrauso https://mattermost.nethesis.it/nethesis/pl/5kgxh85pep8atgjd1ppce64ikr