NMS-19403: Add device specific metrics for Traps #8271

Open
cgorantla wants to merge 10 commits into foundation-2024 from cg/jira/NMS-19403

@cgorantla
Contributor

This adds device-specific (location:ipAddress) metrics for trap processing.
These metrics are not enabled by default.
They must be enabled with a specific system property on OpenNMS and through OSGi configuration on Minion.
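
As a rough illustration of an opt-in switch driven by a system property, here is a minimal sketch. The property name `example.trapd.deviceMetrics.enabled` is hypothetical and is not the name used by this PR; the class name is also made up for the example.

```java
final class DeviceMetricsToggle {
    // Hypothetical property name, for illustration only (not taken from the PR).
    static final String PROPERTY = "example.trapd.deviceMetrics.enabled";

    // Boolean.getBoolean returns false when the property is unset,
    // so the feature stays off unless it is explicitly enabled.
    static boolean isEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("device metrics enabled: " + isEnabled());
    }
}
```

With this pattern the JVM would be started with `-Dexample.trapd.deviceMetrics.enabled=true` to turn the feature on.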

@cgorantla cgorantla requested review from christianpape, Copilot and indigo423 and removed request for indigo423 February 2, 2026 23:46
Contributor

Copilot AI left a comment

Pull request overview

This PR adds device-specific metrics for SNMP trap processing, allowing operators to track trap processing statistics per device (identified by location and IP address). The feature is disabled by default and can be enabled via system property on OpenNMS or OSGi configuration on Minion.

Changes:

  • Added per-device trap metrics tracked by location and IP address
  • Removed deprecated trapsDispatched metric from the codebase
  • Added new JMX MBean interfaces and implementations for device-level metrics
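
To make the idea of per-device tracking concrete, here is a minimal sketch of counters keyed by location and IP address. It is not the PR's `DeviceTrapMetricsRegistry`; the class and method names are hypothetical, and JMX registration is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class PerDeviceCounters {
    // One counter per device; the key combines location and IP because
    // the same IP address can exist in more than one location.
    private final Map<String, LongAdder> received = new ConcurrentHashMap<>();

    private static String key(String location, String ip) {
        return location + ":" + ip;
    }

    // Creates the counter on first use, then increments it.
    void markReceived(String location, String ip) {
        received.computeIfAbsent(key(location, ip), k -> new LongAdder()).increment();
    }

    // Returns 0 for devices that have not sent any traps yet.
    long receivedCount(String location, String ip) {
        LongAdder adder = received.get(key(location, ip));
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        PerDeviceCounters counters = new PerDeviceCounters();
        counters.markReceived("Default", "10.42.2.1");
        counters.markReceived("ipc-kafka", "10.42.2.1");
        System.out.println(counters.receivedCount("Default", "10.42.2.1"));
    }
}
```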

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| opennms-container/minion/container-fs/prom-jmx-default-config.yaml | Adds JMX exporter pattern for device-specific trap metrics |
| opennms-container/minion/container-fs/confd/templates/prom-jmx-exporter.yaml.tmpl | Adds template configuration for device trap metrics in Minion |
| opennms-container/core/container-fs/confd/templates/prom-jmx-exporter.yaml.tmpl | Adds template configuration for device trap metrics in OpenNMS core |
| features/events/traps/src/main/resources/OSGI-INF/blueprint/blueprint-trapd-listener.xml | Adds OSGi configuration property and Identity reference for device metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/TrapdMBean.java | Removes deprecated getTrapsDispatched() method |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/TrapdInstrumentation.java | Adds device-level metrics tracking and registry management |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/Trapd.java | Removes getTrapsDispatched() implementation |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/DeviceTrapMetricsRegistry.java | New registry class for managing per-device JMX MBeans |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/DeviceTrapMetricsMBean.java | New MBean interface for listener-side device metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/DeviceTrapMetrics.java | Implementation of listener-side device metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/DeviceConsumerTrapMetricsMBean.java | New MBean interface for consumer-side device metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/jmx/DeviceConsumerTrapMetrics.java | Implementation of consumer-side device metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/TrapSinkConsumer.java | Updates trap processing to track device-specific metrics |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/TrapListenerMetrics.java | Adds device metrics registry and removes trapsDispatched counter |
| features/events/traps/src/main/java/org/opennms/netmgt/trapd/TrapListener.java | Updates trap reception to track device-specific metrics and removes dispatch counting |

Contributor

@christianpape christianpape left a comment

Looks good. The only issue is that maybe shutdown() will not work as expected in the case of IPv6 addresses.
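
The comment does not say why IPv6 could be a problem. One possible cause, assuming devices are keyed as a `location:ipAddress` string that is later split on `:`, is that IPv6 addresses contain colons themselves. The sketch below uses a hypothetical `DeviceKey` class (not from the PR) and splits at the first colon only, which in turn assumes that location names never contain a colon.

```java
final class DeviceKey {
    final String location;
    final String ip;

    DeviceKey(String location, String ip) {
        this.location = location;
        this.ip = ip;
    }

    // Builds the composite key in the location:ipAddress form.
    String format() {
        return location + ":" + ip;
    }

    // Splits at the FIRST colon only, so colons inside an IPv6 address survive.
    static DeviceKey parse(String key) {
        int sep = key.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("missing ':' in key: " + key);
        }
        return new DeviceKey(key.substring(0, sep), key.substring(sep + 1));
    }

    public static void main(String[] args) {
        String key = new DeviceKey("Default", "2001:db8::1").format();
        // A naive split on every colon yields more than two parts for IPv6.
        System.out.println("naive parts: " + key.split(":").length);
        DeviceKey parsed = DeviceKey.parse(key);
        System.out.println(parsed.location + " / " + parsed.ip);
    }
}
```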

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

christianpape previously approved these changes Feb 3, 2026
Contributor

@christianpape christianpape left a comment

LGTM.

Base automatically changed from cg/jira/NMS-19279 to foundation-2024 February 3, 2026 16:45
@cgorantla cgorantla dismissed christianpape’s stale review February 3, 2026 16:45

The base branch was changed.

@github-actions github-actions bot added the docs label Feb 3, 2026
@github-actions github-actions bot requested a review from indigo423 February 3, 2026 17:47
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

christianpape previously approved these changes Feb 4, 2026
Contributor

@christianpape christianpape left a comment

LGTM.

@indigo423
Member

indigo423 commented Feb 5, 2026

@cgorantla I've tested it with a few nodes. Here are the per-IP metrics from the Core:

# HELP org_opennms_netmgt_trapd_device_default_trapsdiscarded Attribute exposed for management org.opennms.netmgt.trapd.device:name=null,type=consumer,attribute=TrapsDiscarded
# TYPE org_opennms_netmgt_trapd_device_default_trapsdiscarded untyped
org_opennms_netmgt_trapd_device_default_trapsdiscarded{ip="\"10.42.2.1\"",type="consumer"} 0.0
# HELP org_opennms_netmgt_trapd_device_default_trapserrored Attribute exposed for management org.opennms.netmgt.trapd.device:name=null,type=consumer,attribute=TrapsErrored
# TYPE org_opennms_netmgt_trapd_device_default_trapserrored untyped
org_opennms_netmgt_trapd_device_default_trapserrored{ip="\"10.42.2.1\"",type="consumer"} 0.0
# HELP org_opennms_netmgt_trapd_device_default_trapsreceived Attribute exposed for management org.opennms.netmgt.trapd.device:name=null,type=consumer,attribute=TrapsReceived
# TYPE org_opennms_netmgt_trapd_device_default_trapsreceived untyped
org_opennms_netmgt_trapd_device_default_trapsreceived{ip="\"10.42.2.1\"",type="consumer"} 100.0

Metrics are exposed per IP address. We need to tweak the Prometheus configuration so that the location attribute is exposed as well. An IP address can exist in many locations, so it isn't unique on its own. The JMX export in Hawtio shows the location, so this looks like something we need to fix in the JMX Prometheus Exporter template.

Screenshot 2026-02-05 at 13 08 04

To make it work in Prometheus we need the location as a label like this:

org_opennms_netmgt_trapd_device_trapsreceived{location="Default", ip="10.42.2.1",type="consumer"} 100.0

The escaped quotes (\") in the ip label might cause issues as well.
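
One plausible source of the `\"` in the label, not confirmed in this thread, is a quoted key value in the JMX `ObjectName`: values containing characters such as `:` must be quoted with `ObjectName.quote`, and `getKeyProperty` returns them with the quotes still attached. The sketch below shows that behaviour with the standard `javax.management` API. The domain is taken from the HELP lines above; the key names `location` and `ip` and the class name are assumptions.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class QuotedKeyDemo {
    // Builds a hypothetical per-device ObjectName; the key names are assumptions.
    static ObjectName deviceName(String location, String ip) {
        try {
            return new ObjectName("org.opennms.netmgt.trapd.device:type=consumer"
                    + ",location=" + ObjectName.quote(location)
                    + ",ip=" + ObjectName.quote(ip));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // The stored value keeps its surrounding double quotes.
    static String rawIp(ObjectName name) {
        return name.getKeyProperty("ip");
    }

    // ObjectName.unquote reverses ObjectName.quote and yields a clean label value.
    static String labelIp(ObjectName name) {
        return ObjectName.unquote(name.getKeyProperty("ip"));
    }

    public static void main(String[] args) {
        ObjectName name = deviceName("Default", "10.42.2.1");
        System.out.println(rawIp(name) + " -> " + labelIp(name));
    }
}
```

If the quotes do come from the `ObjectName`, they could be stripped either in code before the value is exported or in the exporter rule that extracts the label.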

@indigo423
Member

Screenshot 2026-02-05 at 18 07 52

We can see the JMX metrics on the Minion, but they are not exposed on the metrics endpoint:

indigo@debian13:~/opennms-playground/ipc-kafka/container-fs/templates$ curl -s http://minion:9399/metrics | grep trapd

The output looks like this:

# HELP minion_subscriber_trapd_listener_config_requestsent_type_counters_count_total Attribute exposed for management org.opennms.core.ipc.twin.subscriber:name=trapd.listener.config.requestSent,type=counters,attribute=Count
# TYPE minion_subscriber_trapd_listener_config_requestsent_type_counters_count_total counter
minion_subscriber_trapd_listener_config_requestsent_type_counters_count_total 8.0
# HELP minion_subscriber_trapd_listener_config_updatereceived_type_counters_count_total Attribute exposed for management org.opennms.core.ipc.twin.subscriber:name=trapd.listener.config.updateReceived,type=counters,attribute=Count
# TYPE minion_subscriber_trapd_listener_config_updatereceived_type_counters_count_total counter
minion_subscriber_trapd_listener_config_updatereceived_type_counters_count_total 1.0
# HELP minion_trapd_batchsize_type_gauges Attribute exposed for management org.opennms.netmgt.trapd:name=batchSize,type=gauges,attribute=Value
# TYPE minion_trapd_batchsize_type_gauges gauge
minion_trapd_batchsize_type_gauges 1000.0
# HELP minion_trapd_currentqueuesize_type_gauges Attribute exposed for management org.opennms.netmgt.trapd:name=currentQueueSize,type=gauges,attribute=Value
# TYPE minion_trapd_currentqueuesize_type_gauges gauge
minion_trapd_currentqueuesize_type_gauges 0.0
# HELP minion_trapd_maxqueuesize_type_gauges Attribute exposed for management org.opennms.netmgt.trapd:name=maxQueueSize,type=gauges,attribute=Value
# TYPE minion_trapd_maxqueuesize_type_gauges gauge
minion_trapd_maxqueuesize_type_gauges 10000.0
# HELP minion_trapd_rawtrapsreceived_type_counters_count_total Attribute exposed for management org.opennms.netmgt.trapd:name=rawTrapsReceived,type=counters,attribute=Count
# TYPE minion_trapd_rawtrapsreceived_type_counters_count_total counter
minion_trapd_rawtrapsreceived_type_counters_count_total 200.0
# HELP minion_trapd_trapserrored_type_counters_count_total Attribute exposed for management org.opennms.netmgt.trapd:name=trapsErrored,type=counters,attribute=Count
# TYPE minion_trapd_trapserrored_type_counters_count_total counter
minion_trapd_trapserrored_type_counters_count_total 0.0

The metric for 10.42.0.134 in location "ipc-kafka" isn't provided correctly.

christianpape previously approved these changes Feb 6, 2026
Contributor

@christianpape christianpape left a comment

LGTM.

@indigo423
Member

indigo423 commented Feb 6, 2026

@cgorantla your patch solved the problem. Here is what it looks like on the Minion now:

Core:

trapd_device_trapsreceived_total{ip="10.42.0.134",location="ipc-kafka",type="consumer"} 100.0
trapd_device_trapsreceived_total{ip="10.42.2.1",location="Default",type="consumer"} 100.0

Minion:

trapd_device_rawtrapsreceived_total{ip="10.42.0.134",location="ipc-kafka",type="listener"} 100.0

A little detail here for Core:

I would have expected to see 2 metrics on Core for type "consumer", one for location "ipc-kafka" and one for location "Default", plus an additional trapd_device_rawtrapsreceived_total{ip="10.42.2.1",location="Default",type="listener"}, since a listener is also running on the Core?
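
When checking which series are actually present, it can help to pull the labels out of each sample line instead of reading them by eye. The following is a minimal sketch with a hypothetical `SeriesLabels` helper; it handles simple `key="value"` pairs only and does not cope with escaped quotes inside label values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SeriesLabels {
    // Matches one key="value" pair inside the braces of a sample line.
    private static final Pattern LABEL = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Returns the labels of a single sample line, or an empty map if it has none.
    static Map<String, String> parse(String line) {
        Map<String, String> labels = new LinkedHashMap<>();
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            return labels;
        }
        Matcher m = LABEL.matcher(line.substring(open + 1, close));
        while (m.find()) {
            labels.put(m.group(1), m.group(2));
        }
        return labels;
    }

    public static void main(String[] args) {
        String sample = "trapd_device_rawtrapsreceived_total"
                + "{ip=\"10.42.0.134\",location=\"ipc-kafka\",type=\"listener\"} 100.0";
        System.out.println(parse(sample));
    }
}
```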

Attached is how these metrics now look on Core and Minion.

@cgorantla
Contributor Author

trapd_device_rawtrapsreceived_total{ip="10.42.2.1",location="Default",type="listener"} where the listener is running on the Core?

Yeah, this should also be there. Let me check again why it is missing.

@cgorantla
Contributor Author

On Core, I can see this

opennms_trapd_rawtrapsreceived{app="core", instance="core:9299", job="opennms-core"} 107

@indigo423
Member

trapd_device_rawtrapsreceived_total{ip="10.42.2.1",location="Default",type="listener"} where the listener is running on the Core? - Yeah, this should also be there. Let me check again why it is missing

Yes, I have a listener running on Core on port 1162 and on the Minion on port 1163 to test the behaviour for both use cases.

@indigo423
Member

indigo423 commented Feb 6, 2026

On Core, I can see this

opennms_trapd_rawtrapsreceived{app="core", instance="core:9299", job="opennms-core"} 107

I would have expected to see it per "device", the same as on the Minion.

@cgorantla
Contributor Author

trapd_device_rawtrapsreceived_total{ip="10.42.2.1",location="Default",type="listener"} where the listener is running on the Core?

Latest commit should fix this.
