feat: Add configurable stale_timeout for APRS-IS connections #223
Merged
Add a new 'stale_timeout' configuration option to the aprs_network config group that allows users to customize how long to wait before considering an APRS-IS connection stale.

Problem: The stale connection threshold was hardcoded to 2 minutes. In environments with frequent network hiccups, or when using certain APRS-IS servers that may drop connections silently, 2 minutes can be too long to wait before reconnecting, resulting in significant data loss.

Solution:
- Add 'stale_timeout' option to aprsd/conf/client.py with default of 120s
- Update APRSISDriver.__init__ to use the config value
- Maintain backward compatibility by defaulting to 120s if not configured
- Update tests to handle the new configuration option

Usage:
    [aprs_network]
    stale_timeout = 60  # Reconnect after 60 seconds without data

The default remains 120 seconds (2 minutes) for backward compatibility.
The APRSISDriver uses the @singleton decorator, which transforms the class into a function. The test incorrectly tried to use __new__, which doesn't work with decorated singletons. Instead, re-initialize the existing instance after changing the config.
The singleton's max_delta was being modified by test_init_custom_stale_timeout and not restored, causing test_is_stale_connection_false to fail because it expected 2 minutes but got 60 seconds.
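The two test fixes above can be illustrated with a minimal sketch. It assumes a common dictionary-based `singleton` decorator; the decorator body, the `Driver` stand-in class, and the `with_custom_timeout` helper are hypothetical and are not taken from the aprsd source. The point is that the decorated name is a function rather than a class, so the test re-initializes the shared instance and then restores `max_delta` so later tests still see the 2-minute default.

```python
from datetime import timedelta


def singleton(cls):
    """Replace a class with a function that returns its single instance."""
    instances = {}

    def get_instance(*args, **kwargs):
        # Construct the instance on first use, then always return the same one.
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


@singleton
class Driver:  # hypothetical stand-in for APRSISDriver
    def __init__(self, stale_timeout=120):
        self.max_delta = timedelta(seconds=stale_timeout)


def with_custom_timeout(seconds):
    """Re-initialize the shared instance, then restore its original max_delta."""
    driver = Driver()
    original = driver.max_delta
    try:
        # Re-run __init__ on the existing instance instead of creating a new one.
        driver.__init__(stale_timeout=seconds)
        return driver.max_delta
    finally:
        # Undo the change so other tests are not affected by shared state.
        driver.max_delta = original
```

Because every caller receives the same object, any test that mutates it has to put the original value back, which is what the `finally` block does here.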
Summary
Add configurable `stale_timeout` for APRS-IS connections to allow faster detection of dead connections.
Problem
The stale connection threshold is hardcoded to 2 minutes in `APRSISDriver.__init__()`. In production environments using the `listen` command with MQTT plugin, I observed connection stalls occurring every 6-7 minutes. The 2-minute detection delay results in significant data loss (~9 reconnects/hour with 2+ minutes of lost data each time).
The issue appears to be silent TCP connection failures to APRS-IS servers where the socket stays "connected" but no data flows. The TCP keepalive settings don't help because the connection isn't truly dead - it's just that data stops flowing (possibly due to APRS2.net load balancer behavior, NAT timeouts, or network issues).
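As a rough illustration of the problem, the sketch below shows a threshold-based staleness check and the arithmetic behind the data-loss estimate. The function names and signatures are hypothetical, not the aprsd implementation, and the estimate assumes the stall frequency itself does not change when the timeout is lowered.

```python
from datetime import datetime, timedelta


def is_stale(last_data: datetime, now: datetime, max_delta: timedelta) -> bool:
    """Return True when no data has arrived for longer than max_delta."""
    return now - last_data > max_delta


def lost_minutes_per_hour(reconnects_per_hour: float, detection_delay_s: float) -> float:
    """Rough lower bound on minutes of data lost per hour to detection delay."""
    return reconnects_per_hour * detection_delay_s / 60
```

With the figures reported above, 9 reconnects per hour at a 120-second threshold gives at least 18 minutes of lost data per hour; a 60-second threshold would roughly halve that to 9 minutes under the same assumption.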
Solution
Add a `stale_timeout` configuration option that defaults to 120 seconds for backward compatibility but allows users to reduce it for faster recovery.
Changes:
- Add `stale_timeout` config option (default: 120 seconds)
- Update `APRSISDriver.__init__` to use the config value with backward compatibility fallback
Usage
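Per the commit message, the option is set as `stale_timeout = 60` in the `[aprs_network]` section. The sketch below shows how such a setting could be read and turned into a `timedelta`. It uses the standard-library `configparser` purely as a stand-in for the project's own configuration loader, and `load_stale_timeout` is a hypothetical helper.

```python
import configparser
from datetime import timedelta

# Sample config text; section and option names come from the commit message.
SAMPLE = """
[aprs_network]
stale_timeout = 60  # Reconnect after 60 seconds without data
"""

DEFAULT_STALE_TIMEOUT = 120  # seconds, the previously hardcoded value


def load_stale_timeout(text: str) -> timedelta:
    """Return the stale threshold as a timedelta, defaulting to 120 seconds."""
    # inline_comment_prefixes lets the trailing "# ..." comment be ignored.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    seconds = parser.getint(
        "aprs_network", "stale_timeout", fallback=DEFAULT_STALE_TIMEOUT
    )
    return timedelta(seconds=seconds)
```

Calling `load_stale_timeout(SAMPLE)` yields a 60-second threshold, while a config without the option yields the 120-second default.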
Backward Compatibility
The default remains 120 seconds (2 minutes), and if the option is not present in the config, the code falls back to the same default. This ensures existing configurations continue to work without changes.
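One way to express the fallback described above is a `getattr` lookup with a default. This is a sketch under assumptions: `resolve_max_delta` is a hypothetical helper, and `SimpleNamespace` stands in for the real config group object, whose actual type is not shown here.

```python
from datetime import timedelta
from types import SimpleNamespace

DEFAULT_STALE_TIMEOUT = 120  # seconds, matches the old hardcoded 2 minutes


def resolve_max_delta(network_conf) -> timedelta:
    """Use stale_timeout when the config defines it, otherwise the default."""
    seconds = getattr(network_conf, "stale_timeout", DEFAULT_STALE_TIMEOUT)
    return timedelta(seconds=seconds)


# An older config object without the option still resolves to 2 minutes.
legacy_conf = SimpleNamespace()
# A config that sets the option gets the shorter threshold.
tuned_conf = SimpleNamespace(stale_timeout=60)
```

Here `resolve_max_delta(legacy_conf)` returns the 2-minute default and `resolve_max_delta(tuned_conf)` returns 60 seconds, so existing deployments behave as before.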