Skip to content

Stability fixes for discovery and reconnection#83

Open
ttherbrink wants to merge 1 commit intomasterfrom
fix/stability
Open

Stability fixes for discovery and reconnection#83
ttherbrink wants to merge 1 commit intomasterfrom
fix/stability

Conversation

@ttherbrink
Copy link
Copy Markdown
Contributor

This PR includes some stability fixes for issues that might present at random.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets intermittent stability issues in KNX interface discovery and reconnection by making discovery more resilient to non-KNX UDP traffic and improving reconnection/read timeout behavior.

Changes:

  • Enabled periodic KNX interface discovery via a 3-minute setInterval loop.
  • Adjusted discovery parsing/handling to skip non-KNX UDP packets rather than failing the discovery run.
  • Improved reconnection/read behavior in KNXInterface (connection health gating and clearing read timeouts).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
lib/KNXInterfaceManager.js Enables periodic discovery, refines interface discovery/update flow, and changes handling of non-KNX UDP packets.
lib/KNXInterface.js Tweaks connection (re)initialization conditions and fixes read timeout cleanup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +51 to +56
this.findDevicesInterval = setInterval(() => {
this.findKNXInterfaces()
.catch((error) => {
this.log('findKNXinterfaces error', error);
});
}, FIND_DEVICES_INTERVAL);
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The periodic discovery interval is started in the constructor but there’s no corresponding cleanup (e.g., clearInterval) when the app stops or if a new KNXInterfaceManager instance is created. This can leave multiple discovery loops running and keep resources alive longer than intended. Consider adding a destroy()/onUninit-style cleanup that clears this.findDevicesInterval (and have the app call it on shutdown/reload).

Copilot uses AI. Check for mistakes.
Comment on lines 221 to 224
const knxInterface = this.parseKNXResponse(msg);
if (!knxInterface) {
return reject(new Error('No valid KNXnet response'));
return null; // Skip non-KNX messages, keep listening for valid responses
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseKNXResponse() logs for every non-KNX UDP packet, and the new behavior here keeps listening instead of failing fast. On noisy networks this can turn into high-volume log spam during each discovery window. Consider silencing this path (or gating it behind a debug flag / rate limiting) when returning null for non-KNX messages.

Copilot uses AI. Check for mistakes.
Comment on lines 315 to 322
const udpSocket = dgram.createSocket('udp4'); // Create the socket connections

return new Promise((resolve, reject) => {
let interfaceFound = false;

udpSocket
.on('error', (err) => {
// If the udp server errors, close the connection
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside this Promise, the UDP message handler currently does if (!commResult) return new Error('No valid KNX message parsed');. Returning an Error object from an event handler doesn’t propagate or reject the Promise, so it’s misleading and wastes work. Prefer return null/return (or explicitly reject(...) if you want the check to fail fast) so the control flow is correct and clear.

Copilot uses AI. Check for mistakes.
Comment on lines 359 to 364
} catch (e) {
// socket may already be closed
}
this.interfaceFound = true;
interfaceFound = true;
this.searchRunning = false;

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The udpSocket.send(..., (err) => { if (err) throw err; }) path (in this same Promise) can crash the process because the exception is thrown from an async callback. Handle it like other error paths: close the socket, reset this.searchRunning, and reject(err) instead of throwing.

Copilot uses AI. Check for mistakes.
Comment on lines 391 to 397
.bind(36711);

setTimeout(() => {
if (!this.interfaceFound) {
if (!interfaceFound) {
try {
udpSocket.close();
} catch (error) {
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an error occurs while sending the initial connect request (in the listening handler), the Promise is rejected but this.searchRunning may remain true until the timeout runs. Consider resetting this.searchRunning and closing the socket immediately on send failure to avoid temporarily blocking other discovery/check calls.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@wouter-athom wouter-athom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. Just make sure re-enabling the old code won't break anything :)

Comment on lines +51 to +56
this.findDevicesInterval = setInterval(() => {
this.findKNXInterfaces()
.catch((error) => {
this.log('findKNXinterfaces error', error);
});
}, FIND_DEVICES_INTERVAL);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this was here already but commented out makes me wonder why. Was there a good reason for not doing it like this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could not find a good reason. It was disabled over 5 years ago so no way to know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants