
Conversation

@rdica
Contributor

@rdica rdica commented Nov 6, 2025

Short description of changes

Adds DNS SRV support for the -e|--directoryaddress option.

CHANGELOG: SKIP

Context

Currently one needs to provide both an IP/host and a port number to the -e|--directoryaddress option if the directory server is not using the default port 22124. This patch lets a server use preconfigured SRV DNS records to connect to a directory without having to provide a port number.

The patch expands on SRV support in client code already in main.
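
As a rough illustration of the behaviour described above, the following sketch shows one plausible resolution order: an explicit port wins, otherwise an SRV lookup for `_jamulus._udp.<host>` is tried, and only then the default port 22124. The names `Endpoint`, `SrvLookup`, `SrvQueryName` and `ResolveDirectory` are hypothetical and are not taken from the Jamulus sources; the resolution order itself is an assumption, and the DNS query is abstracted behind a callback so the sketch stays self-contained.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

// Hypothetical result type; not taken from the Jamulus sources.
struct Endpoint
{
    std::string   host;
    std::uint16_t port;
};

// Hypothetical SRV lookup callback: returns target and port for a query name,
// or std::nullopt when no SRV record exists.
using SrvLookup = std::function<std::optional<Endpoint> ( const std::string& )>;

constexpr std::uint16_t kDefaultPort = 22124; // default directory port per the description

// Builds the SRV query name used in the dig/nslookup examples.
std::string SrvQueryName ( const std::string& host ) { return "_jamulus._udp." + host; }

// Assumed order: explicit port, then SRV record, then default port.
// Simplification: IPv6 literals (which contain ':') are not handled, and
// std::stoi throws if the port part is not numeric.
Endpoint ResolveDirectory ( const std::string& address, const SrvLookup& lookup )
{
    const auto colon = address.rfind ( ':' );
    if ( colon != std::string::npos )
    {
        // "host:port" given: no SRV lookup is attempted in this sketch.
        return { address.substr ( 0, colon ), static_cast<std::uint16_t> ( std::stoi ( address.substr ( colon + 1 ) ) ) };
    }
    if ( auto srv = lookup ( SrvQueryName ( address ) ) )
    {
        return *srv; // port and target come from the SRV record
    }
    return { address, kDefaultPort };
}
```

With a lookup callback that knows a record for `_jamulus._udp.anygenre2.jamulus.io`, passing `anygenre2.jamulus.io` would yield the port from that record, while `rock.jamulus.io:22424` would bypass the lookup entirely.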

I have created SRV DNS records to test with that point to each of the seven public directory servers provided by the Jamulus team:

  • anygenre1.jamulusjams.com
  • anygenre2.jamulusjams.com
  • anygenre3.jamulusjams.com
  • rock.jamulusjams.com
  • jazz.jamulusjams.com
  • classical.jamulusjams.com
  • choral.jamulusjams.com

You can confirm the SRV records using the following:
Mac/Linux

dig _jamulus._udp.anygenre1.jamulusjams.com srv

;; ANSWER SECTION:
_jamulus._udp.anygenre1.jamulusjams.com. 3600 IN SRV 0 0 22124 anygenre1.jamulus.io.

Windows

nslookup -type=srv _jamulus._udp.anygenre1.jamulusjams.com

Server:  UnKnown
Address:  10.2.0.1

_jamulus._udp.anygenre1.jamulusjams.com SRV service location:
          priority       = 0
          weight         = 0
          port           = 22124
          svr hostname   = anygenre1.jamulus.io

To use this functionality for the public Jamulus directories, the Jamulus team could create the seven SRV records and publish them in the same table that lists the server host/port pairs at https://jamulus.io/wiki/Running-a-Server#registered-mode

Does this change need documentation? What needs to be documented and how?

Unsure. According to --help output, the option -c|--connect doesn't mention anything about SRV support. In that same vein I submit nothing should be added for -e|--directoryaddress either.

Status of this Pull Request

What is missing until this pull request can be merged?

Checklist

  • I've verified that this Pull Request follows the general code principles
  • I tested my code and it does what I want
  • My code follows the style guide
  • I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
  • I've filled all the content above

@ann0see ann0see added this to Tracking Nov 9, 2025
@github-project-automation github-project-automation bot moved this to Triage in Tracking Nov 9, 2025
@ann0see
Member

ann0see commented Nov 9, 2025

CC @gilgongo and @softins for DNS

@softins
Member

softins commented Nov 10, 2025

> CC @gilgongo and @softins for DNS

I have just created SRV records in our zone on Cloudflare for the various directories, as follows:

;; SRV Records
_jamulus._udp.anygenre1.jamulus.io.     60      IN      SRV     0 0 22124 anygenre1.jamulus.io.
_jamulus._udp.anygenre2.jamulus.io.     60      IN      SRV     0 0 22224 anygenre2.jamulus.io.
_jamulus._udp.anygenre3.jamulus.io.     60      IN      SRV     0 0 22624 anygenre3.jamulus.io.
_jamulus._udp.choral.jamulus.io.        60      IN      SRV     0 0 22724 choral.jamulus.io.
_jamulus._udp.classical.jamulus.io.     60      IN      SRV     0 0 22524 classical.jamulus.io.
_jamulus._udp.jazz.jamulus.io.          60      IN      SRV     0 0 22324 jazz.jamulus.io.
_jamulus._udp.private.jamulus.io.       60      IN      SRV     0 0 22124 private.jamulus.io.
_jamulus._udp.rock.jamulus.io.          60      IN      SRV     0 0 22424 rock.jamulus.io.

Once we are happy they are working correctly, we can wind the TTLs up from 1 minute to something longer.
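
For readers less familiar with the record layout above, each line reads as: owner name, TTL in seconds, class, type, priority, weight, port, target. The sketch below parses one such zone-file line into its fields. `SrvRecord` and `ParseSrvLine` are hypothetical helper names used only for illustration; they are not part of Jamulus or of any DNS library.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Fields of one SRV line as listed above (hypothetical helper type).
struct SrvRecord
{
    std::string   name;     // e.g. _jamulus._udp.rock.jamulus.io.
    std::uint32_t ttl;      // seconds
    std::uint16_t priority;
    std::uint16_t weight;
    std::uint16_t port;
    std::string   target;   // host the service runs on
};

// Parses "name TTL IN SRV priority weight port target".
// Returns std::nullopt for comments or lines that do not match this layout.
std::optional<SrvRecord> ParseSrvLine ( const std::string& line )
{
    std::istringstream in ( line );
    SrvRecord          rec{};
    std::string        cls, type;
    unsigned           priority = 0, weight = 0, port = 0;

    if ( !( in >> rec.name >> rec.ttl >> cls >> type >> priority >> weight >> port >> rec.target ) )
    {
        return std::nullopt;
    }
    if ( cls != "IN" || type != "SRV" || priority > 65535 || weight > 65535 || port > 65535 )
    {
        return std::nullopt;
    }
    rec.priority = static_cast<std::uint16_t> ( priority );
    rec.weight   = static_cast<std::uint16_t> ( weight );
    rec.port     = static_cast<std::uint16_t> ( port );
    return rec;
}
```

Applied to the `rock` line above, this would give TTL 60, priority 0, weight 0, port 22424 and target `rock.jamulus.io.`.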

@rdica
Contributor Author

rdica commented Nov 10, 2025

@softins thanks, I have reconfigured one of my servers to use SRV for anygenre2.jamulus.io and am monitoring it.

Nov 10 18:21:03 daw6 jamulus[279573]: resolved anygenre2.jamulus.io to a single SRV record: anygenre2.jamulus.io:22224
Nov 10 18:21:03 daw6 jamulus[279573]: Server Registration Status update: Registration requested
Nov 10 18:21:03 daw6 jamulus[279573]: Server Registration Status update: Registered

softins previously approved these changes Nov 11, 2025
Member

@softins softins left a comment

Looks fine to me. Tested on all standard directories, using both -e and the Custom setting in the server GUI.

@rdica
Contributor Author

rdica commented Nov 14, 2025

While the SRV lookups work and servers can connect to a directory server, all my test servers eventually lose the ability to maintain their registrations, at some indeterminate point within 24 hours, and attempts to re-register stop being logged. Any clients that were connected to the server itself also get disconnected. Attempting to investigate further...

@rdica
Contributor Author

rdica commented Dec 4, 2025

I still haven't been able to determine why servers that register with a directory via an SRV record eventually lose their connection to it. I also found that when the new RPC method is used to enable, disable, or change the directory server to Any Genre 1, the SRV record is used, so the server eventually loses its registration and is no longer connected to the directory. Other genres apparently aren't affected because their address and port pairs are explicitly defined as single objects in global.h, so SRV lookups aren't performed on those hostnames.

@softins
Member

softins commented Dec 8, 2025

This is an interesting one. I would be interested to try to understand and diagnose it, as my available time permits. We need to do so before we can confidently add this feature.

@softins
Member

softins commented Dec 8, 2025

@rdica to make the problem potentially happen more quickly, you could change SERVLIST_REGIST_INTERV_MINUTES from 15 to 1 minute in global.h. If you find clients get disconnected at the same time, it could be that the Jamulus server is getting locked up. It would be worth checking its memory consumption over time, and particularly when it has stopped working.

I am building a server with this change, and some extra debug output, to see.

I am rather suspicious of this code:

jamulus/src/util.cpp

Lines 760 to 774 in cb5a880

QDnsLookup* dns = new QDnsLookup();
dns->setType ( QDnsLookup::SRV );
dns->setName ( QString ( "_jamulus._udp.%1" ).arg ( strAddress ) );
dns->lookup();
// QDnsLookup::lookup() works asynchronously. Therefore, wait for
// it to complete here by resuming the main loop here.
// This is not nice and blocks the UI, but is similar to what
// the regular resolve function does as well.
QTime dieTime = QTime::currentTime().addMSecs ( DNS_SRV_RESOLVE_TIMEOUT_MS );
while ( QTime::currentTime() < dieTime && !dns->isFinished() )
{
    QCoreApplication::processEvents ( QEventLoop::ExcludeUserInputEvents, 100 );
}
QList<QDnsServiceRecord> records = dns->serviceRecords();
dns->deleteLater();

I think deleteLater() is relying on an event loop to perform the deletion. When our ParseNetworkAddressSrv() resolver is called from the connect dialog, that may well happen ok, although since this function will only be called once when connecting, it will be light on resources anyway. However, when called by a headless server, it seems plausible that the deleteLater queue might not get serviced in the same way. And that resolver function gets called repeatedly, once for each registration refresh.

However, at the moment, that is only an educated guess.
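
To make that hypothesis concrete, here is a toy model in plain C++ of a deferred-delete queue that only frees objects when something drains it. It is not Qt's implementation and says nothing about whether this is the actual cause; `DeferredDeleteQueue` and `SimulateRefreshes` are hypothetical names, and the model simply counts how many lookup objects would remain alive if one were allocated per registration refresh and the queue were never serviced.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for an event loop's deferred-delete queue (NOT Qt's implementation).
class DeferredDeleteQueue
{
public:
    // Analogous to deleteLater(): schedule the cleanup, do not run it yet.
    void DeleteLater ( std::function<void()> cleanup ) { pending.push_back ( std::move ( cleanup ) ); }

    // Analogous to the event loop getting a chance to process deferred deletions.
    void Drain()
    {
        for ( auto& cleanup : pending )
        {
            cleanup();
        }
        pending.clear();
    }

    std::size_t Pending() const { return pending.size(); }

private:
    std::vector<std::function<void()>> pending;
};

// Simulates `refreshes` registration refreshes, each creating one lookup object
// and scheduling it for deferred deletion. Returns how many objects are still
// alive afterwards.
std::size_t SimulateRefreshes ( std::size_t refreshes, bool loopServicesQueue )
{
    DeferredDeleteQueue queue;
    std::size_t         alive = 0;

    for ( std::size_t i = 0; i < refreshes; ++i )
    {
        ++alive;                                     // "new QDnsLookup()"
        queue.DeleteLater ( [&alive] { --alive; } ); // "dns->deleteLater()"
        if ( loopServicesQueue )
        {
            queue.Drain();                           // deferred deletions get processed
        }
    }
    return alive;
}
```

At one refresh every 15 minutes there are 96 refreshes in 24 hours, so `SimulateRefreshes ( 96, false )` leaves 96 objects alive in this model, while `SimulateRefreshes ( 96, true )` leaves none. Whether the real server behaves like the first case is exactly what would need to be measured.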

@softins
Member

softins commented Dec 9, 2025

I think it's something different from what I thought above. I ran two servers - one headless and one GUI - on the same machine, both registered to private.jamulus.io with a refresh interval of 1 minute. They both lasted about an hour or so before they stopped registering. I then found that they were both completely unresponsive, probably deadlocked in some way. The GUI server would not respond to any action, and neither server would respond to control-C to terminate, nor a plain kill from the command line. I needed to use kill -9. More investigation needed.

It may be that we just need to try a different implementation of the DNS lookup for SRV. I'm investigating that possibility too.

@softins softins dismissed their stale review December 11, 2025 18:52

Needs more work to understand why it locks up after running for a period.

Labels

None yet

Projects

Status: Triage

3 participants