Improved db connection #12241

Open
qqmyers wants to merge 3 commits into IQSS:develop from QualitativeDataRepository:Use_Non-pooling_data_source

qqmyers commented Mar 20, 2026

What this PR does / why we need it: At QDR, we've recently seen Dataverse stop responding without any out-of-memory or high-load issues.

The one potentially related thing I've seen in the log is a set of errors involving the Postgres connection:

  • Exception DTX5007:Exception StackTrace javax.transaction.xa.XAException: jakarta.resource.ResourceException: This Managed Connection is not valid as the physical connection is not usable
  • RAR5031:System Exception
  • org.eclipse.persistence.exceptions.DatabaseException\nInternal Exception: org.postgresql.util.PSQLException: Connection has been closed automatically because a new connection was opened for the same PooledConnection or the PooledConnection has been closed.\nError Code: 0\nCall: SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ?) AND (LANG = ?))\n\tbind => [2 parameters bound]\nQuery: ReadAllQuery(name="Setting.findByName" referenceClass=Setting sql="SELECT ID, CONTENT, LANG, NAME FROM SETTING WHERE ((NAME = ?) AND (LANG = ?))")

When I showed an AI assistant how the DataSource was configured, it pointed to an issue: we are specifying a pooling connector while Payara is also setting up its own pool, which duplicates pooling and could cause problems under high load.

This PR switches to the recommended "Simple" connector and adds a new "fish.payara.fail-all-connections=${MPCONFIG=dataverse.db.fail-all-connections:true}" option that seems useful for recovering more quickly when connections are going bad.
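
For readers unfamiliar with the placeholder syntax in the new option, here is a minimal sketch of the lookup-with-default behavior it relies on: the value of `dataverse.db.fail-all-connections` is used if it is configured, otherwise the default after the colon (`true`). This is an illustration only, not Payara's or MicroProfile Config's implementation; the class `MpConfigPlaceholderSketch`, its `resolve` method, and the use of `java.util.Properties` as the config source are assumptions made for the example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: mimics how a ${MPCONFIG=key:default} placeholder falls back to its default. */
public class MpConfigPlaceholderSketch {

    // Matches ${MPCONFIG=<key>} or ${MPCONFIG=<key>:<default>}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{MPCONFIG=([^:}]+)(?::([^}]*))?\\}");

    /** Replaces each placeholder with the configured value, or with its default if the key is unset. */
    public static String resolve(String property, Properties config) {
        Matcher m = PLACEHOLDER.matcher(property);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Simplification: a missing key with no default resolves to an empty string here.
            String fallback = m.group(2) == null ? "" : m.group(2);
            String value = config.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String property =
                "fish.payara.fail-all-connections=${MPCONFIG=dataverse.db.fail-all-connections:true}";

        Properties config = new Properties();
        // Nothing configured: the default after the colon applies.
        System.out.println(resolve(property, config));

        // An explicit setting (e.g. -Ddataverse.db.fail-all-connections=false) overrides the default.
        config.setProperty("dataverse.db.fail-all-connections", "false");
        System.out.println(resolve(property, config));
    }
}
```

In other words, with this PR the option would be on unless an installation explicitly sets `dataverse.db.fail-all-connections` to `false`.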

While the relevant Dataverse code hasn't changed lately, it's possible that the update to Payara 7 and its newer EclipseLink changed something that makes the problem more visible.

Even if this is not related to the problem we're seeing at QDR, it seems like a useful update.

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer:

The AI also recommended several settings that we don't have by default, which roughly match Payara's documentation on advanced db settings. I don't know whether it would be useful to change the defaults in our code, or to add these to the recommended settings for upgrades and to the Docker default config:
-Ddataverse.db.is-connection-validation-required=true
-Ddataverse.db.validate-atmost-once-period-in-seconds=30
-Ddataverse.db.connection-validation-method=custom-validation
-Ddataverse.db.validation-classname=org.glassfish.api.jdbc.validation.PostgresConnectionValidation
-Ddataverse.db.fail-all-connections=true
-Ddataverse.db.connection-leak-timeout-in-seconds=300
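
To make the intent of `validate-atmost-once-period-in-seconds=30` concrete, here is a rough sketch of the idea for a single connection: run the validation check at most once per period and reuse the last result in between, so validation doesn't add a round trip to every use. This is not Payara's implementation; `ValidateAtMostOnceSketch`, the `BooleanSupplier` standing in for the real validation query, and the injected clock are all hypothetical names chosen for illustration.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Illustration only: validates a (hypothetical) connection at most once per period. */
public class ValidateAtMostOnceSketch {

    private final long periodMillis;
    private final BooleanSupplier validationCheck; // stand-in for the real validation query
    private final LongSupplier clockMillis;        // injected so the behavior can be tested
    private boolean validatedOnce = false;
    private long lastValidatedAt;
    private boolean lastResult;

    public ValidateAtMostOnceSketch(long periodSeconds,
                                    BooleanSupplier validationCheck,
                                    LongSupplier clockMillis) {
        this.periodMillis = periodSeconds * 1000L;
        this.validationCheck = validationCheck;
        this.clockMillis = clockMillis;
    }

    /** Runs the check only if none has run within the period; otherwise returns the cached result. */
    public boolean isUsable() {
        long now = clockMillis.getAsLong();
        if (!validatedOnce || now - lastValidatedAt >= periodMillis) {
            lastResult = validationCheck.getAsBoolean();
            lastValidatedAt = now;
            validatedOnce = true;
        }
        return lastResult;
    }

    public static void main(String[] args) {
        ValidateAtMostOnceSketch connection = new ValidateAtMostOnceSketch(
                30,
                () -> { System.out.println("validation check executed"); return true; },
                System::currentTimeMillis);
        connection.isUsable(); // runs the check
        connection.isUsable(); // within 30 seconds: reuses the cached result
    }
}
```

The trade-off is that a connection that goes bad inside the period is only detected at the next check, which is presumably where `fail-all-connections` helps with faster recovery.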

Suggestions on how to test this:
At a minimum, make sure Dataverse works after the change. Ideally, also run a performance test to confirm that performance doesn't drop and that behavior under load is at least as stable as before.

FWIW: All of these changes are on QDR test machines - we'll be monitoring performance and hopefully pushing them to production where we've seen outages.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

qqmyers added the "GDCC: QDR" (of interest to QDR) label Mar 20, 2026
poikilotherm self-assigned this Mar 21, 2026