Conversation

Contributor

@hauleth hauleth commented Feb 26, 2025

This is a potential source of a problem: if there is an error in the 2nd or 3rd clause of the `with`, then we leak connections, which may eventually result in connection exhaustion. Instead we should fail loudly that there is something wrong with the connections. It should also help with debugging potential problems.

@hauleth hauleth requested a review from a team as a code owner February 26, 2025 08:20
@hauleth hauleth force-pushed the do-not-handle-errors-in-postgres-libcluster-strategy branch from 198a011 to b599f4a Compare February 26, 2025 08:21
{:noreply, put_in(state.meta, meta)}
else
reason ->
Logger.error(state.topology, "Failed to connect to Postgres: #{inspect(reason)}")
Contributor
What happens when this fails loudly? Does it take down the whole app with it?

Contributor Author

I would need to check exactly, but in theory, if the failure happens repeatedly, it can cause the whole application to go down.

Contributor

does it take down the whole app with it?

No, the Supervisor will try to restart the Postgres strategy process, and after exhausting the retries it will just leave it unstarted. But that change doesn't make much sense anyway, because the node will be out of the cluster in case of a Postgres failure. You probably want to forward already assigned values, send a connect message after a timeout, and handle them more gracefully.
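The suggestion above (keep the already-assigned state and send a connect message after a timeout, rather than crashing) could look roughly like the following sketch. The module name, the `connect/1` helper, and the timeout value are all hypothetical, not taken from the PR:

```elixir
defmodule Cluster.Strategy.PostgresSketch do
  # Hypothetical sketch only: on a failed connection, log the error,
  # keep the current state, and schedule a retry instead of crashing
  # the strategy process and relying on the Supervisor's restarts.
  use GenServer
  require Logger

  @reconnect_after :timer.seconds(5)

  def handle_info(:connect, state) do
    case connect(state) do
      {:ok, meta} ->
        {:noreply, put_in(state.meta, meta)}

      {:error, reason} ->
        Logger.error("Failed to connect to Postgres: #{inspect(reason)}")
        # Forward the already-assigned state and retry after a timeout.
        Process.send_after(self(), :connect, @reconnect_after)
        {:noreply, state}
    end
  end

  # Placeholder for the real connection logic (not part of this PR).
  defp connect(_state), do: {:error, :not_implemented}
end
```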

@hauleth hauleth force-pushed the do-not-handle-errors-in-postgres-libcluster-strategy branch from b599f4a to c33107a Compare February 27, 2025 19:37
@hauleth hauleth force-pushed the do-not-handle-errors-in-postgres-libcluster-strategy branch 2 times, most recently from 9c386f8 to 4d8320e Compare March 17, 2025 16:00
Contributor

abc3 commented Mar 25, 2025

leaking connections

In case of a failure in those clauses, the node will not handle notifications because there is no retry logic. So yeah, there's a potential place for a bug, but it doesn't lead to leaking connections.

@hauleth hauleth force-pushed the do-not-handle-errors-in-postgres-libcluster-strategy branch 4 times, most recently from a03fbbb to 1642320 Compare May 16, 2025 19:52
@hauleth hauleth force-pushed the do-not-handle-errors-in-postgres-libcluster-strategy branch from 1642320 to c7b4c06 Compare May 16, 2025 20:26
Contributor

abc3 commented May 26, 2025

@hauleth fyi supabase/libcluster_postgres#24

I've closed that PR because there's already one in progress that adds auto_reconnect. So, to fix the current issue, we just need to add `auto_reconnect` to the notification connection.
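Concretely, that fix relies on Postgrex's `auto_reconnect` option for the notifications connection, which makes Postgrex re-establish the connection after a failure instead of leaving the node deaf to notifications. A minimal sketch, with illustrative connection options and channel name:

```elixir
# Start the notifications connection with auto_reconnect enabled, so a
# dropped connection is re-established automatically by Postgrex.
# Hostname, database, and channel below are illustrative values only.
{:ok, notifications_pid} =
  Postgrex.Notifications.start_link(
    hostname: "localhost",
    database: "postgres",
    auto_reconnect: true
  )

# Subscribe to a channel; with auto_reconnect, Postgrex re-listens
# after reconnecting.
{:ok, _ref} = Postgrex.Notifications.listen(notifications_pid, "cluster")
```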

@hauleth hauleth closed this Jun 27, 2025