
fix(analytics): use std::thread for record_analytics_event to avoid postgres Drop abort #1477

Open
bingran-you wants to merge 1 commit into dev from fix/analytics-run-blocking


@bingran-you
Contributor

Summary

  • /analytics/track still wrapped AccountStore::record_analytics_event in tokio::task::spawn_blocking, the exact pattern that caused the production signup crash loop fixed in #1454 (root cause: #1451). r2d2 pool recycling drops a sync postgres::Client; its Drop calls Runtime::block_on, which panics "Cannot start a runtime from within a runtime" when the thread still carries a tokio runtime context. Because it fires in a destructor, the panic aborts the worker instead of unwinding.
  • Adds account_store::run_blocking, a std::thread::spawn + tokio::sync::oneshot helper with the same ergonomics as spawn_blocking but without the poisoned runtime context, and switches the /analytics/track handler over to it.
  • Other still-vulnerable callers (service/auth.rs — 40+ sites; service/billing.rs — 5 sites) can migrate incrementally on top of this helper.
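
A minimal sketch of the thread-hop idea behind `run_blocking`, not the helper added in this PR. To stay dependency-free it swaps `tokio::sync::oneshot` for `std::sync::mpsc`, so the caller blocks on `recv()` instead of awaiting; the signature and the `String` error type are assumptions for illustration only.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for `account_store::run_blocking`.
/// Runs `f` on a plain OS thread and hands the result back over a channel.
fn run_blocking<F, T>(f: F) -> Result<T, String>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    // Capacity 1: the worker sends exactly one value and never blocks on it.
    let (tx, rx) = mpsc::sync_channel(1);

    thread::Builder::new()
        .name("run-blocking".into())
        .spawn(move || {
            // `f` and everything it captured (for example a pooled client)
            // are consumed and dropped here, on the plain thread.
            let value = f();
            // The receiver may already be gone; that is not an error here.
            let _ = tx.send(value);
        })
        .map_err(|e| format!("failed to spawn thread: {e}"))?;

    // If the closure panics, `tx` is dropped without sending and `recv` fails.
    rx.recv()
        .map_err(|_| "blocking task panicked before sending a result".to_string())
}
```

Under these assumptions, `run_blocking(|| 2 + 2)` yields `Ok(4)`, and a closure that panics surfaces as an `Err` on the caller's side rather than unwinding into it. An async handler would want the oneshot-based variant described above so that waiting for the result does not block a runtime worker.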

Closes #1476. Related: #1451, #1454.

Why this matters

Production (dowhizprod1) and staging (dowhizstaging) had accumulated 110 and 139 PM2 restarts, respectively, before PR #1454 stabilized the signup path on 2026-04-20 21:06 UTC. Until this change, any request to /analytics/track under load could still trigger the same abort on a recycled postgres connection. This PR closes that path.

Test plan

  • cargo check -p scheduler_module --lib — clean, no new warnings.
  • After merge, watch pm2 logs dw_worker on staging for absence of "Cannot start a runtime from within a runtime" panics during analytics traffic.
  • Migrate remaining task::spawn_blocking(move || store.X()) callsites in auth.rs / billing.rs to run_blocking (follow-up PR).

…stgres Drop abort

The `/analytics/track` handler was still wrapping
`AccountStore::record_analytics_event` in `tokio::task::spawn_blocking`. The
same pattern caused the production signup crash loop fixed in PR #1454 (see
issue #1451): r2d2 pool recycling drops a sync `postgres::Client`, whose Drop
impl runs `Runtime::block_on`; if the thread still carries a tokio runtime
context, that panics with "Cannot start a runtime from within a runtime" and
— because it happens in a destructor — aborts the whole worker instead of
unwinding.
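
As a rough analogy for "the thread still carries a tokio runtime context", the sketch below uses an ordinary `thread_local!` flag as a hypothetical stand-in for a runtime's per-thread marker. `IN_RUNTIME`, `try_block_on`, and `compare_threads` are invented names, not tokio internals, and the check returns an `Err` rather than panicking so the behavior is easy to observe.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread "already inside a runtime" marker.
    static IN_RUNTIME: Cell<bool> = Cell::new(false);
}

/// Mimics a nested `block_on` guard: refuses to run on a marked thread.
fn try_block_on() -> Result<(), &'static str> {
    if IN_RUNTIME.with(|flag| flag.get()) {
        Err("Cannot start a runtime from within a runtime")
    } else {
        Ok(())
    }
}

/// Marks the current thread, then runs the guard on this thread and on a
/// freshly spawned OS thread. Returns (same-thread result, fresh-thread result).
fn compare_threads() -> (Result<(), &'static str>, Result<(), &'static str>) {
    IN_RUNTIME.with(|flag| flag.set(true));

    let same_thread = try_block_on();
    // Thread-locals start from their initial value on a new thread, so the
    // marker set above is not visible there.
    let fresh_thread = thread::spawn(try_block_on)
        .join()
        .expect("fresh thread panicked");

    IN_RUNTIME.with(|flag| flag.set(false));
    (same_thread, fresh_thread)
}
```

In this model the guard fails on the marked thread and succeeds on the fresh one, which is the property the `std::thread::spawn` approach relies on.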

Adds `account_store::run_blocking`, a `std::thread::spawn` + `oneshot` helper
meant to be used wherever a closure may (directly or indirectly) drop a sync
postgres client, and switches `/analytics/track` over to it. Follow-up callers
in auth.rs / billing.rs can migrate incrementally.

Refs #1476, #1451, #1454.
@vercel

vercel Bot commented Apr 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| dowhiz | Ready | Preview, Comment | Apr 20, 2026 9:30pm |


Labels

breeze:done Breeze finished handling this item


Development

Successfully merging this pull request may close these issues.

[P1] analytics.rs /analytics/track still uses tokio::spawn_blocking for AccountStore — latent postgres Drop crash

2 participants