Skip to content

Conversation

@Kriys94
Copy link
Contributor

@Kriys94 Kriys94 commented Nov 10, 2025

Explanation

While this PR is focusing on improve traces, I took the opportunity to clean the code around disconnection. This PR:

Changed

  • Improve WebSocket connection lifecycle tracing in BackendWebSocketService (#7101)
    • WebSocket connection duration is now properly reflected in trace span duration instead of only in custom data
    • Trace all disconnections (both manual and unexpected) to provide complete connection lifecycle visibility in traces
    • Omit connectionDuration_ms from disconnection traces when connection never established (onClose without onOpen)
  • Update BackendWebSocketService default exponential backoff options for reconnection (#7101)
    • Increase default reconnectDelay from 500 milliseconds to 10 seconds
    • Increase default maxReconnectDelay from 30 seconds to 60 seconds
  • Simplify WebSocket disconnection code in BackendWebSocketService (#7101)
    • Centralize all disconnection logic in ws.onclose handler for single source of truth
    • Centralize all state changes within #establishConnection method - state transitions only occur in onopen (CONNECTING → CONNECTED) and onclose (any state → DISCONNECTED)
    • Add MANUAL_DISCONNECT_CODE (4999) and MANUAL_DISCONNECT_REASON constants to distinguish manual from unexpected disconnects

Removed

  • Remove BackendWebSocketService Channel Message trace as it provided no useful performance insights (#7101)

Before:
image

After:
image

References

Checklist

  • I've updated the test suite for new or updated code as appropriate
  • I've updated documentation (JSDoc, Markdown, etc.) for new or updated code as appropriate
  • I've communicated my changes to consumers by updating changelogs for packages I've changed, highlighting breaking changes as necessary
  • I've prepared draft pull requests for clients and consumer packages to resolve any breaking changes

Note

Refactors WebSocket lifecycle to centralize disconnection handling with improved tracing and less aggressive reconnection backoff, updating AccountActivityService, tests, and docs accordingly.

  • Backend (BackendWebSocketService):
    • Tracing & Lifecycle: Move all disconnect logic into ws.onclose; add disconnection trace with optional connectionDuration_ms; remove channel message trace.
    • States & Codes: Simplify states to CONNECTING|CONNECTED|DISCONNECTED; add MANUAL_DISCONNECT_CODE (4999) and FORCE_RECONNECT_CODE (4998); disconnect errors now include close code/reason.
    • Reconnection Backoff: Increase defaults reconnectDelay10000ms, maxReconnectDelay60000ms; ensure idempotent scheduling; clear timers on destroy/disable.
    • Connect/Disconnect: Establish state in #establishConnection; resolve connect on manual close during CONNECTING; timeout drives onclose path; remove manual error state.
    • API/Behavior: disconnect() uses manual close; forceReconnection() uses force code; cleanup pending requests/subscriptions on close; adjust notification tracing.
  • AccountActivityService:
    • Treat only DISCONNECTED as down signal; publish tracked chains as down on disconnect; add concise status-change logging.
  • Tests:
    • Update for new states, backoff defaults, codes, and tracing; add cases for manual/forced disconnect, timeout/close idempotency, feature-disable clearing, and trace duration assertions.
  • Docs/Changelog:
    • Document lifecycle tracing improvements, new backoff defaults, manual/forced codes, and removal of channel message trace; README flow wording tweaks.

Written by Cursor Bugbot for commit b2d1def. This will update automatically on new commits. Configure here.

@Kriys94 Kriys94 force-pushed the fix/WebsocketTraces branch 6 times, most recently from 9f2a6b4 to 0894c21 Compare November 12, 2025 10:27
@Kriys94 Kriys94 marked this pull request as ready for review November 12, 2025 10:28
@Kriys94 Kriys94 requested review from a team as code owners November 12, 2025 10:28
@Kriys94 Kriys94 force-pushed the fix/WebsocketTraces branch 3 times, most recently from 249675d to 9f663a7 Compare November 12, 2025 17:45
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Disconnect Undoes Scheduled Reconnection Timer

forceReconnection() fails to reconnect because the scheduled reconnection timer is cancelled. It calls disconnect() which triggers ws.close() with MANUAL_DISCONNECT_CODE, then immediately calls #scheduleReconnect() to schedule a reconnection timer. However, when ws.onclose fires asynchronously, it detects the manual disconnect code and calls #clearTimers(), which cancels the reconnection timer that was just scheduled. This leaves the WebSocket disconnected with no reconnection attempt, breaking the intended disconnect-then-reconnect behavior.

packages/core-backend/src/BackendWebSocketService.ts#L519-L534

if (this.#reconnectTimer) {
log('Reconnect already scheduled, skipping force reconnection');
return;
}
log('Forcing WebSocket reconnection to clean up subscription state');
// Perform controlled disconnect
this.disconnect();
// Schedule reconnection with exponential backoff
this.#scheduleReconnect();
}
/**
* Sends a message through the WebSocket (fire-and-forget, no response expected)

Fix in Cursor Fix in Web


Bug: Forced Reconnection Silently Cancelled

forceReconnection() schedules a reconnect timer that gets immediately cleared when ws.onclose fires. The method calls disconnect() then #scheduleReconnect() synchronously, but disconnect() triggers an asynchronous onclose event. When onclose fires with the manual disconnect code, it calls #clearTimers() which clears the reconnect timer that was just scheduled, and sets #reconnectAttempts to 0 without rescheduling. The forced reconnection never occurs, breaking the documented behavior that states it "schedules reconnection with exponential backoff".

packages/core-backend/src/BackendWebSocketService.ts#L519-L534

if (this.#reconnectTimer) {
log('Reconnect already scheduled, skipping force reconnection');
return;
}
log('Forcing WebSocket reconnection to clean up subscription state');
// Perform controlled disconnect
this.disconnect();
// Schedule reconnection with exponential backoff
this.#scheduleReconnect();
}
/**
* Sends a message through the WebSocket (fire-and-forget, no response expected)

Fix in Cursor Fix in Web


@Kriys94 Kriys94 force-pushed the fix/WebsocketTraces branch from 9f663a7 to 163dffa Compare November 12, 2025 18:00
@Kriys94 Kriys94 force-pushed the fix/WebsocketTraces branch from 163dffa to 03b0cc2 Compare November 12, 2025 18:12
@Kriys94 Kriys94 force-pushed the fix/WebsocketTraces branch from 03b0cc2 to b2d1def Compare November 13, 2025 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants