feat: add application-level idle keepalive to detect zombie connections #59

Merged
javi11 merged 1 commit into main from feat/keepalive-idle-probe on Mar 25, 2026

@javi11 javi11 commented Mar 25, 2026

Summary

  • Adds KeepaliveInterval and KeepaliveCommand fields to Provider for optional application-level idle keepalive
  • When enabled, sends a lightweight NNTP command (DATE, HELP, or CAPABILITIES) periodically on idle connections to detect zombie TCP connections before a real request discovers them
  • On probe failure the connection is closed immediately; runConnSlot reconnects automatically on the next request

Motivation

Network paths (NAT tables, firewalls) can silently drop idle TCP connections while the client-side socket still appears open. Without keepalive, a zombie connection is only discovered when a real request times out on it, which adds seconds of latency to that request. This feature detects dead connections proactively during idle periods.

Implementation notes

The probe is injected as a synthetic *Request through the normal write pipeline so readerLoop maintains FIFO ordering with any other in-flight pipelined requests. The inflightSem slot acquired before the blocking select is owned by the keepalive request; readerLoop releases it at the normal point (after delivering the response), exactly as for real requests.

Configuration

Provider{
    KeepaliveInterval: 45 * time.Second,  // 0 = disabled (default)
    KeepaliveCommand:  "DATE",            // optional; defaults to "DATE"
                                          // use "HELP" or "CAPABILITIES"
                                          // for providers that don't support DATE
}

When SkipPing is true and KeepaliveCommand is empty, keepalive is disabled automatically, because no safe probe command is known for that provider.

Test plan

  • TestKeepalive_KeepsConnectionAlive — server responds correctly to DATE; verifies probe fires and connection remains usable for real requests
  • TestKeepalive_DeadConnection — server drops connection on DATE; verifies Run() returns and slot can reconnect
  • make check (lint + race detector) — all pass

🤖 Generated with Claude Code

Add KeepaliveInterval and KeepaliveCommand fields to Provider, enabling
periodic lightweight NNTP probes (DATE/HELP/CAPABILITIES) on idle
connections. This detects zombie TCP connections (NAT table expiry,
silent firewall drops) before a real request discovers them mid-flight.

The probe is injected as a synthetic Request through the normal pipeline
so readerLoop maintains FIFO ordering with any pipelined in-flight
requests. On probe failure the connection is closed immediately and
runConnSlot reconnects on the next request.

- KeepaliveInterval: probe interval (0 = disabled, recommended 30s–60s)
- KeepaliveCommand: defaults to "DATE" (111); use "HELP" (100) or
  "CAPABILITIES" (101) for providers that do not support DATE.
  When SkipPing is true and no explicit command is set, keepalive is
  disabled automatically since no safe default command is known.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@javi11 javi11 merged commit cb2fa19 into main Mar 25, 2026
1 check passed
@javi11 javi11 deleted the feat/keepalive-idle-probe branch March 25, 2026 11:39