Weak connection protocol — interruption-safe patterns for SSH, deploys, batch jobs, mobile/train work, and any remote operation where the agent session may drop mid-flight.
A Claude Code / agent skill that codifies interruption-safe execution patterns for working with remote servers — atomic writes, nohup-backed long jobs, idempotency, incremental phases, and breadcrumb recovery.
Install this skill if you encounter any of these:
weak connection · weak signal · flaky wifi · flaky network · unstable connection ·
unstable network · mobile hotspot · traveling · on a train · intermittent ssh ·
broken pipe · connection drops · signal drops · interruption-safe · atomic writes ·
nohup ssh · idempotent ops · resume after disconnect · long-running ssh ·
overnight batch job · recovery checkpoints · multi-agent coordination ·
ci/cd interruption · agent crash recovery · partial deploy recovery
The natural first instinct is to put resilience rules in your memory (per-project Claude context) or in a CLAUDE.md file. Both work for the parent agent — but they don't reach subagents.
When the parent agent delegates work via the Agent tool (Explore, general-purpose, custom
subagents), the subagent receives:
- A fresh context window (no parent's memory)
- The task description (what you wrote in the
promptfield) - The skills available in the same skill registry
The subagent does not see the parent's memory files, CLAUDE.md, or conversation history.
So if the parent agent has a memory rule like "always use atomic writes when SSHing", the
subagent has no idea — it'll happily run ssh host "cat > /etc/conf" and corrupt your config
when its own session blips.
A skill is the only persistence mechanism that:
- Reaches subagents — when the subagent's task description matches the skill's trigger description, the subagent invokes the skill and reads its body. This is the primary win.
- Crosses projects — a skill installed via
npx skills add owner/resilient-server-opsworks in any Claude Code project, not just the one that defined it. - Crosses agents — works in Claude Code, Claude Desktop, Cursor, any harness that supports the agent skills protocol.
- Is shareable — can be linked from
skills.sh, distributed via GitHub, included in agent stack templates.
For these reasons, resilience patterns belong in a skill even if you also have them in memory. Memory keeps your parent agent on rails. The skill keeps every other agent in your ecosystem on rails too.
The patterns generalize across these scenarios:
| Scenario | Why it needs the patterns |
|---|---|
| User on mobile / train with weak signal | Connection drops mid-command, agent loses session |
| Long batch jobs (overnight fetches, multi-hour migrations) | SSH pipe times out before completion |
| Multi-agent coordination on shared infrastructure | Need to know what others are doing, avoid trampling |
| CI/CD pipelines that may be canceled or restarted | Re-runs must skip completed work |
| Agent runtime crashes (out-of-memory, OS update reboot) | Agent restarts; server work continues |
| Power loss / VS Code crashes | Same — agent dies, server lives |
| Production deploys | Half-applied changes are catastrophic |
For yourself, locally (no GitHub publish needed):
Clone or copy this directory under .agents/skills/resilient-server-ops/ in your project. Or
to your global skills dir (~/.claude/skills/ on Mac/Linux, %USERPROFILE%\.claude\skills\
on Windows) for it to apply across all projects.
For others / cross-machine (after publishing to GitHub):
npx skills add <github-owner>/resilient-server-opsThis installs the skill into the user's local skill registry. The next time their agent session starts, the skill is available.
There are two ways the skill activates:
The agent invokes the skill when the user prompt or task context matches the description in SKILL.md's frontmatter. Example triggering phrases (illustrative, not exhaustive — agents should also trigger on other phrasings of the same intent):
- "Deploy this to my server"
- "Run this fetch overnight"
- "Kick off a long job"
- "This might take a while"
- "Weak connection / flaky wifi / unstable network"
- "I'm on the train and my signal keeps dropping"
- Any SSH / git push / remote work where mid-flight interruption is plausible
- etc.
The skill also self-recommends ("trigger proactively when issuing remote commands") so even when the user doesn't explicitly mention resilience, the agent should apply patterns when issuing remote work.
If you want the patterns applied to every server op in a session — not just when the agent decides per-task — activate the weak connection protocol explicitly:
- "Activate weak connection protocol" / "weak connection mode on"
- "I'm on a flaky network, apply resilience to every operation until I deactivate"
(The name "weak connection protocol" was chosen over "signal protocol" because "signal" is ambiguous — Unix signals, messaging apps, audio signals. "Weak connection" unambiguously refers to network reliability.)
The agent treats this as a session-level flag and applies the patterns uniformly. To turn it off:
- "Deactivate weak connection protocol" / "weak connection mode off"
- "Connection is stable, stop the resilience overhead"
A casual remark like "connection is OK now" does not deactivate — agents should treat that as a status report, not a rule lift. If you genuinely want to deactivate, use an explicit phrase.
For long-lived sessions across days, recommend recording the active flag in agent memory (e.g., a Claude Code memory note) so it persists across session restarts.
subagent-driven-development— when delegating server work to a subagent, the subagent invokes resilient-server-ops (via its triggering description) and applies the patterns inside its own context. This is the cleanest "delegation + resilience" combo.systematic-debugging— when debugging a partially-applied operation, the recovery checklist (in SKILL.md) tells you what to check first.executing-plans— phase-based plan execution composes naturally with the "incremental phases with disk checkpoints" pattern.
SKILL.md— the skill itself (read by agents when triggered)README.md— this file (read by humans evaluating/installing)
No bundled scripts (yet) — patterns are mostly mental discipline + a few shell idioms small enough to inline. Could grow if heavy automation patterns emerge (e.g., a wrapper script that auto-applies atomic-write semantics).
If this lives in a public GitHub repo: state your license here (MIT / CC0 / whatever). Contribution welcome — open an issue or PR on the source repo.
If this is internal-only, this section is moot.