littlebearapps · Nathan Schram (nathanschram) · Mar 22, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -35,6 +35,7 @@ jobs:
             do_sync: true
             command: uv run --no-sync ty check --warn invalid-argument-type --warn unresolved-attribute --warn invalid-assignment --warn not-subscriptable src tests
             sync_args: --no-install-project
+            allow_failure: true  # ty has pre-existing warnings; informational only
           - task: lockfile
             do_sync: false
             command: uv lock --check
@@ -60,6 +61,7 @@ jobs:
 
       - name: Run check
         run: ${{ matrix.command }}
+        continue-on-error: ${{ matrix.allow_failure || false }}
 
   pytest:
     name: pytest (Python ${{ matrix.python-version }})

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -23,8 +23,11 @@
   - buttons use real `request_id` from `pending_control_requests` for direct callback routing
   - 5-minute safety timeout cleans up stale held requests
 - suppress stall auto-cancel when CPU is active — extended thinking phases produce no JSONL events but the process is alive and busy; `is_cpu_active()` check prevents false-positive kills [#114](https://github.com/littlebearapps/untether/issues/114)
+- fix stall notification suppression when main process sleeping — CPU-active suppression now checks `process_state`; when main process is sleeping (state=S) but children are CPU-active (hung Bash tool), notifications fire instead of being suppressed; stall message now shows tool name ("Bash tool may be stuck") instead of generic "session may be stuck" [#168](https://github.com/littlebearapps/untether/issues/168)
 - suppress redundant cost footer on error runs — diagnostic context line already contains cost data, footer no longer duplicates it [#120](https://github.com/littlebearapps/untether/issues/120)
 - clarify /config default labels and remove redundant "Works with" lines [#119](https://github.com/littlebearapps/untether/issues/119)
+- Codex: always pass `--ask-for-approval` in headless mode — default to `never` (auto-approve all) so Codex never blocks on terminal input; `safe` permission mode still uses `untrusted` [#184](https://github.com/littlebearapps/untether/issues/184)
+- OpenCode: surface unsupported JSONL event types as visible Telegram warnings instead of silently dropping them — prevents silent 5-minute hangs when OpenCode emits new event types (e.g. `question`, `permission`) [#183](https://github.com/littlebearapps/untether/issues/183)
 
 ### changes
 
@@ -53,6 +56,10 @@
   - both engines show "Agent controls" section on `/config` home page with engine-specific labels
 - suppress stall Telegram notifications when CPU-active; heartbeat re-render keeps elapsed time counter ticking during extended thinking phases [#121](https://github.com/littlebearapps/untether/issues/121)
 - temporary debug logging for hold-open callback routing — will be removed after dogfooding confirms [#118](https://github.com/littlebearapps/untether/issues/118) is resolved
+- auto-continue mitigation for Claude Code bug — when Claude Code exits after receiving tool results without processing them (bugs [#34142](https://github.com/anthropics/claude-code/issues/34142), [#30333](https://github.com/anthropics/claude-code/issues/30333)), Untether detects via `last_event_type=user` and auto-resumes the session [#167](https://github.com/littlebearapps/untether/issues/167)
+  - `AutoContinueSettings` with `enabled` (default true) and `max_retries` (default 1) in `[auto_continue]` config section
+  - detection based on protocol invariant: normal sessions always end with `last_event_type=result`
+  - sends "⚠️ Auto-continuing — Claude stopped before processing tool results" notification before resuming
 
 ### tests
 
@@ -70,6 +77,8 @@
 - hold-open outline flow: new tests for hold-open path, real request_id buttons, pending cleanup, approval routing [#114](https://github.com/littlebearapps/untether/issues/114)
 - stall suppression: tests for CPU-active auto-cancel, notification suppression when cpu_active=True, notification fires when cpu_active=False [#114](https://github.com/littlebearapps/untether/issues/114), [#121](https://github.com/littlebearapps/untether/issues/121)
 - cost footer: tests for suppression on error runs, display on success runs [#120](https://github.com/littlebearapps/untether/issues/120)
+- 10 new auto-continue tests: detection function (bug scenario, non-claude engine, cancelled session, normal result, no resume, max retries) + settings validation (defaults, bounds) [#167](https://github.com/littlebearapps/untether/issues/167)
+- 2 new stall sleeping-process tests: notification not suppressed when main process sleeping (state=S), stall message includes tool name [#168](https://github.com/littlebearapps/untether/issues/168)
 
 ### docs
 

diff --git a/docs/how-to/troubleshooting.md b/docs/how-to/troubleshooting.md
@@ -87,6 +87,20 @@ Run `untether doctor` to see which engines are detected.
 3. Check `debug.log` — the engine may have errored silently
 4. Verify the engine works standalone: run `codex "hello"` (or equivalent) directly in a terminal
 
+## Engine hangs in headless mode
+
+**Symptoms:** The engine starts but produces no output, eventually triggering stall warnings. Common with Codex and OpenCode when the engine needs user input (approval or question) but has no terminal to display it.
+
+### Codex: approval hang
+
+Codex may block waiting for terminal approval in headless mode if no `--ask-for-approval` flag is passed. **Fix:** upgrade to Untether v0.35.0+, which always passes `--ask-for-approval never` (or `untrusted` in safe permission mode). Older versions pass the flag only in safe permission mode, so Codex otherwise falls back to its default terminal-based approval flow.
+
+### OpenCode: unsupported event warning
+
+If OpenCode emits a JSONL event type that Untether doesn't recognise (e.g. a `question` or `permission` event from a newer OpenCode version), Untether v0.35.0+ shows a visible warning in Telegram: "opencode emitted unsupported event: {type}". In older versions, these events were silently dropped, leaving the user with no feedback until the stall watchdog fired.
+
+If you see this warning, check for an Untether update that adds support for the new event type. OpenCode's `run` command auto-denies questions via permission rules, so this should be rare — it most likely indicates an OpenCode protocol change.
+
 ## Stall warnings
 
 **Symptoms:** Telegram shows "⏳ No progress for X min — session may be stuck" or "⏳ MCP tool running: server-name (X min)".
@@ -106,6 +120,8 @@ The stall watchdog monitors engine subprocesses for periods of inactivity (no JS
 
 **If the warning says "CPU active, no new events"**, the process is using CPU but hasn't produced any new JSONL events for 3+ stall checks. This can happen when Claude Code is stuck in a long API call, extended thinking, or an internal retry loop. Use `/cancel` if the silence persists.
 
+**If the warning says "X tool may be stuck (N min, process waiting)"**, Claude Code's main process is sleeping while waiting for a child process (e.g. a Bash command running `curl` or a long build). The CPU activity shown in the diagnostics is from the child process, not from Claude thinking. Common cause: a network request to a slow or unresponsive API endpoint. Use `/cancel` and resume, asking Claude to skip the hung command — or wait if the command is legitimately long-running.
+
 **If the warning says "session may be stuck"**, the process may genuinely be stalled. Check:
 
 1. Look at the diagnostics in the message — CPU active, TCP connections, RSS

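The "X tool may be stuck" warning documented above is assembled in `_stall_monitor` (see the `runner_bridge.py` diff that follows). Below is a minimal standalone sketch of that headline logic. The helper names `extract_tool_name` and `stall_headline` are hypothetical, and the `tool:Bash ...` shape of `last_action` is an assumption for illustration; the diff shows only the `tool:` and `note:` prefixes, not a full action string.

```python
from __future__ import annotations


def extract_tool_name(last_action: str | None) -> str | None:
    """Pull the tool name out of an action label such as 'tool:Bash ...'."""
    if not last_action:
        return None
    for prefix in ("tool:", "note:"):
        if last_action.startswith(prefix):
            rest = last_action[len(prefix):]
            # Keep the first token, then drop anything after a further colon.
            return rest.split(" ", 1)[0].split(":", 1)[0]
    return None


def stall_headline(last_action: str | None, mins: int, main_sleeping: bool) -> str:
    """Build the first line of the stall warning (fallback branch only)."""
    name = extract_tool_name(last_action)
    if name and main_sleeping:
        return f"⏳ {name} tool may be stuck ({mins} min, process waiting)"
    return f"⏳ No progress for {mins} min"


# Assumed action label; the real format is not shown in this diff.
sleeping = stall_headline("tool:Bash curl https://example.com", 7, main_sleeping=True)
running = stall_headline("tool:Bash curl https://example.com", 7, main_sleeping=False)
```

The tool-specific headline appears only when a tool name was found and the main process is sleeping; every other case keeps the generic "No progress" text.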
diff --git a/src/untether/runner_bridge.py b/src/untether/runner_bridge.py
@@ -134,6 +134,49 @@ def _load_watchdog_settings():
         return None
 
 
+def _load_auto_continue_settings():
+    """Load auto-continue settings from config, returning defaults if unavailable."""
+    try:
+        from .settings import AutoContinueSettings, load_settings_if_exists
+
+        result = load_settings_if_exists()
+        if result is None:
+            return AutoContinueSettings()
+        settings, _ = result
+        return settings.auto_continue
+    except Exception:  # noqa: BLE001
+        logger.debug("auto_continue_settings.load_failed", exc_info=True)
+        from .settings import AutoContinueSettings
+
+        return AutoContinueSettings()
+
+
+def _should_auto_continue(
+    *,
+    last_event_type: str | None,
+    engine: str,
+    cancelled: bool,
+    resume_value: str | None,
+    auto_continued_count: int,
+    max_retries: int,
+) -> bool:
+    """Detect Claude Code silent session termination bug (#34142, #30333).
+
+    Returns True when the last raw JSONL event was a tool_result ("user"),
+    i.e. the CLI exited before Claude processed the results, and the run is
+    an uncancelled Claude session with a resume token and retries remaining.
+    """
+    if cancelled:
+        return False
+    if engine != "claude":
+        return False
+    if last_event_type != "user":
+        return False
+    if not resume_value:
+        return False
+    return auto_continued_count < max_retries
+
+
 _DEFAULT_PREAMBLE = (
     "[Untether] You are running via Untether, a Telegram bridge for coding agents. "
     "The user is interacting through Telegram on a mobile device.\n\n"
@@ -831,12 +874,16 @@ async def _stall_monitor(self) -> None:
             # (extended thinking, background agents). Instead, trigger a
             # heartbeat re-render so the elapsed time counter keeps ticking.
             #
-            # Exception: if the ring buffer has been frozen for 3+ checks,
+            # Exception 1: if the ring buffer has been frozen for 3+ checks,
             # the process is likely stuck (retry loop, hung API call, dead
             # thinking) — escalate to a notification despite CPU activity.
+            # Exception 2: if the main process is sleeping (state=S), CPU
+            # activity is from child processes (hung Bash tool, stuck curl),
+            # not from Claude doing extended thinking — notify the user.
             _FROZEN_ESCALATION_THRESHOLD = 3
             frozen_escalate = self._frozen_ring_count >= _FROZEN_ESCALATION_THRESHOLD
-            if cpu_active is True and not frozen_escalate:
+            main_sleeping = diag is not None and diag.state == "S"
+            if cpu_active is True and not frozen_escalate and not main_sleeping:
                 logger.info(
                     "progress_edits.stall_suppressed_notification",
                     channel_id=self.channel_id,
@@ -886,10 +933,30 @@ async def _stall_monitor(self) -> None:
                 elif mcp_server is not None:
                     parts = [f"⏳ MCP tool running: {mcp_server} ({mins} min)"]
                 else:
-                    parts = [f"⏳ No progress for {mins} min"]
+                    # Extract tool name from last running action for
+                    # actionable stall messages ("Bash tool may be stuck"
+                    # instead of generic "session may be stuck").
+                    _tool_name = None
+                    if last_action:
+                        for _prefix in ("tool:", "note:"):
+                            if last_action.startswith(_prefix):
+                                _rest = last_action[len(_prefix) :]
+                                _tool_name = _rest.split(" ", 1)[0].split(":", 1)[0]
+                                break
+                    if _tool_name and main_sleeping:
+                        parts = [
+                            f"⏳ {_tool_name} tool may be stuck ({mins} min, process waiting)"
+                        ]
+                    else:
+                        parts = [f"⏳ No progress for {mins} min"]
                 if self._stall_warn_count > 1:
                     parts[0] += f" (warned {self._stall_warn_count}x)"
-                if not mcp_hung and not frozen_escalate and mcp_server is None:
+                if (
+                    not mcp_hung
+                    and not frozen_escalate
+                    and mcp_server is None
+                    and not (_tool_name and main_sleeping)
+                ):
                     parts.append("— session may be stuck.")
                 if last_action:
                     parts.append(f"Last: {last_action}")
@@ -1547,6 +1614,7 @@ async def handle_message(
     on_resume_failed: Callable[[ResumeToken], Awaitable[None]] | None = None,
     progress_ref: MessageRef | None = None,
     clock: Callable[[], float] = time.monotonic,
+    _auto_continued_count: int = 0,
 ) -> None:
     logger.info(
         "handle.incoming",
@@ -1750,6 +1818,68 @@ async def run_edits() -> None:
     run_ok = completed.ok
     run_error = completed.error
 
+    # --- Auto-continue: mitigate Claude Code bug #34142/#30333 ---
+    # When Claude Code's turn state machine incorrectly ends a session
+    # after receiving tool results (last JSONL event is "user" type),
+    # auto-resume so the user doesn't have to manually continue.
+    ac_settings = _load_auto_continue_settings()
+    _ac_resume = completed.resume or outcome.resume
+    _ac_last_event = edits.stream.last_event_type if edits.stream else None
+    if ac_settings.enabled and _should_auto_continue(
+        last_event_type=_ac_last_event,
+        engine=runner.engine,
+        cancelled=outcome.cancelled,
+        resume_value=_ac_resume.value if _ac_resume else None,
+        auto_continued_count=_auto_continued_count,
+        max_retries=ac_settings.max_retries,
+    ):
+        logger.warning(
+            "session.auto_continue",
+            session_id=_ac_resume.value if _ac_resume else None,
+            engine=runner.engine,
+            last_event_type=_ac_last_event,
+            attempt=_auto_continued_count + 1,
+            max_retries=ac_settings.max_retries,
+        )
+        notice = (
+            "\u26a0\ufe0f Auto-continuing \u2014 "
+            "Claude stopped before processing tool results"
+        )
+        if _auto_continued_count > 0:
+            notice += f" (attempt {_auto_continued_count + 1})"
+        notice_msg = RenderedMessage(text=notice, extra={})
+        await cfg.transport.send(
+            channel_id=incoming.channel_id,
+            message=notice_msg,
+            options=SendOptions(
+                reply_to=user_ref,
+                notify=True,
+                thread_id=incoming.thread_id,
+            ),
+        )
+        await handle_message(
+            cfg,
+            runner=runner,
+            incoming=IncomingMessage(
+                channel_id=incoming.channel_id,
+                message_id=incoming.message_id,
+                text="continue",
+                reply_to=incoming.reply_to,
+                thread_id=incoming.thread_id,
+            ),
+            resume_token=_ac_resume,
+            context=context,
+            context_line=context_line,
+            strip_resume_line=strip_resume_line,
+            running_tasks=running_tasks,
+            on_thread_known=on_thread_known,
+            on_resume_failed=on_resume_failed,
+            clock=clock,
+            _auto_continued_count=_auto_continued_count + 1,
+        )
+        return
+    # --- End auto-continue ---
+
     final_answer = completed.answer
 
     # If there's a plan outline stored in a synthetic warning action,

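To show how the auto-continue guard behaves across the cases named in the changelog, here is a standalone copy of the predicate with a few calls. The function body follows the diff above; the public name `should_auto_continue` and the sample argument values (such as the `"sess-1"` resume value) are illustrative assumptions.

```python
from __future__ import annotations


def should_auto_continue(
    *,
    last_event_type: str | None,
    engine: str,
    cancelled: bool,
    resume_value: str | None,
    auto_continued_count: int,
    max_retries: int,
) -> bool:
    """Return True only for the 'exited right after tool results' case."""
    if cancelled:
        return False  # the user stopped the run; never resume behind their back
    if engine != "claude":
        return False  # the mitigation targets Claude Code only
    if last_event_type != "user":
        return False  # normal sessions end with a "result" event
    if not resume_value:
        return False  # nothing to resume from
    return auto_continued_count < max_retries  # respect the retry budget


base = dict(
    last_event_type="user",
    engine="claude",
    cancelled=False,
    resume_value="sess-1",  # hypothetical session id
    auto_continued_count=0,
    max_retries=1,
)

bug_case = should_auto_continue(**base)
normal_end = should_auto_continue(**{**base, "last_event_type": "result"})
budget_spent = should_auto_continue(**{**base, "auto_continued_count": 1})
```

With the default `max_retries=1`, the first detection resumes the session, and a second consecutive detection falls through to normal completion handling.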
diff --git a/src/untether/runners/codex.py b/src/untether/runners/codex.py
@@ -500,6 +500,8 @@ def build_args(
                 )
         if run_options is not None and run_options.permission_mode == "safe":
             args.extend(["--ask-for-approval", "untrusted"])
+        else:
+            args.extend(["--ask-for-approval", "never"])
         args.extend(
             [
                 "exec",

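The effect of the new `else` branch can be sketched in isolation. The helper below is hypothetical (`approval_args` does not exist in the codebase); it only mirrors the two-way choice shown in the diff, where `safe` maps to `untrusted` and everything else, including a missing `run_options`, maps to `never`.

```python
from __future__ import annotations


def approval_args(permission_mode: str | None) -> list[str]:
    """Return the approval flag pair that precedes the `exec` subcommand."""
    policy = "untrusted" if permission_mode == "safe" else "never"
    return ["--ask-for-approval", policy]


def build_args_sketch(prompt: str, permission_mode: str | None = None) -> list[str]:
    """Simplified argument list: approval flag first, then `exec` and the prompt."""
    return [*approval_args(permission_mode), "exec", prompt]


default_args = build_args_sketch("hello")
safe_args = build_args_sketch("hello", permission_mode="safe")
```

Because the flag is now always present, Codex has an explicit approval policy in headless mode and does not fall back to prompting on a terminal that is not there.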
diff --git a/src/untether/runners/opencode.py b/src/untether/runners/opencode.py
@@ -13,6 +13,7 @@
 
 from __future__ import annotations
 
+import json
 import re
 from dataclasses import dataclass, field
 from pathlib import Path
@@ -55,6 +56,23 @@
 )
 
 
+def _extract_event_type(raw: str) -> str | None:
+    """Extract the ``type`` field from raw JSON for diagnostics.
+
+    Used when msgspec raises DecodeError (unrecognised event type) to provide
+    visible feedback instead of silently dropping the event.
+    """
+    try:
+        obj = json.loads(raw)
+        if isinstance(obj, dict):
+            t = obj.get("type")
+            if isinstance(t, str):
+                return t
+    except (json.JSONDecodeError, ValueError):
+        pass
+    return None
+
+
 @dataclass(slots=True)
 class OpenCodeStreamState:
     """State tracked during OpenCode JSONL streaming."""
@@ -494,14 +512,30 @@ def decode_error_events(
         state: OpenCodeStreamState,
     ) -> list[UntetherEvent]:
         if isinstance(error, msgspec.DecodeError):
+            event_type = _extract_event_type(raw)
+            if event_type:
+                self.get_logger().warning(
+                    "opencode.event.unsupported",
+                    event_type=event_type,
+                    tag=self.tag(),
+                )
+                return [
+                    self.note_event(
+                        f"opencode emitted unsupported event: {event_type}",
+                        state=state,
+                    )
+                ]
             self.get_logger().warning(
                 "jsonl.msgspec.invalid",
                 tag=self.tag(),
                 error=str(error),
                 error_type=error.__class__.__name__,
             )
             return []
-        return super().decode_error_events(
+        # Explicit parent ref: zero-arg super() breaks in @dataclass(slots=True)
+        # on Python <3.14 because the __class__ cell references the pre-slot class.
+        return JsonlSubprocessRunner.decode_error_events(
+            self,
             raw=raw,
             line=line,
             error=error,

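A short standalone sketch of the fallback parse used above, with sample inputs. The function body follows `_extract_event_type` from the diff; the sample payloads are made up for illustration and are not real OpenCode events.

```python
from __future__ import annotations

import json


def extract_event_type(raw: str) -> str | None:
    """Return the top-level ``type`` string from a JSON line, else None."""
    try:
        obj = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if isinstance(obj, dict):
        t = obj.get("type")
        if isinstance(t, str):
            return t
    return None


# Hypothetical payloads covering each branch.
known_shape = extract_event_type('{"type": "question", "id": 1}')
not_an_object = extract_event_type("[1, 2, 3]")
not_json = extract_event_type("not json at all")
wrong_type = extract_event_type('{"type": 5}')
```

Only the first input yields a type name, so only that case produces the visible "unsupported event" note; the others fall through to the existing `jsonl.msgspec.invalid` log path.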
diff --git a/src/untether/settings.py b/src/untether/settings.py
@@ -156,6 +156,17 @@ class PreambleSettings(BaseModel):
     text: str | None = None
 
 
+class AutoContinueSettings(BaseModel):
+    """Mitigate Claude Code bug #34142/#30333: session exits after receiving
+    tool results without letting Claude process them.  When detected, Untether
+    auto-resumes the session so the user doesn't have to manually continue."""
+
+    model_config = ConfigDict(extra="forbid", str_strip_whitespace=True)
+
+    enabled: bool = True
+    max_retries: int = Field(default=1, ge=0, le=3)
+
+
 class WatchdogSettings(BaseModel):
     model_config = ConfigDict(extra="forbid", str_strip_whitespace=True)
 
@@ -196,6 +207,7 @@ class UntetherSettings(BaseSettings):
     preamble: PreambleSettings = Field(default_factory=PreambleSettings)
     progress: ProgressSettings = Field(default_factory=ProgressSettings)
     watchdog: WatchdogSettings = Field(default_factory=WatchdogSettings)
+    auto_continue: AutoContinueSettings = Field(default_factory=AutoContinueSettings)
 
     @model_validator(mode="before")
     @classmethod

diff --git a/tests/test_build_args.py b/tests/test_build_args.py
@@ -173,13 +173,26 @@ def test_permission_mode_safe(self) -> None:
         # Must come before "exec" (top-level flag, not exec subcommand flag)
         assert idx < args.index("exec")
 
-    def test_permission_mode_none_no_approval_flag(self) -> None:
+    def test_permission_mode_none_defaults_to_never(self) -> None:
         runner = self._runner()
         state = runner.new_state("hello", None)
         opts = RunOptions(permission_mode=None)
         with patch("untether.runners.codex.get_run_options", return_value=opts):
             args = runner.build_args("hello", None, state=state)
-        assert "--ask-for-approval" not in args
+        assert "--ask-for-approval" in args
+        idx = args.index("--ask-for-approval")
+        assert args[idx + 1] == "never"
+        assert idx < args.index("exec")
+
+    def test_run_options_none_defaults_to_never(self) -> None:
+        """When run_options is None (no /config overrides), default to never."""
+        runner = self._runner()
+        state = runner.new_state("hello", None)
+        args = runner.build_args("hello", None, state=state)
+        assert "--ask-for-approval" in args
+        idx = args.index("--ask-for-approval")
+        assert args[idx + 1] == "never"
+        assert idx < args.index("exec")
 
 
 # ---------------------------------------------------------------------------