If disk space runs out your active session becomes corrupted #7607

@sam-ulrich1

Description

Here is a full write-up from GPT 5 on what happened and how to patch it. I recovered my session.

Post‑mortem: Session would not open after disk ran out of space

Summary

A session could not be opened because two on-disk JSON “part” files were truncated to 0 bytes when the disk filled up mid-session. Loading the session’s messages failed on the server because it attempted to parse those empty files as JSON. We fixed it by backing up the session data and quarantining the corrupted files so the session loader could proceed.

Impact

  • User-facing: Opening the affected session failed (server logged an error) and the UI could not display the session.
  • Data: The content that should have been written into the two 0‑byte part files is unrecoverable (it never reached disk). Everything else in the session remained intact.

Root cause

  • Disk ran out of space during a write to session “part” storage.
  • Result: two files existed but were 0 bytes, producing Unexpected end of JSON input when parsed.

Technical details (for developers)

  • The TUI startup flow on --continue navigates to the most recent session, which triggers a full session sync:
    • Client: sync.session.sync(sessionID) requests:
      • GET /session/:sessionID
      • GET /session/:sessionID/message?limit=100
      • GET /session/:sessionID/todo
      • GET /session/:sessionID/diff
  • The failure occurred while handling:
    • GET /session/:sessionID/message
  • Server code path:
    • GET /session/:sessionID/message → Session.messages() → MessageV2.stream() → MessageV2.parts() → Storage.read(["part", messageID, partID]).json()
  • Failing operation:
    • JSON parse of a 0‑byte file throws: Unexpected end of JSON input
  • Corrupted artifacts observed:
    • Two 0‑byte JSON files under storage/part/<messageID>/...json for a single message.
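The failing operation can be reproduced in isolation. The sketch below is not OpenCode code: it assumes a Node-compatible runtime, uses a temporary file as a stand-in for a real part file, and the helper name tryParse is hypothetical. It only shows that parsing a 0-byte file raises a SyntaxError; the exact error message depends on the runtime.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for a part file. The real layout is only known from
// the report as storage/part/<messageID>/...json.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "part-demo-"));
const emptyPart = path.join(dir, "part.json");
fs.writeFileSync(emptyPart, ""); // simulate a file left at 0 bytes

// Try to read and parse a file, reporting the error class instead of throwing.
function tryParse(file: string): { ok: boolean; error?: string } {
  try {
    JSON.parse(fs.readFileSync(file, "utf8"));
    return { ok: true };
  } catch (err) {
    // An empty string is not valid JSON, so JSON.parse throws a SyntaxError.
    return { ok: false, error: err instanceof SyntaxError ? "SyntaxError" : "other" };
  }
}

const result = tryParse(emptyPart);
console.log(result);
```

Because the parse happens deep inside the message-loading path, one empty file is enough to fail the whole request unless something upstream catches the error.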

Detection / symptoms

  • Server logs showed repeated Unexpected end of JSON input during session load.
  • The access logs still showed requests as “completed”, but the handler errored internally (the global error handler logged “failed”).

Resolution (what we did)

  1. Backed up the session’s load-critical artifacts (session metadata, todo/diff, message JSON directory, and the affected part directory) into a timestamped backup folder.
  2. Scanned session message/part JSON to locate invalid files; found exactly two bad files (0 bytes).
  3. Quarantined the corrupted files by moving them into the backup quarantine folder (no deletion, no edits).
  4. Re-scanned: 0 JSON errors, and the session opened successfully.
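The four steps above can be sketched as a small script. This is an illustration under assumptions, not the procedure that was actually run: the directory names, the msg_1 folder, and the function names (listJsonFiles, findInvalidJson, quarantineInvalidJson) are all hypothetical, and the demo operates on a temporary directory rather than real OpenCode storage. The quarantine folder is assumed to live outside the scanned tree so that a re-scan does not pick it up again.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Recursively collect every *.json file under `root`.
function listJsonFiles(root: string): string[] {
  const out: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) out.push(...listJsonFiles(full));
    else if (entry.isFile() && entry.name.endsWith(".json")) out.push(full);
  }
  return out;
}

// Return the files that cannot be read and parsed as JSON (0-byte files included).
function findInvalidJson(root: string): string[] {
  return listJsonFiles(root).filter((file) => {
    try {
      JSON.parse(fs.readFileSync(file, "utf8"));
      return false;
    } catch {
      return true;
    }
  });
}

// Move invalid files into `quarantineDir`, preserving their relative paths.
// Nothing is deleted or edited, so each move can be reversed by hand.
function quarantineInvalidJson(root: string, quarantineDir: string): string[] {
  const moved: string[] = [];
  for (const file of findInvalidJson(root)) {
    const target = path.join(quarantineDir, path.relative(root, file));
    fs.mkdirSync(path.dirname(target), { recursive: true });
    fs.renameSync(file, target); // assumes both paths are on the same filesystem
    moved.push(target);
  }
  return moved;
}

// Demo on a throwaway tree: one valid part and two 0-byte parts.
const base = fs.mkdtempSync(path.join(os.tmpdir(), "session-demo-"));
const live = path.join(base, "storage");
const partDir = path.join(live, "part", "msg_1");
fs.mkdirSync(partDir, { recursive: true });
fs.writeFileSync(path.join(partDir, "a.json"), JSON.stringify({ text: "ok" }));
fs.writeFileSync(path.join(partDir, "b.json"), "");
fs.writeFileSync(path.join(partDir, "c.json"), "");

// Step 1: back up before touching anything.
fs.cpSync(live, path.join(base, "backup", "copy"), { recursive: true });
// Steps 2 and 3: scan and quarantine. Step 4: re-scan.
const moved = quarantineInvalidJson(live, path.join(base, "backup", "quarantine"));
const remaining = findInvalidJson(live);
console.log(`quarantined ${moved.length} file(s), ${remaining.length} invalid file(s) remain`);
```

Moving rather than deleting keeps the operation reversible, which matches the reasoning in the next section.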

Why this fix is safe

  • No guessing/rewriting of JSON content.
  • Only corrupted files were removed from the live read path.
  • Everything changed is reversible (files were moved to a backup quarantine location).

Follow-ups / recommendations for developers

  • Make storage writes atomic:
    • Write to *.tmp, fsync, then rename to final name (atomic rename).
  • Detect and tolerate corrupt JSON reads:
    • In Storage.read, catch JSON parse errors and return a structured error that can be handled upstream.
    • In message loading, skip/omit invalid parts rather than failing the entire request.
  • Add integrity tooling:
    • A command like opencode repair-session <id> that backs up, scans, and quarantines invalid artifacts.
  • Improve observability:
    • Include response status code in request logs, so “completed” isn’t ambiguous during failures.
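The first two recommendations can be sketched together. This is a minimal illustration assuming a Node-compatible runtime; writeJsonAtomic, readJson, and loadParts are hypothetical names and do not reflect OpenCode's actual Storage API. The write helper uses the temp-file, fsync, rename pattern, and the read helper returns a structured result so a caller can skip bad parts instead of failing the whole request.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write JSON through a temporary file so the final path never holds a
// partially written document.
function writeJsonAtomic(file: string, value: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  try {
    const fd = fs.openSync(tmp, "w");
    try {
      fs.writeFileSync(fd, JSON.stringify(value));
      fs.fsyncSync(fd); // flush the contents to disk before renaming
    } finally {
      fs.closeSync(fd);
    }
    fs.renameSync(tmp, file); // replaces the target atomically on POSIX filesystems
  } catch (err) {
    fs.rmSync(tmp, { force: true }); // e.g. disk full: drop the partial temp file
    throw err;
  }
}

type ReadResult<T> =
  | { ok: true; value: T }
  | { ok: false; file: string; reason: string };

// Read and parse a JSON file, returning a structured error instead of throwing.
function readJson<T>(file: string): ReadResult<T> {
  try {
    return { ok: true, value: JSON.parse(fs.readFileSync(file, "utf8")) as T };
  } catch (err) {
    return { ok: false, file, reason: err instanceof Error ? err.message : String(err) };
  }
}

// Load every part that parses and report the rest, rather than failing outright.
function loadParts<T>(files: string[]): { parts: T[]; skipped: string[] } {
  const parts: T[] = [];
  const skipped: string[] = [];
  for (const file of files) {
    const result = readJson<T>(file);
    if (result.ok) parts.push(result.value);
    else skipped.push(result.file);
  }
  return { parts, skipped };
}

// Demo: one part written atomically, one part simulated as a 0-byte file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-demo-"));
const good = path.join(dir, "good.json");
const bad = path.join(dir, "bad.json");
writeJsonAtomic(good, { text: "hello" });
fs.writeFileSync(bad, "");
const loaded = loadParts<{ text: string }>([good, bad]);
console.log(loaded.parts.length, loaded.skipped.length);
```

With this pattern, a write that fails for lack of space leaves either the previous file or no file at the final path, not an empty one. The sketch does not fsync the parent directory, so whether the rename itself survives a power loss is outside what it covers.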

Additional information to provide to developers

  • Approximate time disk filled up, and whether the process was killed/restarted.
  • Whether the corruption pattern repeats across other sessions (scan results can be provided without including user paths).

Plugins

No response

OpenCode version

No response

Steps to reproduce

No response

Screenshot and/or share link

No response

Operating System

No response

Terminal

No response

Metadata

Labels

bug: Something isn't working
