Open
Labels
bug: Something isn't working
Description
Here is a full write-up from GPT 5 on what happened and how to patch it. Recovered my session.
Post‑mortem: Session would not open after disk ran out of space
Summary
A session could not be opened because two on-disk JSON “part” files were truncated to 0 bytes when the disk filled up mid-session. The server crashed while loading session messages because it attempted to JSON-parse those files. We fixed it by backing up the session data and quarantining the corrupted files so the session loader could proceed.
Impact
- User-facing: Opening the affected session failed (server logged an error) and the UI could not display the session.
- Data: The content that should have been written into the two 0‑byte part files is unrecoverable (it never reached disk). Everything else in the session remained intact.
Root cause
- Disk ran out of space during a write to session “part” storage.
- Result: two files existed but were 0 bytes, producing `Unexpected end of JSON input` when parsed.
Technical details (for developers)
- The TUI startup flow on `--continue` navigates to the most recent session, which triggers a full session sync:
  - Client: `sync.session.sync(sessionID)` requests:
    - `GET /session/:sessionID`
    - `GET /session/:sessionID/message?limit=100`
    - `GET /session/:sessionID/todo`
    - `GET /session/:sessionID/diff`
- The crash occurred while handling: `GET /session/:sessionID/message`
- Server code path: `GET /session/:sessionID/message` → `Session.messages()` → `MessageV2.stream()` → `MessageV2.parts()` → `Storage.read(["part", messageID, partID]).json()`
- Failing operation:
  - JSON parse of a 0‑byte file throws: `Unexpected end of JSON input`
- Corrupted artifacts observed:
  - Two 0‑byte JSON files under `storage/part/<messageID>/...json` for a single message.
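The failing operation can be reproduced in isolation. The following is a minimal sketch, not OpenCode's actual `Storage` code: it creates a 0-byte file in a temporary directory (the file name `part.json` and the helper `parseFailure` are made up for illustration) and shows that parsing its contents throws instead of returning a value. The exact error message may differ between JavaScript runtimes.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Create a 0-byte file, mimicking a part file truncated when the disk filled up.
// The directory and file name are hypothetical.
const demoDir = mkdtempSync(join(tmpdir(), "zero-byte-demo-"));
const partPath = join(demoDir, "part.json");
writeFileSync(partPath, "");

// Returns the error thrown while parsing the file, or null if it parsed cleanly.
function parseFailure(path: string): Error | null {
  try {
    JSON.parse(readFileSync(path, "utf8"));
    return null;
  } catch (err) {
    return err instanceof Error ? err : new Error(String(err));
  }
}

const failure = parseFailure(partPath);
// An empty string is not valid JSON, so this logs a SyntaxError.
console.log(failure?.name, failure?.message);
```

Because the error is thrown from deep inside the message-loading path, a single unreadable part is enough to fail the whole request unless some caller catches it.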
Detection / symptoms
- Server logs showed repeated `Unexpected end of JSON input` during session load.
- The access logs still showed requests “completed”, but the handler errored internally (global error handler logged `failed`).
Resolution (what we did)
- Backed up the session’s load-critical artifacts (session metadata, todo/diff, message JSON directory, and the affected part directory) into a timestamped backup folder.
- Scanned session message/part JSON to locate invalid files; found exactly two bad files (0 bytes).
- Quarantined the corrupted files by moving them into the backup quarantine folder (no deletion, no edits).
- Re-scanned: 0 JSON errors, and the session opened successfully.
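The scan-and-quarantine steps above can be sketched roughly as follows. This is an illustration under assumptions, not the script that was actually run: the function names, the temporary directory layout, and the file names are all hypothetical. It assumes the quarantine folder sits outside the scanned tree and on the same filesystem, since `renameSync` does not move files across filesystems.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, relative } from "node:path";

// Recursively collect every *.json file under `root`.
function listJsonFiles(root: string): string[] {
  const found: string[] = [];
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) found.push(...listJsonFiles(full));
    else if (entry.isFile() && entry.name.endsWith(".json")) found.push(full);
  }
  return found;
}

// True if the file's contents parse as JSON; a 0-byte file does not.
function isValidJson(path: string): boolean {
  try {
    JSON.parse(readFileSync(path, "utf8"));
    return true;
  } catch {
    return false;
  }
}

// Move (never delete or edit) invalid files into `quarantineDir`,
// preserving their path relative to `root` so the move can be reversed.
function quarantineInvalidJson(root: string, quarantineDir: string): string[] {
  const moved: string[] = [];
  for (const file of listJsonFiles(root)) {
    if (isValidJson(file)) continue;
    const target = join(quarantineDir, relative(root, file));
    mkdirSync(dirname(target), { recursive: true });
    renameSync(file, target);
    moved.push(target);
  }
  return moved;
}

// Demo on a throwaway directory: one valid part and one 0-byte part.
const base = mkdtempSync(join(tmpdir(), "quarantine-demo-"));
const liveRoot = join(base, "storage");
const quarantineDir = join(base, "quarantine");
mkdirSync(join(liveRoot, "part", "msg_1"), { recursive: true });
writeFileSync(join(liveRoot, "part", "msg_1", "good.json"), '{"type":"text"}');
writeFileSync(join(liveRoot, "part", "msg_1", "bad.json"), "");

const moved = quarantineInvalidJson(liveRoot, quarantineDir);
// Re-scan the live tree to confirm nothing invalid is left on the read path.
const remaining = listJsonFiles(liveRoot).filter((f) => !isValidJson(f));
console.log({ moved: moved.length, remaining: remaining.length });
```

Moving files back from the quarantine folder to the same relative path restores the original state, which is what makes the procedure reversible.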
Why this fix is safe
- No guessing/rewriting of JSON content.
- Only corrupted files were removed from the live read path.
- Everything changed is reversible (files were moved to a backup quarantine location).
Follow-ups / recommendations for developers
- Make storage writes atomic:
  - Write to `*.tmp`, `fsync`, then rename to final name (atomic rename).
- Detect and tolerate corrupt JSON reads:
  - In `Storage.read`, catch JSON parse errors and return a structured error that can be handled upstream.
  - In message loading, skip/omit invalid parts rather than failing the entire request.
- Add integrity tooling:
  - A command like `opencode repair-session <id>` that backs up, scans, and quarantines invalid artifacts.
- Improve observability:
- Include response status code in request logs, so “completed” isn’t ambiguous during failures.
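The first two recommendations could look something like the sketch below. It is written against Node's `fs` module rather than OpenCode's real `Storage` layer, and every name in it (`writeJsonAtomic`, `readJsonSafe`, `loadParts`, the `ReadResult` type, the file names) is an assumption made for illustration. The write path puts data in a temporary file, flushes it with `fsync`, and only then renames it over the final name; the read path returns a structured result so the caller can skip bad parts.

```typescript
import { closeSync, fsyncSync, mkdtempSync, openSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to "<final>.tmp", fsync, then rename over the final path.
// If the write throws (for example on a full disk), the final path is untouched.
function writeJsonAtomic(finalPath: string, value: unknown): void {
  const tmpPath = `${finalPath}.tmp`;
  const fd = openSync(tmpPath, "w");
  try {
    writeFileSync(fd, JSON.stringify(value));
    fsyncSync(fd); // flush file contents to disk before the rename
  } finally {
    closeSync(fd);
  }
  renameSync(tmpPath, finalPath);
}

// Structured result instead of a thrown parse error.
type ReadResult<T> =
  | { ok: true; value: T }
  | { ok: false; path: string; error: string };

function readJsonSafe<T>(path: string): ReadResult<T> {
  try {
    return { ok: true, value: JSON.parse(readFileSync(path, "utf8")) as T };
  } catch (err) {
    return { ok: false, path, error: err instanceof Error ? err.message : String(err) };
  }
}

// Load every part that parses; report the rest instead of failing the request.
function loadParts<T>(paths: string[]): { parts: T[]; skipped: string[] } {
  const parts: T[] = [];
  const skipped: string[] = [];
  for (const p of paths) {
    const result = readJsonSafe<T>(p);
    if (result.ok) parts.push(result.value);
    else skipped.push(result.path);
  }
  return { parts, skipped };
}

// Demo: one part written atomically, one simulated 0-byte part.
const partsDir = mkdtempSync(join(tmpdir(), "atomic-demo-"));
const goodPart = join(partsDir, "prt_good.json");
const badPart = join(partsDir, "prt_bad.json");
writeJsonAtomic(goodPart, { type: "text", text: "hello" });
writeFileSync(badPart, "");

const loaded = loadParts<{ type: string; text: string }>([goodPart, badPart]);
console.log(loaded.parts.length, loaded.skipped);
```

This sketch leaves a stray `.tmp` file behind if the write fails and does not `fsync` the containing directory, both of which a production implementation would likely want to handle.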
Additional information to give developers
- Approximate time disk filled up, and whether the process was killed/restarted.
- Whether the corruption pattern repeats across other sessions (scan results can be provided without including user paths).
Plugins
No response
OpenCode version
No response
Steps to reproduce
No response
Screenshot and/or share link
No response
Operating System
No response
Terminal
No response