Fix biometrics on Pod 5: cbor2 C extension skips records + empty placeholder records break read loop #46

Open
seanpasino wants to merge 1 commit into throwaway31265:main from seanpasino:fix/pod5-cbor2-buffering-biometrics

@seanpasino
Problem

Biometrics produce no vitals data on Pod 5 due to two separate bugs in load_raw_files.py and stream.py.


Bug 1: cbor2.load(f) skips nearly every record

When the _cbor2 C extension is installed (the default when cbor2 is pip-installed with a compiler available), cbor2.load(f) reads files in internal 4096-byte chunks. Each call advances f.tell() by 4096 bytes regardless of how large the actual CBOR record was.

RAW file records are typically 17–5000 bytes each, so record boundaries rarely coincide with 4096-byte chunk boundaries. As a result, only one record per 4096-byte block is decoded and all others are silently skipped. On Pod 5, piezo-dual records are ~2700 bytes, so the stream sees almost no data. On Pod 3/4, where records are larger (~5000 bytes across 4 sensors), the effect is smaller but accuracy is still degraded.

Before (both files):

row = cbor2.load(f)          # f.tell() jumps by 4096 regardless of record size
decoded_data = cbor2.loads(row['data'])

After:

data_bytes = _read_raw_record(f)   # f.tell() advances by exact record size
if data_bytes is None:             # empty placeholder record (see Bug 2)
    continue
decoded_data = cbor2.loads(data_bytes)

_read_raw_record(f) manually parses the outer {seq: uint, data: bytes} CBOR wrapper byte-by-byte using f.read(), keeping the file position accurate.
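A minimal sketch of what such a parser might look like is given below. It is not the PR's actual `_read_raw_record()`: it assumes the outer wrapper is a definite-length, two-entry CBOR map with text-string keys `seq` and `data`, and the names `read_raw_record_sketch`, `_read_head`, and `_read_exact` are hypothetical. It uses `int.from_bytes` where the PR imports `struct`.

```python
def _read_exact(f, n):
    """Read exactly n bytes or raise EOFError (hypothetical helper)."""
    buf = f.read(n)
    if len(buf) != n:
        raise EOFError("truncated record")
    return buf


def _read_head(f):
    """Read one CBOR item head and return (major_type, argument)."""
    first = _read_exact(f, 1)[0]
    major, info = first >> 5, first & 0x1F
    if info < 24:                      # argument stored in the initial byte
        return major, info
    if info > 27:                      # reserved or indefinite length
        raise ValueError(f"unsupported additional info {info}")
    size = 1 << (info - 24)            # 1, 2, 4, or 8 following bytes
    return major, int.from_bytes(_read_exact(f, size), "big")


def read_raw_record_sketch(f):
    """Parse one outer {seq: uint, data: bytes} record from f.

    Returns the inner data bytes, or None for an empty placeholder.
    Only the bytes of this record are consumed, so f.tell() stays exact.
    """
    major, count = _read_head(f)
    if major != 5 or count != 2:       # assumption: map with two entries
        raise ValueError("expected a 2-entry CBOR map")
    data = None
    for _ in range(count):
        key_major, key_len = _read_head(f)
        if key_major != 3:             # assumption: text-string keys
            raise ValueError("expected a text-string key")
        key = _read_exact(f, key_len).decode("utf-8")
        val_major, val_arg = _read_head(f)
        if key == "seq" and val_major == 0:
            continue                   # sequence number is not needed here
        if key == "data" and val_major == 2:
            data = _read_exact(f, val_arg)
            continue
        raise ValueError(f"unexpected field {key!r}")
    if data is None:
        raise ValueError("record has no data field")
    return data or None                # b'' placeholder becomes None
```

Because every byte is consumed through `f.read()`, a truncated record surfaces as `EOFError` with the file position somewhere inside that record, so a streaming caller would still need to seek back to the start of the record before retrying.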


Bug 2: Empty placeholder records raise EOFError mid-file, stopping the loop early

Pod 5 firmware writes placeholder records with data = b'' (CBOR byte 0x40 = empty byte string) as sequence number markers between real data records. Calling cbor2.loads(b'') raises CBORDecodeEOF, which is a subclass of EOFError.

Both files catch EOFError to detect end-of-file and break the read loop. So any placeholder record mid-file terminates the entire read — even with megabytes of valid data remaining.

Fix: _read_raw_record() returns None for empty data; callers continue.
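The caller side of this fix can be sketched as follows. This is an illustration rather than the code in the PR: `iter_payloads` is a hypothetical name, and `read_record` stands for any reader that returns the payload bytes, returns `None` for a placeholder, and raises `EOFError` only at a real end of file.

```python
def iter_payloads(f, read_record):
    """Yield non-empty payloads from f until a real end of file.

    read_record(f) is assumed to return bytes, return None for an
    empty placeholder record, and raise EOFError at end of file.
    """
    while True:
        try:
            payload = read_record(f)
        except EOFError:
            return                     # genuine end of file
        if payload is None:
            continue                   # placeholder: skip, keep reading
        yield payload
```

Only the outer read sits inside the `try`, so an EOF-type error raised later while decoding a payload cannot be mistaken for the end of the file.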


Additional fix in stream.py

The original except Exception handler called break, leaving f.tell() at an unknown position. The handler now calls seek(last_pos) before breaking, so the next poll cycle starts from the last known good position.
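The position bookkeeping described here might look roughly like the sketch below. It is not the actual `follow_latest_file()` loop: `drain_available`, `read_record`, and `handle` are hypothetical names, and treating a truncated record at the tail of the file the same way as any other error is an assumption of this sketch.

```python
def drain_available(f, read_record, handle):
    """Process every complete record currently in f.

    Returns the last known good position. On any failure, f is left
    at that position so the next poll cycle can retry from there.
    """
    last_pos = f.tell()
    while True:
        try:
            payload = read_record(f)
        except Exception:
            # Assumption: end of file, a partially written record, and
            # a parse error are all handled by rewinding to last_pos.
            f.seek(last_pos)
            break
        last_pos = f.tell()            # record fully consumed
        if payload is not None:        # skip empty placeholders
            handle(payload)
    return last_pos
```

Updating `last_pos` only after a record has been read in full is what makes the rewind safe: the file position never rests in the middle of a record between polls.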


Files changed

  • biometrics/load_raw_files.py — added import struct, added _read_raw_record(), replaced cbor2.load() call
  • biometrics/stream/stream.py — added import struct, added _read_raw_record(), rewrote follow_latest_file() read loop

Tested on

  • Device: Eight Sleep Pod 5
  • cbor2: 5.6.5 with _cbor2 C extension active
  • Python: 3.10

Result: 846 vitals rows (heart rate, HRV, breathing rate) and 400 movement rows recorded in a single night. Stream runs stably with no OOM kills or error floods.


Backward compatibility

The fix is fully compatible with Pod 3/4. The outer {seq, data} CBOR format is identical across all pod versions. Pod 3/4 records with left2/right2 fields are unaffected — inner record decoding is unchanged. Pod 3/4 may also see improved accuracy as more records are now correctly decoded.

…empty placeholder records break loop

Two bugs combine to produce zero biometric data on Pod 5 (and degrade
accuracy on Pod 3/4):

Bug 1: cbor2 C extension reads in 4096-byte chunks

When the _cbor2 C extension is installed, cbor2.load(f) advances f.tell()
by 4096 bytes per call regardless of actual record size. RAW file records
are 17-5000 bytes each, so only one record per 4096-byte block is ever
decoded — the rest are silently skipped. On Pod 5 with ~2700-byte
piezo-dual records, this skips the majority of data.

Fix: replace cbor2.load(f) with _read_raw_record(f), a manual parser that
uses f.read() byte-by-byte to parse the outer {seq: uint, data: bytes}
CBOR wrapper, keeping f.tell() accurate after every record.

Bug 2: Empty placeholder records raise EOFError mid-file

Pod 5 firmware writes placeholder records with data=b'' (CBOR 0x40 =
empty byte string) as sequence number markers between real records.
cbor2.loads(b'') raises CBORDecodeEOF, which is a subclass of EOFError.
Since both files catch EOFError to detect end-of-file and break the loop,
any placeholder record mid-file terminates the entire read early — even
with megabytes of valid data remaining.

Fix: _read_raw_record() returns None for empty data; callers continue.

Additionally, the original stream.py error handler called break on any
exception, leaving the file position at an unknown offset. Changed to
seek back to last_pos before breaking so recovery is possible.

Tested on Pod 5, cbor2 5.6.5 with _cbor2 C extension, Python 3.10.
Result: 846 vitals rows (HR, HRV, breathing rate) and 400 movement rows
recorded in a single night. Stream runs stably with no OOM kills.
@vercel
vercel bot commented Mar 14, 2026

Someone is attempting to deploy a commit to the david's projects Team on Vercel.

A member of the Team first needs to authorize it.

ng added a commit to sleepypod/core that referenced this pull request Mar 14, 2026
The cbor2 C extension (_cbor2) reads files in internal 4096-byte chunks,
advancing f.tell() by 4096 regardless of actual record size. Since RAW
records are 17-5000 bytes, cbor2.load(f) silently skips most records.
On Pod 5, piezo-dual records are ~2700 bytes so nearly every other record
was lost, severely degrading vitals accuracy.

Additionally, Pod 5 firmware writes empty placeholder records (data=b'')
as sequence markers. cbor2.loads(b'') raises CBORDecodeEOF (a subclass of
EOFError), which the read loop caught as end-of-file, terminating reads
mid-file with valid data remaining.

Fix: Replace cbor2.load(f) with _read_raw_record(f) that manually parses
the outer {seq, data} CBOR wrapper byte-by-byte, keeping f.tell() accurate.
Empty placeholders return None and are skipped. On read errors, seek back
to last known good position instead of breaking the loop.

Applied to both piezo-processor and sleep-detector modules.

Ref: throwaway31265/free-sleep#46

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ng added a commit to sleepypod/core that referenced this pull request Mar 14, 2026
…covery

- Extract _read_raw_record into modules/common/cbor_raw.py to eliminate
  duplication between piezo-processor and sleep-detector
- Add file handle cleanup on shutdown to piezo-processor (was already
  present in sleep-detector)
- Add consecutive-failure counter to both modules: after 5 failures at
  the same file position, skip forward 1 byte to resync past corrupt
  data instead of retrying forever
- Narrow exception handler from bare `except Exception` to
  `except (ValueError, cbor2.CBORDecodeError, OSError)` so only
  parsing/IO errors are retried

Original CBOR fix ported from throwaway31265/free-sleep#46
by @seanpasino — thank you!

Co-Authored-By: seanpasino <seanpasino@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
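
The consecutive-failure counter described in this commit could be sketched as below. This is an illustration only, not the sleepypod/core implementation: `ResyncGuard` and `RETRYABLE_ERRORS` are hypothetical names, and the tuple omits `cbor2.CBORDecodeError` so that the sketch runs without cbor2 installed.

```python
# Assumption: the commit also lists cbor2.CBORDecodeError here.
RETRYABLE_ERRORS = (ValueError, OSError)


class ResyncGuard:
    """Track repeated failures at one file position (hypothetical)."""

    def __init__(self, max_failures=5):
        self.max_failures = max_failures
        self._pos = None
        self._count = 0

    def success(self):
        """Reset the counter after a record decodes cleanly."""
        self._pos, self._count = None, 0

    def failure(self, f, pos):
        """Record a failure at pos and seek f to the resume position.

        Returns pos for a plain retry, or pos + 1 once max_failures
        consecutive failures have occurred at the same position.
        """
        if pos == self._pos:
            self._count += 1
        else:
            self._pos, self._count = pos, 1
        if self._count >= self.max_failures:
            self.success()             # start counting afresh
            pos += 1                   # step past the corrupt byte
        f.seek(pos)
        return pos
```

A read loop would call `guard.failure(f, last_pos)` inside `except RETRYABLE_ERRORS:` and `guard.success()` after each good record, so errors outside that tuple still propagate.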
ng added a commit to sleepypod/core that referenced this pull request Mar 14, 2026
## Summary

- **cbor2.load(f) skips records**: The C extension reads in 4096-byte
internal chunks, advancing `f.tell()` past valid records. Pod 5 piezo
records are ~2700 bytes, so nearly every other record was silently
dropped — severely degrading HR/HRV/breathing accuracy.
- **Empty placeholder records break the read loop**: Pod 5 firmware
writes `data=b''` markers. `cbor2.loads(b'')` raises `CBORDecodeEOF`
(subclass of `EOFError`), which the loop caught as end-of-file, stopping
reads mid-file with megabytes of valid data remaining.
- **Error recovery**: On parse errors, seek back to last known good
position instead of breaking the loop entirely.

Fix applied to both `piezo-processor` and `sleep-detector` modules.
Ported from
[free-sleep#46](throwaway31265/free-sleep#46).

## Test plan

- [ ] Deploy to Pod 5 and verify vitals rows are produced (~846+ per
night vs near-zero before)
- [ ] Verify movement rows are produced by sleep-detector
- [ ] Confirm backward compatibility on Pod 3/4 (same outer CBOR format)
- [ ] Check systemd logs for clean startup with no error floods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit

* **Bug Fixes**
  * More robust recovery from corrupt or partial RAW records to prevent
    stalls and resume processing.

* **Improvements**
  * Better handling of placeholder/empty records so ingestion continues
    without gaps.
  * Improved file-position tracking and shutdown cleanup to reduce data
    loss and improve stability.

* **New Features**
  * Automatic skipping of repeatedly failing bytes after repeated decode
    errors to keep pipelines moving.

---------

Co-authored-by: seanpasino <seanpasino@users.noreply.github.com>