Summary
PR #26's internal/logbuf Cisco-style dedup works in principle ("last message repeated N times") but collapses far less traffic than intended, because the dedup comparison key includes Go's log.Printf timestamp prefix and that timestamp advances each second.
Measurement
Live daemon with dev polling every ~1s (generating ssh: client connected: michael.pursifull (SHA256:...) per connect):
Disk log (--log-file, direct writes, no logbuf interposed):
In-memory ring (marvel daemon logs, served through logbuf):
So the ring compresses about 2:1 in this scenario — only because some seconds receive two connects (which produce byte-identical lines). Connects in different seconds don't match because the YYYY/MM/DD HH:MM:SS prefix differs.
Sample from ring showing the pattern:
2026/04/18 22:38:39 ssh: client connected: michael.pursifull (SHA256:...)
last message repeated 2 times ← these 2 were same-second
2026/04/18 22:38:41 ssh: client connected: michael.pursifull (SHA256:...)
last message repeated 2 times
2026/04/18 22:38:43 ssh: client connected: michael.pursifull (SHA256:...)
last message repeated 2 times
A 100% compressed result would have been one ssh: client connected line followed by last message repeated 1882 times so far.
Root cause
logbuf compares full line bytes. The timestamp prefix is injected by log.Printf upstream of logbuf, so every new second produces a distinct input regardless of payload identity.
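A minimal sketch of why the writer sees a different line every second, assuming logbuf sits behind a standard-library logger as an io.Writer. The helper names stampedLine and plainLine are hypothetical and only illustrate how the log package's flags change the bytes that arrive downstream.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// capture logs msg through a standard-library logger configured with the
// given flags and returns exactly the bytes the underlying writer received.
func capture(flags int, msg string) string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", flags)
	logger.Print(msg)
	return buf.String()
}

// stampedLine uses log.LstdFlags (date + time), the package default.
func stampedLine(msg string) string { return capture(log.LstdFlags, msg) }

// plainLine uses no flags, so the writer receives only the payload.
func plainLine(msg string) string { return capture(0, msg) }

func main() {
	msg := "ssh: client connected"
	// The stamped form starts with "YYYY/MM/DD HH:MM:SS ", so two calls made
	// in different seconds yield different bytes for the same payload.
	fmt.Printf("%q\n", stampedLine(msg))
	fmt.Printf("%q\n", plainLine(msg)) // "ssh: client connected\n"
}
```

Any byte-for-byte comparison downstream of the stamped form can only match lines logged within the same second, which is consistent with the 2:1 ratio measured above.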
Fix direction
Dedup on a content fingerprint that strips the leading YYYY/MM/DD HH:MM:SS (or, more defensibly, whatever log.Flags() is configured to emit). Store the canonical (stripped) form as the comparison key while preserving the original line for display, or re-emit the summary with the latest observed timestamp.
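The fingerprint approach could look roughly like the sketch below. It is not the logbuf implementation: dedupKey and collapse are hypothetical names, the regular expression assumes the default log.LstdFlags layout only, and the batch-style collapse stands in for whatever streaming logic logbuf actually uses. The first original line is preserved for display and only the key is stripped.

```go
package main

import (
	"fmt"
	"regexp"
)

// stampRE matches the default log.LstdFlags prefix "YYYY/MM/DD HH:MM:SS ".
// Other flag combinations (microseconds, file names) are NOT handled here.
var stampRE = regexp.MustCompile(`^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} `)

// dedupKey returns the line with a leading timestamp removed. It is used
// only for comparison; the original line is what gets stored and shown.
func dedupKey(line string) string {
	return stampRE.ReplaceAllString(line, "")
}

// collapse emits each run of lines sharing a key as the first original line
// followed by a Cisco-style "last message repeated N times" summary.
func collapse(lines []string) []string {
	var out []string
	lastKey := ""
	repeats := 0
	flush := func() {
		if repeats > 0 {
			out = append(out, fmt.Sprintf("last message repeated %d times", repeats))
		}
	}
	for i, line := range lines {
		key := dedupKey(line)
		if i > 0 && key == lastKey {
			repeats++
			continue
		}
		flush()
		out = append(out, line)
		lastKey, repeats = key, 0
	}
	flush()
	return out
}

func main() {
	lines := []string{
		"2026/04/18 22:38:39 ssh: client connected: michael.pursifull (SHA256:...)",
		"2026/04/18 22:38:41 ssh: client connected: michael.pursifull (SHA256:...)",
		"2026/04/18 22:38:43 ssh: client connected: michael.pursifull (SHA256:...)",
	}
	for _, l := range collapse(lines) {
		fmt.Println(l)
	}
}
```

With the three sample lines, the sketch prints the 22:38:39 line once, followed by "last message repeated 2 times". Deriving the pattern from log.Flags() instead of hard-coding it would be the more defensible variant mentioned above.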
Alternatively, move logbuf so that it receives the un-prefixed message (the payload that follows the log prefix) and attaches the timestamp itself on egress.
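A rough sketch of that alternative, assuming the logger is configured with flags 0 so the writer receives bare payloads. The stampingDedup type and the runFixed helper are hypothetical, and the clock is injected so the example is deterministic. The sketch relies on the documented guarantee that each logging operation makes a single Write call.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"time"
)

const stampLayout = "2006/01/02 15:04:05"

// stampingDedup is a hypothetical writer: it dedups on the bare payload and
// adds the timestamp only when a line (or a summary) is actually emitted.
type stampingDedup struct {
	out     io.Writer
	now     func() time.Time
	last    string
	seen    bool
	repeats int
}

// Write treats each call as one log line, as the log package guarantees.
func (w *stampingDedup) Write(p []byte) (int, error) {
	line := string(bytes.TrimRight(p, "\n"))
	if w.seen && line == w.last {
		w.repeats++
		return len(p), nil
	}
	if err := w.Flush(); err != nil {
		return 0, err
	}
	w.last, w.seen = line, true
	if _, err := fmt.Fprintf(w.out, "%s %s\n", w.now().Format(stampLayout), line); err != nil {
		return 0, err
	}
	return len(p), nil
}

// Flush emits any pending summary, stamped with the current time.
func (w *stampingDedup) Flush() error {
	if w.repeats == 0 {
		return nil
	}
	_, err := fmt.Fprintf(w.out, "%s last message repeated %d times\n",
		w.now().Format(stampLayout), w.repeats)
	w.repeats = 0
	return err
}

// runFixed logs msgs through the writer using a fixed clock and returns the output.
func runFixed(msgs []string) string {
	var buf bytes.Buffer
	fixed := time.Date(2026, time.April, 18, 22, 38, 39, 0, time.UTC)
	w := &stampingDedup{out: &buf, now: func() time.Time { return fixed }}
	logger := log.New(w, "", 0) // flags 0: no prefix reaches the writer
	for _, m := range msgs {
		logger.Print(m)
	}
	if err := w.Flush(); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	fmt.Print(runFixed([]string{"ssh: client connected", "ssh: client connected", "ssh: client connected"}))
}
```

One trade-off of this design is that the displayed timestamp reflects when the line left the buffer rather than when log.Printf was called, which may or may not matter for the ring's consumers.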
Separately worth considering: the disk log (via --log-file) has no dedup at all. Logbuf only sits in front of the ring buffer that marvel daemon logs reads. Whether that's by design ("disk = raw archive") or a gap is a design call.
Related
aae-orc-407l locally.
Environment
--log-max-size 1 --log-max-files 2