
fix: save marker after Chronicle delivery to prevent data loss on crash#25

Merged
carlosmmatos merged 1 commit into main from atlas/task-25 on Apr 17, 2026

Conversation

@carlosmmatos (Contributor)

State persistence lived in the reader thread, which saved the `_marker` cursor after each fetch cycle regardless of whether the fetched indicators had actually been delivered to Chronicle. If the process crashed between fetching and writing, the marker would advance past unsent indicators, creating a silent data gap on restart.

Moves state persistence to the writer thread so the marker is only saved after a batch is successfully delivered. The queue now carries `(indicators, marker)` tuples so the writer knows which cursor to persist. The writer also stops sending remaining batches when one permanently fails after 30 retries, preventing partial-delivery marker advancement.
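The delivery-then-persist ordering can be sketched roughly as below. All names (`writer_loop`, `send_batch`, `save_marker`) are illustrative assumptions, not the connector's actual API; the point is that the cursor is only written after Chronicle accepts the batch, and that a permanent failure halts further sends.

```python
MAX_RETRIES = 30  # per the PR: a batch permanently fails after 30 retries


def writer_loop(work_queue, send_batch, save_marker):
    """Drain (indicators, marker) tuples from the queue, persisting the
    marker only after the batch is accepted. Hypothetical sketch; the
    real connector's names and retry/backoff logic will differ."""
    while True:
        item = work_queue.get()
        if item is None:  # sentinel from the reader: no more batches
            return
        indicators, marker = item
        for _attempt in range(MAX_RETRIES):
            if send_batch(indicators):
                save_marker(marker)  # advance the cursor only on success
                break
        else:
            # Permanent failure: stop here so the saved marker never
            # advances past undelivered indicators.
            return
```

A crash anywhere in this loop leaves the persisted marker at or before the last delivered batch, so a restart re-fetches (at worst re-sends) rather than skips.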

Also fixes the cold-start FQL filter — the reader was building `_marker:>='1745007000.123'+deleted:false` with a float timestamp on cold start. Since `_marker` is an opaque cursor string, this didn't filter meaningfully and the API returned the entire indicator corpus (~5.5M), making `initial_sync_lookback` ineffective. Now uses `last_updated:>=` for numeric timestamps and `_marker:>=` for real marker strings, with seamless transition during pagination.
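The field selection might look something like the sketch below. The function name and the "is it numeric?" heuristic are assumptions for illustration; the PR only establishes that numeric cold-start timestamps go to `last_updated:>=` and opaque marker strings to `_marker:>=`.

```python
def build_filter(cursor):
    """Choose the FQL field by cursor type: a numeric lookback timestamp
    on cold start, an opaque _marker string on subsequent runs.
    Illustrative sketch only, not the connector's actual code."""
    try:
        # Cold start: cursor is a timestamp derived from initial_sync_lookback.
        float(cursor)
        return f"last_updated:>={cursor}+deleted:false"
    except (TypeError, ValueError):
        # Warm start: cursor is a real _marker string from the API.
        return f"_marker:>='{cursor}'+deleted:false"
```

With the old behavior, the timestamp was string-compared against opaque markers, which matched essentially everything; routing numeric cursors to `last_updated` restores a meaningful lower bound.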

Excludes config/devel.ini from Docker builds via .dockerignore to
prevent local dev overrides from leaking into container images.
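The exclusion presumably amounts to a one-line `.dockerignore` entry along these lines (the actual file contents are not shown in the PR):

```
# .dockerignore — keep local dev overrides out of the image
config/devel.ini
```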
@carlosmmatos added the `fix` (Bug fixes) label on Apr 17, 2026
@carlosmmatos merged commit 323e1cd into main on Apr 17, 2026
7 checks passed
@carlosmmatos deleted the atlas/task-25 branch on April 17, 2026 at 21:13