Eval dup embeddings #181

Merged
kenforthewin merged 2 commits into main from eval-dup-embeddings
May 1, 2026
@kenforthewin
Owner

No description provided.

bk-ty and others added 2 commits April 30, 2026 10:44
Two bugs caused atoms to be embedded twice on every scheduler tick:

1. pipeline_task.rs called both process_pending_embeddings_due and
   process_pending_tagging_due in the same tick. Both functions call
   enqueue_pipeline_jobs_from_statuses_sync followed by
   process_queued_pipeline_jobs. The second call's ON CONFLICT reset
   jobs already claimed by the first call back to state='pending',
   causing them to be claimed and processed a second time.

   process_pending_embeddings_due already covers atoms with
   (embedding_status='complete' AND tagging_status='pending'), so
   process_pending_tagging_due was purely redundant.

2. enqueue_pipeline_jobs_from_statuses_sync used ON CONFLICT ... DO
   UPDATE SET state='pending', lease_until=NULL unconditionally. Any
   concurrent or racing enqueue call could therefore reset a job that
   was still being processed under an active lease, making it
   claimable again regardless of what triggered the enqueue.
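The double-claim described above can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: the `Job` struct, the `HashMap` queue, the numeric timestamps, and the function names `enqueue_unconditional`, `claim_claimable`, and `simulate_tick` are all hypothetical stand-ins, not the project's actual schema or code. It models an upsert that always resets the row, followed by a claim pass, run twice in one tick.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum JobState {
    Pending,
    Processing,
}

#[derive(Debug, Clone, Copy)]
struct Job {
    state: JobState,
    lease_until: Option<u64>, // hypothetical: seconds on an arbitrary clock
}

/// Models the pre-fix upsert: whether or not a row already exists,
/// the job ends up pending with no lease.
fn enqueue_unconditional(queue: &mut HashMap<u32, Job>, atom_id: u32) {
    queue.insert(
        atom_id,
        Job { state: JobState::Pending, lease_until: None },
    );
}

/// Claims every job that is pending or whose lease has expired,
/// and returns the ids that were claimed.
fn claim_claimable(queue: &mut HashMap<u32, Job>, now: u64, lease_secs: u64) -> Vec<u32> {
    let mut claimed = Vec::new();
    for (id, job) in queue.iter_mut() {
        let claimable = match (job.state, job.lease_until) {
            (JobState::Pending, _) => true,
            (JobState::Processing, Some(until)) => until <= now,
            (JobState::Processing, None) => true,
        };
        if claimable {
            job.state = JobState::Processing;
            job.lease_until = Some(now + lease_secs);
            claimed.push(*id);
        }
    }
    claimed
}

/// One scheduler tick with two enqueue-then-claim passes, as in the
/// pre-fix tick. Returns how many times atom 42 was claimed.
fn simulate_tick() -> usize {
    let mut queue = HashMap::new();
    let mut claims = 0;
    for _ in 0..2 {
        enqueue_unconditional(&mut queue, 42);
        claims += claim_claimable(&mut queue, 100, 60).len();
    }
    claims
}

fn main() {
    println!("atom 42 claimed {} time(s) in one tick", simulate_tick());
}
```

In this model the second enqueue wipes the lease set by the first claim, so the same atom is claimed twice within a single tick.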

Fixes:
- Remove the redundant process_pending_tagging_due call from the
  scheduler tick
- Preserve state='processing' and lease_until in the ON CONFLICT
  clause when the existing job has an active (non-expired) lease
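The second fix amounts to a conflict-resolution rule that can be sketched in isolation. The code below is an illustrative model, not the actual SQL from the change: the `Job` struct, the numeric timestamps, and the function name `resolve_conflict` are assumptions made for the example. It shows a re-enqueue leaving a job untouched while its lease is still active, and resetting it otherwise.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobState {
    Pending,
    Processing,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Job {
    state: JobState,
    lease_until: Option<u64>, // hypothetical: seconds on an arbitrary clock
}

/// Decides what an existing row should look like after a conflicting
/// enqueue. A job that is processing under a non-expired lease is
/// preserved as-is; anything else is reset to pending with no lease.
fn resolve_conflict(existing: Job, now: u64) -> Job {
    let lease_active = existing.state == JobState::Processing
        && existing.lease_until.map_or(false, |until| until > now);
    if lease_active {
        existing
    } else {
        Job { state: JobState::Pending, lease_until: None }
    }
}

fn main() {
    let now = 100;

    // Active lease: the racing enqueue must not interrupt the worker.
    let leased = Job { state: JobState::Processing, lease_until: Some(160) };
    assert_eq!(resolve_conflict(leased, now), leased);

    // Expired lease: the job is made claimable again.
    let expired = Job { state: JobState::Processing, lease_until: Some(90) };
    assert_eq!(
        resolve_conflict(expired, now),
        Job { state: JobState::Pending, lease_until: None }
    );

    println!("conflict rule behaves as expected");
}
```

Keeping the expired-lease branch means a job abandoned by a crashed worker can still be re-enqueued, while a live worker keeps its claim.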
@kenforthewin merged commit 3ee3af3 into main on May 1, 2026
3 checks passed
2 participants