Conversation
There was a problem hiding this comment.
Pull request overview
This PR applies a set of small correctness and typing fixes across contextweaver’s utilities, routing, configuration, context manager, and store protocols to improve mypy compliance and reduce API duplication.
Changes:
- Fix mypy/typing issues by initializing
_doc_id_to_idxinRouter.__init__and narrowingContextPolicy.ttl_behaviorto aLiteral. - Remove unreachable guard code in
_utils.jaccard()and simplify theArtifactStoreProtocol by dropping the duplicatemetadata()requirement. - Update
ContextManager.add_fact()to use an instance counter instead oflen(fact_store.all())for ID generation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/contextweaver/routing/router.py |
Initializes _doc_id_to_idx in __init__ and removes a duplicate annotation to satisfy strict mypy. |
src/contextweaver/protocols.py |
Removes metadata() from ArtifactStore Protocol to avoid forcing duplicate implementations. |
src/contextweaver/context/manager.py |
Switches fact ID generation to a monotonic instance counter. |
src/contextweaver/config.py |
Narrows ttl_behavior to Literal["drop", "warn"] for stronger static checking. |
src/contextweaver/_utils.py |
Removes unreachable code in jaccard() implementation. |
| self._scoring = scoring_config or ScoringConfig() | ||
| self._estimator: TokenEstimator = estimator or CharDivFourEstimator() | ||
| self._hook: EventHook = hook or NoOpHook() | ||
| self._fact_counter: int = 0 |
There was a problem hiding this comment.
Initializing _fact_counter to 0 can reintroduce fact_id collisions when a ContextManager is created with a pre-populated fact_store (e.g., restored from InMemoryFactStore.from_dict() or injected via StoreBundle). In that case, subsequent add_fact() calls can generate IDs that already exist and silently overwrite facts (since InMemoryFactStore.put() replaces by fact_id). Consider seeding _fact_counter from the existing store (e.g., max numeric suffix seen) or switching to a collision-resistant ID scheme (uuid/monotonic timestamp).
| self._fact_counter += 1 | ||
| fact_id = f"fact:{key}:{self._fact_counter}" |
There was a problem hiding this comment.
fact_id = f"fact:{key}:{self._fact_counter}" relies on an in-memory counter that currently resets on new ContextManager instances. If the manager is constructed with an existing fact_store, this can generate duplicate fact_ids and overwrite existing facts in put(). Either derive the next counter value from the store on init, or generate IDs independently of process lifetime (e.g., uuid) to make restores safe.
dgenio
left a comment
There was a problem hiding this comment.
Overall: LGTM with minor comments
Clean, well-scoped housekeeping PR. All five fixes are correct and improve the codebase:
- Fix 1 (
jaccarddead code): Correct — theif not unionguard is unreachable after theif not a and not bearly return. - Fix 2 (
_doc_id_to_idxinit): Correct — declares the attribute in__init__as required by mypy strict. - Fix 3 (
ttl_behaviorLiteral): Correct — tightens the type to match the two documented valid values. - Fix 4 (fact ID counter): Fixes a real collision bug on delete-then-add. One follow-up concern about rehydration from a pre-populated store (already flagged by Copilot's auto-review on L92 and L295).
- Fix 5 (protocol cleanup): Clean removal of the duplicate
metadata()from the Protocol; the concreteInMemoryArtifactStoreretains it as a compatibility alias.
Non-blocking suggestions:
- Consider seeding
_fact_counterfrom the store's existing max ID when constructed with a pre-populatedInMemoryFactStore(follow-up). - A regression test for the add-delete-add scenario in
test_manager.pywould be a nice addition.
| """ | ||
| fact_id = f"fact:{key}:{len(self._fact_store.all())}" | ||
| self._fact_counter += 1 | ||
| fact_id = f"fact:{key}:{self._fact_counter}" |
There was a problem hiding this comment.
Nit (nice-to-have): This fix changes runtime ID generation behavior. A small regression test in test_manager.py that does add → delete → add and asserts no fact_id collision would document the intent and prevent regressions.
Also worth noting: existing Copilot review comments on L92 and L295 correctly flag the rehydration edge case — the counter should ideally be seeded from the store's existing max suffix when constructed with a pre-populated InMemoryFactStore.
Fix 1: Dead code in jaccard() . utils.py
if not union: return 0.0 guard was unreachable — if both sets are empty the function already returned 0.0 on the line above; if either set is non-empty, union can never be empty.
Fix 2: _doc_id_to_idx not declared in init· router.py
_doc_id_to_idx was set only inside _ensure_index(), causing mypy strict to flag a potential attribute-access-before-assignment. Declared in init and removed the duplicate type annotation from _ensure_index()
Fix 3 — ttl_behavior type narrowing · config.py
Tightened the type from plain str to Literal["drop", "warn"] matching the documented valid values, enabling mypy to catch invalid assignments at type-check time.
Fix 4 — Collision-prone fact ID generation · manager.py
add_fact() used len(self._fact_store.all()) to generate IDs. If a fact was deleted and another added, the length could repeat a prior value and produce a duplicate fact_id. Replaced with a monotonically increasing instance counter.
Fix 5 — Duplicate metadata() in ArtifactStore Protocol · protocols.py
metadata(handle) and ref(handle) had identical signatures and semantics. Keeping both in the Protocol forces all implementations to define a duplicate method. Removed metadata() from the Protocol; InMemoryArtifactStore retains it as a concrete compatibility alias.