fix: prefer canonical summary pages in exact entity search #185

Open
RogerGimbel wants to merge 6 commits into garrytan:master from RogerGimbel:roger/m5-gbrain-hotfixes-2026-04-15

@RogerGimbel

Summary:

  • boost exact query matches for canonical summary/status/readme/index pages when the parent slug matches the query
  • keep compatibility stubs and same-name agent pages in results, but rank maintained canonical pages above them
  • add regression coverage for exact person/company canonical ordering in both unit and end-to-end search tests
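
The ranking rule in the summary can be illustrated with a minimal sketch. It is not the code in this PR: `SearchHit`, `slugify`, `isCanonicalExactMatch`, `rankWithCanonicalBoost`, and the `CANONICAL_BOOST` value are hypothetical names and numbers chosen for illustration. It assumes a page is "canonical" when its last slug segment is one of summary/status/readme/index and its parent segment equals the slugified query.

```typescript
// Hypothetical shapes and names; none of these are taken from the GBrain codebase.
interface SearchHit {
  slug: string;  // e.g. "knowledge/people/roger-gimbel/summary"
  score: number; // base relevance score from the underlying search
}

// Leaf page names treated as canonical, following the PR summary.
const CANONICAL_LEAVES = new Set(["summary", "status", "readme", "index"]);

// Assumed multiplier; the real boost value is not stated in the PR.
const CANONICAL_BOOST = 2.0;

// Lowercase the text and collapse runs of non-alphanumerics into single hyphens.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// True when the hit is a canonical leaf page whose parent segment equals the slugified query.
function isCanonicalExactMatch(query: string, slug: string): boolean {
  const parts = slug.split("/");
  if (parts.length < 2) return false;
  const leaf = parts[parts.length - 1];
  const parent = parts[parts.length - 2];
  return CANONICAL_LEAVES.has(leaf) && parent === slugify(query);
}

// Boost canonical exact matches, then sort by score. Other hits stay in the results.
function rankWithCanonicalBoost(query: string, hits: SearchHit[]): SearchHit[] {
  return hits
    .map((hit) =>
      isCanonicalExactMatch(query, hit.slug)
        ? { ...hit, score: hit.score * CANONICAL_BOOST }
        : hit
    )
    .sort((a, b) => b.score - a.score);
}
```

Under these assumptions, compatibility stubs and same-name agent pages are never filtered out; they only lose rank to the boosted canonical page.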

Why:

  • exact entity lookups like "Roger Gimbel" and "Rodaco" were returning compatibility or agent pages above the maintained canonical summary pages
  • this made GBrain recall noisier for high-value people/company lookups in a digest-heavy corpus

Test plan:

  • bun test test/search.test.ts test/e2e/search-quality.test.ts
  • verified live query behavior after sync:
    • "Roger Gimbel" -> knowledge/people/roger-gimbel/summary first
    • "Rodaco" -> knowledge/companies/rodaco/summary first

* feat: port graph traversal and auto-link extraction

- add graph-query, GraphPath traversal, and backlink count support
- add shared link-extraction and DB-source extract pipeline
- add put_page auto-link reconciliation and selective-port tests
- checkpoint the v0.12 graph-layer selective port plans
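
For readers unfamiliar with the graph layer mentioned in this commit, the two ideas (backlink counts and path traversal) can be sketched roughly as follows. This is an illustration only: `LinkGraph`, `backlinkCounts`, and `findPath` are hypothetical names, the graph is assumed to be an in-memory adjacency map of page slugs, and the traversal is a plain breadth-first search, which may differ from what the ported code does.

```typescript
// Hypothetical in-memory link graph: page slug -> slugs it links to.
type LinkGraph = Map<string, string[]>;

// Count how many distinct pages link to each target page.
function backlinkCounts(graph: LinkGraph): Map<string, number> {
  const counts = new Map<string, number>();
  graph.forEach((targets) => {
    // De-duplicate so a page linking twice to the same target counts once.
    new Set(targets).forEach((target) => {
      counts.set(target, (counts.get(target) ?? 0) + 1);
    });
  });
  return counts;
}

// Breadth-first search for a shortest link path from one page to another.
// Returns the list of slugs on the path, or null when no path exists.
function findPath(graph: LinkGraph, from: string, to: string): string[] | null {
  if (from === to) return [from];
  const previous = new Map<string, string>();
  const visited = new Set<string>([from]);
  const queue: string[] = [from];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of graph.get(current) ?? []) {
      if (visited.has(next)) continue;
      visited.add(next);
      previous.set(next, current);
      if (next === to) {
        // Walk the predecessor chain back to the start to rebuild the path.
        const path: string[] = [to];
        let node = to;
        while (node !== from) {
          node = previous.get(node) as string;
          path.unshift(node);
        }
        return path;
      }
      queue.push(next);
    }
  }
  return null;
}
```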

* [verified] fix: widen exact-query keyword candidate pool

- widen keyword candidate selection for exact entity-like queries
- keep canonical exact-ranking boosts intact once candidates are loaded
- add Rodaco regression coverage for missed canonical company pages
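
One way to picture the widening step is sketched below. The heuristic for "entity-like" (one to four name-shaped tokens), the `widenFactor` default, and the function names are all assumptions made for illustration; the commit does not state the actual thresholds.

```typescript
// Hypothetical heuristic: a short query made only of name-like tokens is treated
// as an exact entity lookup (for example "Rodaco" or "Roger Gimbel").
function looksLikeExactEntityQuery(query: string): boolean {
  const tokens = query.trim().split(/\s+/).filter(Boolean);
  if (tokens.length === 0 || tokens.length > 4) return false;
  return tokens.every((token) => /^[A-Za-z0-9][A-Za-z0-9.'-]*$/.test(token));
}

// Return how many keyword candidates to load before ranking.
// Entity-like queries get a wider pool so the canonical page is not cut off
// before the exact-ranking boosts have a chance to run.
function keywordCandidateLimit(
  query: string,
  baseLimit: number,
  widenFactor = 5 // assumed value, not taken from the PR
): number {
  return looksLikeExactEntityQuery(query) ? baseLimit * widenFactor : baseLimit;
}
```

The point of the design is ordering: a boost can only reorder candidates that were loaded, so a canonical page that misses the candidate cutoff can never be ranked first.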

* [verified] fix: boost explicit query type hints for canonical pages

- boost structured pages when the query title is followed by a matching type hint
- recover canonical agent pages for queries like Hermes Agent
- add regression coverage for explicit type-hint ranking
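
A rough sketch of type-hint parsing follows, assuming the hint is the final token of the query and maps to a page type. The `TYPE_HINTS` table, the `TypedPage` shape, the default boost, and the function names are hypothetical and not taken from the PR.

```typescript
// Hypothetical mapping from a trailing query token to a page type.
const TYPE_HINTS = new Map<string, string>([
  ["agent", "agent"],
  ["company", "company"],
  ["person", "person"],
]);

interface TypeHint {
  title: string;    // query text before the hint, lowercased
  pageType: string; // page type implied by the trailing token
}

// Split "Hermes Agent" into a title ("hermes") and a type hint ("agent").
// Returns null when the query has no recognized trailing hint.
function parseTypeHint(query: string): TypeHint | null {
  const tokens = query.trim().toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length < 2) return null;
  const pageType = TYPE_HINTS.get(tokens[tokens.length - 1]);
  if (pageType === undefined) return null;
  return { title: tokens.slice(0, -1).join(" "), pageType };
}

// Hypothetical page shape for the example.
interface TypedPage {
  title: string;
  type: string;
  score: number;
}

// Boost pages whose title matches the query title and whose type matches the hint.
function applyTypeHintBoost(query: string, pages: TypedPage[], boost = 1.5): TypedPage[] {
  const hint = parseTypeHint(query);
  if (hint === null) return pages;
  return pages.map((page) =>
    page.type === hint.pageType && page.title.toLowerCase() === hint.title
      ? { ...page, score: page.score * boost }
      : page
  );
}
```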

* [verified] fix: boost canonical company pages for ai alias queries

- boost canonical company pages when a query uses a spaced AI suffix alias
- keep the alias match scoped to company pages with a token boundary
- add Rodaco AI regression coverage and prefix-collision guards
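
The token-boundary requirement can be sketched as below. The helper names, the `AliasPage` shape, and the slug comparison are assumptions for illustration. The idea is that "Rodaco AI" should resolve to the company slug `rodaco`, while an unspaced "RodacoAI" or a shorter prefix such as "Rod AI" should not.

```typescript
// Hypothetical page shape for the example.
interface AliasPage {
  type: string; // e.g. "company", "person", "agent"
  slug: string; // entity slug segment, e.g. "rodaco"
}

// Lowercase the text and collapse non-alphanumeric runs into hyphens.
function toSlug(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Return the query with a trailing, whitespace-separated "AI" removed, or null
// when the suffix is absent. The required whitespace acts as the token boundary.
function stripSpacedAiSuffix(query: string): string | null {
  const match = /^(.*\S)\s+ai$/i.exec(query.trim());
  return match ? match[1] : null;
}

// True only for company pages whose slug equals the query minus its " AI" suffix.
// Comparing whole slugs (not prefixes) guards against prefix collisions.
function matchesCompanyAiAlias(query: string, page: AliasPage): boolean {
  if (page.type !== "company") return false;
  const base = stripSpacedAiSuffix(query);
  return base !== null && toSlug(base) === page.slug;
}
```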

* [verified] fix: add explicit ambiguity query preferences