fix: use word-boundary regex for geo-tagging keyword matching by princelevant · Pull Request #330 · koala73/worldmonitor

princelevant · 2026-02-24T16:45:40Z

Summary

Replaced String.includes() with word-boundary regex (\b...\b) across the entire geo-tagging pipeline to prevent substring false positives
Replaced the ambiguous "hts" keyword (matched "rights", "fights", etc.) with "tahrir al-sham" / "hayat tahrir"
Added 20 regression tests covering false positive prevention, true positive preservation, and edge cases

Problem

When zooming into Syria on the map, unrelated articles (e.g. French politics mentioning "ambassador") appeared at Syria's coordinates. The keyword "assad" matched as a substring inside "ambassador", and "hts" matched inside "rights", "fights", "flights", etc.

Root cause: keywords >= 5 characters used titleLower.includes(keyword) instead of word-boundary regex.
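
The failure mode is easy to reproduce in isolation. The sketch below contrasts substring matching with a word-boundary regex. `escapeRegex` and `matchesWord` are illustrative helpers written for this example and are not the functions in the PR.

```typescript
// Hypothetical helper: escape regex metacharacters in a keyword.
function escapeRegex(keyword: string): string {
  return keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: whole-word match using \b on both sides.
function matchesWord(titleLower: string, keyword: string): boolean {
  return new RegExp(`\\b${escapeRegex(keyword)}\\b`).test(titleLower);
}

const frenchTitle = 'french ambassador recalled over human rights dispute';

// Substring matching: both keywords fire on an unrelated headline.
const substringAssad = frenchTitle.includes('assad'); // true, inside "ambassador"
const substringHts = frenchTitle.includes('hts');     // true, inside "rights"

// Word-boundary matching: neither keyword fires.
const boundaryAssad = matchesWord(frenchTitle, 'assad'); // false
const boundaryHts = matchesWord(frenchTitle, 'hts');     // false

// A genuine Syria headline still matches.
const genuineMatch = matchesWord('assad regime collapses', 'assad'); // true
```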

Files changed

| File | Change |
| --- | --- |
| `src/services/geo-hub-index.ts` | Word-boundary regex for all keyword lengths |
| `src/components/DeckGLMap.ts` | Hotspot keyword matching uses `\b` regex |
| `src/components/Map.ts` | Same fix for mobile map |
| `src/App.ts` | Flash location matching uses `\b` regex |
| `src/services/entity-index.ts` | Entity keyword matching uses `\b` regex |
| `src/services/country-instability.ts` | Country keyword matching uses `\b` regex |
| `src/services/story-data.ts` | Country keyword matching uses `\b` regex |
| `src/services/related-assets.ts` | Asset keyword matching uses `\b` regex |
| `src/utils/analysis-constants.ts` | `includesKeyword()` utility uses `\b` regex |
| `src/config/geo.ts` | Replaced `"hts"` with `"tahrir al-sham"` / `"hayat tahrir"` |
| `tests/geo-keyword-matching.test.mjs` | 20 new test cases |

Test plan

vite build passes clean
All 111 existing tests pass (0 regressions)
20 new tests verify: "ambassador" no longer matches Syria, "rights" no longer matches Damascus, genuine Syria/HTS articles still match correctly

Fixes #324

-KT

🤖 Generated with Claude Code

…glish + Linux AppImage support (koala73#100) ## Summary - Full i18n system with 14 locales: en, fr, de, es, it, pl, pt, nl, sv, ru, ar (RTL), zh, ja — all at 1132-key parity - Eliminated ~110 hardcoded English strings across 50+ source files, replaced with `t()` calls - RTL support for Arabic with proper regional code normalization (ar-SA → ar) - Dead English fallback literals (`t() || 'English'`) removed from all components - Community discussion floating widget (localized) - Linux AppImage desktop build support - Proper noun heuristic fallback for trending keywords when ML unavailable ## Key changes - **New**: `src/services/i18n.ts` — i18next setup with language detection, RTL, locale switching - **New**: 13 locale JSON files (1132 keys each) in `src/locales/` - **New**: `src/styles/rtl-overrides.css` + `src/styles/lang-switcher.css` - **Modified**: 50+ components/services to use `t()` instead of hardcoded strings - **Modified**: `.github/workflows/build-desktop.yml` — Linux CI matrix - **Modified**: `scripts/desktop-package.mjs` + `download-node.sh` — Linux target support ## Test plan - [ ] Verify language switcher shows all 14 languages - [ ] Switch to Arabic — confirm `dir="rtl"` on `<html>`, layout mirrors - [ ] Switch to Japanese — confirm all panel labels, tooltips, popups render in Japanese - [ ] Switch to French — confirm no English leaks in panels, modals, map legend - [ ] Verify `{{count}}` interpolation works in timeAgo strings - [ ] Verify `tsc --noEmit` passes (confirmed locally) - [ ] Test community widget dismiss/localStorage persistence

PR koala73#97 only hid the badge itself but the SignalModal kept auto-opening on new signals. Gate all 5 automatic signalModal.show() calls behind findingsBadge.isEnabled() so disabling Intelligence Findings also suppresses the full-screen popup overlay. Closes koala73#89 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add lasillavacia.com RSS feed to improve Latin American political coverage. Independent Colombian investigative outlet covering governance, armed conflict, and regional power dynamics. Ref koala73#96 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

## Summary - Adds [La Silla Vacía](https://www.lasillavacia.com) RSS feed (`/rss`) to the `latam` feed category - Adds source tier entry (Tier 3 — specialty/investigative) - Colombian independent outlet covering political power structures, governance, and armed conflict Ref koala73#96 ## Test plan - [ ] Verify feed loads in LATAM news panel (content is in Spanish) - [ ] Confirm no duplicate or broken entries in feed list 🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary - PR koala73#97 hid the badge but the `SignalModal` kept auto-opening on new signals — this is what the reporter was still seeing - Gates all 5 automatic `this.signalModal?.show()` calls behind `this.findingsBadge?.isEnabled()` so disabling Intelligence Findings also suppresses the full-screen popup overlay and sounds - Signal history is still recorded (`addToSignalHistory`) even when popup is suppressed, so re-enabling the toggle shows them Closes koala73#89 ## Test plan - [x] Disable Intelligence Findings via PANELS toggle or right-click - [x] Wait for signal refresh cycle — no full-screen popup should appear - [x] Re-enable → popups resume on next signal detection - [x] Build succeeds with no type errors 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…timeout, WebGL context loss, RSS 403s - storage.ts: add withTransaction() retry wrapper for IndexedDB InvalidStateError on iOS/Safari tab backgrounding - usa-spending.ts: add 20s AbortController timeout to prevent Safari "Load failed" on stalled POST - App.ts: add catch to runGuarded() to prevent unhandled rejections from task runner - main.ts: add Sentry ignoreErrors for WebGL context loss and ResizeObserver loop - DeckGLMap.ts: add webglcontextlost/restored handlers for graceful GPU recovery - feeds.ts: route rsshub.app feeds (NHK, MIIT, MOFCOM) through Railway proxy, switch Nikkei Asia and ECFR to Google News proxy - finance.ts: switch Nikkei Asia to Google News proxy, remove unused railwayRss helper

… extensions) - Add NotAllowedError, InvalidAccessError, importScripts to Sentry ignoreErrors - Add global unhandledrejection handler for YouTube IFrame API autoplay blocks - Add onError handler to deck.gl MapboxOverlay for internal render-cycle races

- Add beforeSend filter to drop minified 1-3 char library errors (e.g., "vd") - Filter transient network errors (Load failed, Failed to fetch, cancelled) - Filter browser extension errors (runtime.sendMessage, Java object is gone) - Filter non-Error promise rejections and SVG image load failures - Filter MapLibre imageManager null ref during WebGL context restore - Reset YouTube API promise on load failure to allow retry on next init - Move USASpending timeout cleanup to finally block - Log snapshot cleanup errors instead of silently swallowing

… noise filters Prevent getProjection null crash when WebGL context is lost by tracking webglLost flag and skipping all setProps/layer rebuild calls until restored. Add ignoreErrors for IndexedDB iOS kills, Twitter WebView injection, and CSP unsafe-eval from extensions.

…List guards - toggleFullscreen: use void .catch() for Promise-based requestFullscreen/ exitFullscreen + webkit prefix fallback for iOS Safari (WORLDMONITOR-11/13) - Narrow /^TypeError: Failed to fetch/ to exact match (was suppressing real API failures). Move module-import-failed to beforeSend with extension/ webview context check instead of blanket ignore (WORLDMONITOR-15) - Guard classList?.contains and target.closest?. on event targets that may not be Elements (WORLDMONITOR-Z/10) - Add noise filters: Fullscreen request denied, requestFullscreen, vc_text_indicators_context (WORLDMONITOR-12)

…er, IndexedDB write-drop - webkitRequestFullscreen returns void (not Promise) on Safari — use try/catch instead of .catch() to avoid undefined.catch() throw - Module-import beforeSend filter: only suppress when stack frames originate from browser extensions, not by URL domain check - withTransaction: throw on readwrite InvalidStateError after retry instead of silently returning undefined (prevents write-drop)

…ections - Wrap updateBaseline() in try/catch inside loadNewsCategory and intel path so IndexedDB write failures don't delete successfully fetched and rendered news data (P1) - Add .catch() to saveCurrentSnapshot() initial call and setInterval callback to prevent unhandled promise rejections from IndexedDB readwrite failures (P2)

…tion probes (koala73#296) Sidecar validation probes were missing User-Agent headers, causing Cloudflare-fronted APIs (e.g. Wingbits) to return 403 which was incorrectly treated as an auth rejection. Added CHROME_UA to all 13 probes and isCloudflare403() helper to soft-pass CDN blocks.

) Tauri WKWebView/WebView2 traps target="_blank" navigation, so news links and other external URLs silently fail to open. Added a global capture-phase click interceptor that routes cross-origin links through the existing open_url Tauri command, falling back to window.open.

…ries (koala73#299) Models like DeepSeek-R1 and QwQ output chain-of-thought as plain text even with think:false. This caused summaries like "We need to summarize the top story..." instead of actual news content. - Remove message.reasoning fallback that used thinking tokens as summary - Extend tag stripping to <|thinking|>, <reasoning>, <reflection> formats - Add hasReasoningPreamble() to reject task narration and prompt echoes - Gate reasoning detection to brief/analysis modes (translate unaffected) - Bump CACHE_VERSION v3→v4 to invalidate polluted cached summaries - Add 28 unit tests covering all edge cases

…oala73#285) * fix: sync YouTube live panel mute state with native player controls * fix: harden YouTube embed mute sync (postMessage origin, interval cleanup, DRY destroy) --------- Co-authored-by: Elie Habib <elie.habib@gmail.com>

* test: add Playwright e2e tests for flushStaleRefreshes 4 tests covering: stale services flushed on tab focus (hidden > interval), no-op when hiddenSince is 0, skips non-stale services (hidden < interval), and 150ms stagger between re-triggered services. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: convert flushStaleRefreshes to fast unit test, fix timeout leaks and timing flakiness - Move from Playwright e2e to Node.js unit test (tests/ dir) - Add source contract tests to detect if App.ts method signature drifts - Clean up all timeouts in afterEach to prevent leaks - Assert ordering + minimum gaps instead of absolute time windows (CI-safe) - Add assertions for refreshTimeoutIds state after flush - Add test for non-stale service timeout preservation * test: make flush stale refresh tests deterministic --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Elie Habib <elie.habib@gmail.com>

…koala73#302) * fix: harden desktop embed messaging and secret validation * fix: harden embed postMessage origin check and add custom channel validation Security: - Block wildcard parentOrigin from query params (server-side sanitizer) - Validate e.origin on incoming postMessage commands in embed - Remove misleading asset: protocol from allowed list - Require 2+ markers for Cloudflare challenge detection (drop overly broad 'cloudflare' marker) - Add ordering comment on isAuthFailure vs isCloudflareChallenge403 - Strengthen embed test assertions with regex + wildcard rejection test Channel validation: - Validate YouTube handle format (@<3-30 chars>) before adding - Verify channel exists on YouTube via /api/youtube/live before adding - Show "Verifying…" loading state, red border on invalid, offline tolerance - Return channelExists flag from /api/youtube/live endpoint

* Simplify RSS freshness update to static import * Refine vendor chunking for map stack in Vite build * Patch transitive XML parser vulnerability via npm override * Shim Node child_process for browser bundle warnings * Filter known onnxruntime eval warning in Vite build * test: add loaders XML/WMS parser regression coverage * chore: align fast-xml-parser override with merged dependency set --------- Co-authored-by: Elie Habib <elie.habib@gmail.com>

…73#306) - Add levels, trends, fallback keys to top-level countryBrief in en/el/th/vi locales (fixes raw key display in intelligence brief and header badge) - Add Export PDF option to country brief dropdown using scoped print dialog - Add exportPdf i18n key to all 17 locale files

…73#308) - Add levels, trends, fallback keys to top-level countryBrief in en/el/th/vi locales (fixes raw key display in intelligence brief and header badge) - Add Export PDF option to country brief dropdown using scoped print dialog - Add exportPdf i18n key to all 17 locale files

…lity (koala73#313) WKWebView (Tauri macOS) doesn't support HTML5 Drag and Drop API. Replace draggable/dragstart/dragover with mousedown/mousemove/mouseup across panel grid reorder, live channel tabs, and channel settings. Uses elementFromPoint with same-row detection for accurate horizontal and vertical drag positioning.

…oala73#316) Adds ignoreErrors patterns for Worker constructor, Facebook in-app browser, UC Browser, duplicate custom elements, WebGPU device limits, and stale container. Extends beforeSend to suppress TypeErrors from deck-stack chunk (same pattern as maplibre map chunk).

) * feat: add AI analysis settings popup to Insights panel (web-only) Add a gear icon to the AI Insights panel header that opens a settings popup giving web users explicit control over the AI analysis pipeline. Users can now toggle cloud AI (Groq/OpenRouter) and browser local model independently, with a static CTA for Ollama desktop support. - New ai-flow-settings.ts state layer with localStorage persistence - SummarizeOptions param added to generateSummary() (backward-compatible) - InsightsPanel: gear icon, disabled state, generation token for races - AiFlowPopup: toggles, 250MB warning, status footer, Ollama CTA - Remove mlWorker.isAvailable gate in App.ts for cloud-only mode - CSS: popup, toggles, status indicators, disabled state - i18n: 16 new keys across all 17 locale files with translations https://claude.ai/code/session_01AgLDUybKNri83vgZQNC3HF * fix: reset brief cache on settings change, remove dead code in popup - Reset cachedBrief and lastBriefUpdate in onAiFlowChanged() so new provider settings take effect immediately instead of being blocked by the 2-minute cooldown with a stale (possibly null) cached brief - Remove unused isAnyAiProviderEnabled() import and dead `void any` in AiFlowPopup.updateStatus() https://claude.ai/code/session_01AgLDUybKNri83vgZQNC3HF * fix: invalidate insights brief cache on AI flow changes --------- Co-authored-by: Claude <noreply@anthropic.com>

…re source regions (koala73#319) Replace 4 scattered settings UIs (gear popup, panels modal, sources modal, language dropdown) with a single 3-tab modal (General/Panels/Sources). Sources tab features region pills that dynamically adapt per variant: - Full: Worldwide, US, Europe, Middle East, Africa, Latin America, Asia-Pacific, Topical, Intelligence - Tech: Tech News, AI & ML, Startups & VC, Regional Ecosystems, Developer, Cybersecurity, Policy & Research, Media & Podcasts - Finance: Markets & Analysis, Fixed Income & FX, Commodities, Crypto & Digital, Central Banks & Economy, Deals & Corporate, Financial Regulation, Gulf & MENA Also reclassifies full-variant feeds: splits monolithic politics into politics (worldwide), us, and europe; redistributes misplaced sources. Additional fixes: - Variant switcher works on localhost via localStorage (no multiple dev servers) - mapNewsFlash toggle no longer triggers expensive AI re-analysis - Remove dead intel-findings toggle from desktop settings window - LiveNewsPanel uses shared SITE_VARIANT (respects localStorage override)

…3#324) Keyword matching across the geo-tagging pipeline used String.includes() (substring matching), causing false positives like "assad" matching inside "ambassador" and tagging unrelated articles to Syria. Replaced all instances with word-boundary regex (\b...\b) for accurate matching. Also replaced the ambiguous 3-char "hts" keyword (matched "rights", "fights", etc.) with unambiguous "tahrir al-sham" / "hayat tahrir". Fixes koala73#324 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel · 2026-02-24T16:45:44Z

@princelevant is attempting to deploy a commit to the Elie Team on Vercel.

A member of the Team first needs to authorize it.

koala73 · 2026-02-24T17:10:56Z

Lovely
Was on my todo
Thank you
Will review

koala73 · 2026-02-24T17:21:09Z

Plan vs Implementation Review

Thanks for tackling #324! The core goal (fixing substring false positives) is right, but the implementation diverges from the approved plan in ways that introduce new issues. Here's a detailed comparison.

Approach Mismatch

The approved plan uses tokenization-based exact word matching (Set.has()), not \b word-boundary regex. Tokenization was chosen because \b still has edge cases with common English words used as keywords (e.g., 'oil', 'fed', 'house'), while tokenization eliminates ALL substring false positives by design:

"ambassador" → tokens: {"ambassador"} → has("assad")? NO ✓
"Assad regime" → tokens: {"assad","regime"} → has("assad")? YES ✓

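A minimal sketch of the exact-word idea, assuming a simple tokenizer that lowercases and splits on anything outside a-z and 0-9. The `tokenize` helper is illustrative only and is not the `tokenizeForMatch()` from the plan file.

```typescript
// Illustrative tokenizer: lowercase, then split on any run of non-alphanumerics.
function tokenize(title: string): Set<string> {
  const parts = title.toLowerCase().split(/[^a-z0-9]+/).filter((p) => p.length > 0);
  return new Set<string>(parts);
}

const ambassadorTokens = tokenize('French ambassador visits Paris');
const assadTokens = tokenize('Assad regime under pressure');
const rightsTokens = tokenize('Human rights groups protest');
const htsTokens = tokenize('HTS forces advance');

// Whole tokens are compared, so a keyword can never match inside a longer word.
const ambassadorHasAssad = ambassadorTokens.has('assad'); // false
const regimeHasAssad = assadTokens.has('assad');          // true
const rightsHasHts = rightsTokens.has('hts');             // false
const htsHasHts = htsTokens.has('hts');                   // true
```

Because the comparison is on whole tokens, a short keyword such as 'hts' stays usable without a separate minimum-length rule.
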
Issues

| # | Severity | Issue | Detail |
| --- | --- | --- | --- |
| 1 | Critical | Wrong approach | Uses `\b` regex, not tokenization. `\b` still has edge cases with common words like `'oil'`, `'fed'`, `'house'` (still in the DC keyword list at geo.ts:84) |
| 2 | Critical | Removes `'hts'` keyword | Replaced with `'tahrir al-sham'`/`'hayat tahrir'` only. Headlines saying just "HTS" (very common: "HTS forces advance") no longer match Damascus. With tokenization, keeping `'hts'` is safe since `tokens.has('hts')` ≠ `"rights"` |
| 3 | Critical | No regex cache, performance regression | `new RegExp()` created on EVERY call in the hot loop. DeckGLMap: 100 news × 33 hotspots × 8 keywords = 26,400 RegExp allocations per render cycle (runs every few minutes) |
| 4 | High | Changed shared `includesKeyword()` | Modified `analysis-constants.ts:188` which affects `analysis-core.ts:313,347` (correlation/signal generation, not geo-tagging). Plan explicitly creates a separate `src/utils/keyword-match.ts` to avoid regression in non-geo paths |
| 5 | High | No centralized utility | The escape+regex pattern is copy-pasted 10+ times across 8 files. If matching logic changes, every site needs updating again |
| 6 | Medium | Missing files | `tech-hub-index.ts:221` and server-side `get-risk-scores.ts` not updated, so false positives persist there |
| 7 | Medium | `'us '` and `'house'` not fixed | DC hotspot (geo.ts:84) still has `'us '` (trailing space hack for `.includes()`) and standalone `'house'`. With `\b` regex, `\bus \b` may behave unexpectedly |
| 8 | Low | Tests lack integration coverage | All 20 tests are unit tests on the regex function. No tests against actual `inferGeoHubsFromTitle()`, `normalizeCountryName()`, or hotspot matching |
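
Issue 7 can be checked directly. The snippet below is a small illustration (not code from the PR) of how a keyword with a trailing space behaves once it is wrapped in `\b...\b`: the closing `\b` only matches when the character right after the space is a word character.

```typescript
// The 'us ' keyword (with its trailing space) wrapped in word boundaries.
const usPattern = /\bus \b/;

// Matches: the space is followed by the word character "e".
const beforeWord = usPattern.test('the us economy slows');

// No match: the title ends with "us", so there is no trailing space at all.
const endOfTitle = usPattern.test('tariffs imposed by us');

// No match: the space is followed by "-", which is not a word character,
// so there is no boundary at that position.
const beforeDash = usPattern.test('the us - china summit');
```

The keyword therefore matches or misses depending on what follows it in the headline.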

What the PR Gets Right

Correct file coverage for core geo-tagging paths (8 of 10 files)
App.ts:findFlashLocation() included
Solid test cases for the "ambassador"/"assad" false positive
escapeRegex() used consistently for safety
Conflict-topic .includes() in DeckGLMap correctly converted

Recommended Changes

Per the approved plan (/plans/dapper-tinkering-engelbart.md):

Create src/utils/keyword-match.ts with tokenizeForMatch() + matchKeyword() — single source of truth, tokenize once per title then O(1) Set lookups
Keep 'hts' in Damascus keywords — tokenization makes it safe (no "rights" false positive)
Tokenize once per title in hot loops, reuse across all hotspot keyword checks (faster than 26K regex allocations)
Don't touch analysis-constants.ts — isolate geo-matching to avoid blast radius in analysis-core
Add tech-hub-index.ts and get-risk-scores.ts to scope
Remove 'us ' and 'house' from DC hotspot keywords
Add integration tests for inferGeoHubsFromTitle() and normalizeCountryName()

The plan file has the full tokenizeForMatch() and matchKeyword() implementation with contiguous phrase matching for multi-word keywords.
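
The plan file itself is not reproduced in this thread. As a rough sketch of what contiguous phrase matching could look like, assuming a tokenized title that carries a `words` set and an `ordered` token array: the tokenizer `tokenizeSketch` and the body of `matchKeyword` below are assumptions made for illustration, not the plan's actual code.

```typescript
interface TokenizedTitle {
  words: Set<string>; // unique tokens for O(1) single-word lookups
  ordered: string[];  // tokens in title order, used for phrase matching
}

// Simplified tokenizer for this sketch: split on whitespace, trim edge punctuation.
function tokenizeSketch(title: string): TokenizedTitle {
  const ordered = title
    .toLowerCase()
    .split(/\s+/)
    .map((raw) => raw.replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, ''))
    .filter((token) => token.length > 0);
  return { words: new Set<string>(ordered), ordered };
}

// Assumed behaviour: single words use the set, phrases must appear contiguously.
function matchKeyword(tokens: TokenizedTitle, keyword: string): boolean {
  const parts = keyword.toLowerCase().split(/\s+/).filter((p) => p.length > 0);
  const first = parts[0];
  if (first === undefined) return false;
  if (parts.length === 1) return tokens.words.has(first);
  for (let start = 0; start + parts.length <= tokens.ordered.length; start++) {
    let allMatch = true;
    for (let offset = 0; offset < parts.length; offset++) {
      if (tokens.ordered[start + offset] !== parts[offset]) {
        allMatch = false;
        break;
      }
    }
    if (allMatch) return true;
  }
  return false;
}

// Tokenize once, then reuse the result for every keyword check.
const htsTitle = tokenizeSketch('Hayat Tahrir al-Sham seizes Aleppo');
const phraseHit = matchKeyword(htsTitle, 'tahrir al-sham');   // true
const phraseHit2 = matchKeyword(htsTitle, 'hayat tahrir');    // true
const nonContiguous = matchKeyword(htsTitle, 'tahrir aleppo'); // false
const singleMiss = matchKeyword(htsTitle, 'hts');              // false
```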

…73#324) Replace word-boundary regex with tokenization + Set lookups per approved plan: - Create src/utils/keyword-match.ts as single source of truth - Tokenize titles once, O(1) Set.has() per keyword (no RegExp allocations) - Restore 'hts' keyword for Damascus (safe with tokenization) - Revert shared includesKeyword() in analysis-constants.ts - Remove 'us ' trailing-space hack and bare 'house' from DC keywords - Add tech-hub-index.ts to scope (was missing) - Add integration tests for inferGeoHubsFromTitle flow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

princelevant · 2026-02-24T19:29:05Z

Hey @koala73 — this is a great initiative and I'm happy to contribute early on. The impact is huge. Thank you for the quick and prompt responses!

Here's the fix based on your feedback:

Changes in this revision:

Tokenization over regex — replaced all \b regex matching with tokenizeForMatch() + Set.has() lookups per the approved plan. Titles are tokenized once, then O(1) keyword checks — zero RegExp allocations in hot loops
Centralized utility — new src/utils/keyword-match.ts as single source of truth, all 10 files import from it
Restored 'hts' in Damascus keywords — tokenization makes it safe (no more "rights"/"flights" false positives)
Reverted analysis-constants.ts — includesKeyword() back to original, geo-matching is now fully isolated
Added tech-hub-index.ts to scope (was missing)
Removed 'us ' and 'house' from DC hotspot keywords
Integration tests added for the full inferGeoHubsFromTitle flow (41 tests, all passing)

Let me know if anything else needs adjusting. Yalla! 🚀

— KT

koala73 · 2026-02-24T22:37:40Z

Hey @princelevant — great improvement switching to tokenization! The architecture is now aligned with the approved plan: keyword-match.ts as single source of truth, Set.has() for O(1) lookups, contiguous phrase matching for multi-word keywords. Nice work.

One critical issue remaining before we can merge:

🔴 CRITICAL: Possessive forms produce false negatives

The tokenizer splits on /[^a-z0-9'-]+/ — preserving apostrophes within words. This means possessive headlines (extremely common in news) miss genuine matches:

"Assad's forces advance in Idlib"  → token: "assad's" → has("assad") = FALSE ❌
"Iran's nuclear program expands"   → token: "iran's"  → has("iran")  = FALSE ❌
"Putin's war enters new phase"     → token: "putin's" → has("putin") = FALSE ❌
"Trump's tariff plan draws criticism" → token: "trump's" → has("trump") = FALSE ❌

The approved plan specified compound + sub-part decomposition to handle this. After adding each cleaned token, split on /[^a-z0-9]+/ and add the sub-parts:

// Result of tokenizing a title: a set for O(1) lookups plus tokens in title order.
export interface TokenizedTitle {
  words: Set<string>;
  ordered: string[];
}

export function tokenizeForMatch(title: string): TokenizedTitle {
  const lower = title.toLowerCase();
  const words = new Set<string>();
  const ordered: string[] = [];
  for (const raw of lower.split(/\s+/)) {
    // Trim leading and trailing punctuation, keep inner apostrophes and hyphens.
    const cleaned = raw.replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, '');
    if (!cleaned) continue;
    words.add(cleaned);           // "assad's" as compound
    ordered.push(cleaned);
    for (const part of cleaned.split(/[^a-z0-9]+/)) {
      if (part) words.add(part);  // "assad", "s" as sub-parts
    }
  }
  return { words, ordered };
}

This gives tokens.has("assad") === true even when the headline says "Assad's". Same fix covers hyphenated forms like "al-Shabaab" → sub-parts include "shabaab".
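
To see the two behaviours side by side, the sketch below compares a plain split on `/[^a-z0-9'-]+/` with a compact version of the sub-part decomposition. Both helper names are invented for this example, and the first helper assumes the earlier tokenizer amounts to lowercasing plus that split.

```typescript
// Assumed shape of the earlier tokenizer: apostrophes and hyphens stay inside tokens.
function splitKeepingApostrophes(title: string): Set<string> {
  const tokens = title.toLowerCase().split(/[^a-z0-9'-]+/).filter((t) => t.length > 0);
  return new Set<string>(tokens);
}

// Compact variant of the decomposition: keep the compound and add its sub-parts.
function splitWithSubParts(title: string): Set<string> {
  const words = new Set<string>();
  for (const raw of title.toLowerCase().split(/\s+/)) {
    const cleaned = raw.replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, '');
    if (!cleaned) continue;
    words.add(cleaned);
    for (const part of cleaned.split(/[^a-z0-9]+/)) {
      if (part) words.add(part);
    }
  }
  return words;
}

const headline = "Assad's forces advance in Idlib";

// Only "assad's" is present, so the bare keyword is missed.
const missed = splitKeepingApostrophes(headline).has('assad'); // false

// "assad's", "assad" and "s" are all present, so the keyword is found.
const found = splitWithSubParts(headline).has('assad'); // true

// Hyphenated names decompose the same way.
const hyphenated = splitWithSubParts('Al-Shabaab claims attack').has('shabaab'); // true

// Substring false positives stay fixed: "ambassador" has no sub-part "assad".
const stillSafe = splitWithSubParts('French ambassador visits').has('assad'); // false
```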

Please also add test cases for possessives — that's how this slipped through:

it('"assad" matches "Assad\'s forces advance"', () => {
  assert.equal(matchesAnyKeyword("Assad's forces advance in Idlib", ['assad']), true);
});

🟡 Minor items

entity-index.ts still uses new RegExp(\b...\b) — acceptable since it needs match position, but worth a comment explaining the deviation.
PR title & description still reference "word-boundary regex" from commit 1. Update to reflect the tokenization approach.
server/.../get-risk-scores.ts was in the plan (needs inline copy of tokenization) — can be a follow-up PR if you prefer.
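
For the entity-index.ts case, where a match position is needed, one way to keep `\b` regex without allocating a new RegExp per call is a small cache keyed by keyword. This is a hypothetical sketch: `getKeywordRegex` and `findKeywordIndex` are not names from the codebase.

```typescript
// Cache of compiled patterns, one per keyword.
const keywordRegexCache = new Map<string, RegExp>();

function escapeForRegex(keyword: string): string {
  return keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Compile each keyword once. No "g" flag, so exec() keeps no state between calls.
function getKeywordRegex(keyword: string): RegExp {
  let cached = keywordRegexCache.get(keyword);
  if (cached === undefined) {
    cached = new RegExp(`\\b${escapeForRegex(keyword)}\\b`, 'i');
    keywordRegexCache.set(keyword, cached);
  }
  return cached;
}

// Returns the index of the first whole-word match, or -1 if there is none.
function findKeywordIndex(text: string, keyword: string): number {
  const match = getKeywordRegex(keyword).exec(text);
  return match ? match.index : -1;
}

const envoyIndex = findKeywordIndex('French ambassador meets Assad envoy', 'assad'); // 24
const noMatchIndex = findKeywordIndex('French ambassador recalled', 'assad');        // -1
const sameInstance = getKeywordRegex('assad') === getKeywordRegex('assad');          // true
```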

Everything else looks solid — the keyword data fixes in geo.ts, the analysis-constants.ts isolation, tokenize-once-reuse-across-hotspots pattern, and the test coverage. Just need that possessive fix and we're good to go. 👍

koala73 and others added 30 commits February 18, 2026 07:43

fix(i18n): remove dead English fallback literals in CountryIntelModal

d559af5

t() always returns a string (key itself if missing), so || 'English' fallbacks were unreachable dead code.

fix(i18n): remove dead English fallback literals in StoryModal

79f9f49

t() always returns a string, so || 'English' fallbacks were unreachable. Removed all 15 instances.

feat(feeds): add NHK World and Nikkei Asia RSS feeds for Japan coverage

304a04c

Main variant: NHK World + Nikkei Asia in asia category. Finance variant: Nikkei Asia in markets category. Added asia.nikkei.com to RSS proxy allowlist.

feat(feeds): add NHK World and Nikkei Asia RSS feeds for Japan coverage

20d26ee

Main variant: NHK World + Nikkei Asia in asia category. Finance variant: Nikkei Asia in markets category. Added asia.nikkei.com to RSS proxy allowlist.

fix: community widget idempotency guard + suppress missing i18n help …

be2a01d

…keys - CommunityWidget: add DOM check to prevent duplicate widgets on repeated loadNews() calls - RuntimeConfigPanel: compare t() result against key path to suppress missing help translations

chore: bump version to 2.3.9

c798ba9

merge: resolve conflicts with main (community widget, trending keywor…

723a622

…ds, CSS)

docs: add v2.3.9 changelog

9ba2315

feat(i18n): comprehensive localization, RTL support, and regional fee…

1d2300c

…ds revamp

feat: integrate Sentry browser error tracking

a54da82

Initializes @sentry/browser early in main.ts with environment detection (production/preview/development). Disabled on localhost and Tauri desktop. Traces sampled at 10%.

fix: gracefully handle YouTube IFrame API load failure

8dabec9

Resolve instead of reject when the script fails to load (ad blocker, network issue). Guard initializePlayer against missing YT.Player. Prevents noisy unhandled rejection errors in Sentry.

fix: gracefully handle IndexedDB connection-closing errors on iOS

a2ff961

- withTransaction now returns undefined instead of throwing when InvalidStateError persists after retry (transient browser event) - Add .catch() to fire-and-forget cleanOldSnapshots() call

fix(sentry): broaden Failed to fetch filter to catch domain-suffixed …

35cffab

…variants Browser extensions intercept window.fetch causing "Failed to fetch (gamma-api.polymarket.com)" to leak as unhandled rejection. Remove the $ anchor so the pattern matches any suffix.

fix: add console.warn to silent storage catch blocks for diagnosability

df9e834

fix: guard YT player .mute()/.unMute() with optional chaining, filter…

f324652

… WebGL link errors - LiveNewsPanel: player.mute/unMute may not exist before onReady (WORLDMONITOR-16) - main.ts: add /Program failed to link/ noise filter (WORLDMONITOR-18)

fix(sentry): filter maplibre internal null-access crashes (light, pla…

bb10add

…cement)

fix(sentry): narrow shader link filter to null-only to preserve real …

2a53688

…GLSL error signal

fix(sentry): broaden Failed to fetch filter to match domain-suffixed …

daa61c7

…variants

koala73 and others added 22 commits February 24, 2026 05:36

fix: increase live channels window size to fit channel grid (koala73#301

e42f0bb

)

fix: add Greek flag mapping to language selector (koala73#305)

cea43c1

fix: add Greek flag mapping to language selector (koala73#307)

8985ff0

fix: open channel settings as inline modal instead of separate window (…

b1129d9

…koala73#311)

feat: add Bild RSS feed scoped to German locale (koala73#312)

ad2bd60

fix: add drag cleanup handlers and suppress click after drag-drop (ko…

0dd2494

…ala73#315) - Add panelDragCleanupHandlers to remove document listeners on destroy - Suppress channel click/edit after drag-end to prevent accidental actions

feat: add Island Times (Palau) RSS feed for Asia Pacific coverage (ko…

2fef4cb

…ala73#317) Adds islandtimes.org/feed/ to the asia region feeds and allowlists the domain in the RSS proxy.

chore: remove unused WORLDPOP_API_KEY from .env.example (koala73#318)

ddee84e

koala73 assigned princelevant Feb 26, 2026

koala73 force-pushed the main branch from cc2088a to 74de5f3 on February 27, 2026 14:14

princelevant wants to merge 1184 commits into koala73:main from princelevant:fix/geo-tagging-substring-matching

15 participants
