Merged
4 changes: 3 additions & 1 deletion README.md
@@ -66,9 +66,11 @@ Each language folder in `webapp/data/languages/` contains a `SOURCES.md` with de

## TODO

- - [ ] Word definitions — show the definition of the daily word after the game (e.g. via Wiktionary API)
+ - [x] Word definitions — show the definition of the daily word after the game (via Wiktionary API)
- [ ] Native speaker review of daily word lists for remaining languages
- [ ] Consolidate per-language data files — there are currently 3 overlapping mechanisms controlling daily word selection: `_daily_words.txt` (curated subset), `_blocklist.txt` (exclusion list), and `_curated_schedule.txt` (day-by-day override), plus the fallback to `_5words.txt`. These could be unified into a single curated daily list per language.
+ - [ ] User accounts — persistent game history, cross-device sync, leaderboards (passkeys / magic links / OAuth TBD)
+ - [ ] Comments on daily word pages — community discussion, word quality feedback (Giscus MVP, custom system later with accounts)

## Credits

76 changes: 76 additions & 0 deletions docs/CURATED_WORDS.md
@@ -0,0 +1,76 @@
# Curated Words Registry

**DO NOT regenerate these word lists from the data pipeline without preserving manual curation.**

This file tracks which languages have been manually curated and should not be overwritten by running `scripts/languages.ipynb`.

## How Curation Works

1. The words scheduled for the next 365 days are extracted using the daily word algorithm.
2. An LLM or a native speaker reviews those words for quality issues.
3. Bad words are removed from the word list or reordered within it.
4. The language is marked as "curated" in the table below, together with the curation date.

## Protected Languages

When running the data pipeline, **skip these languages** or merge changes carefully:

| Language | Code | Curated Date | Curator | Notes |
|----------|------|--------------|---------|-------|
| Bulgarian | bg | 2026-01-25 | Claude | Removed 728 proper nouns |
| Turkish | tr | 2026-01-25 | Claude | Removed 21 names/places, blocklist created |
| Hungarian | hu | 2026-01-25 | Claude | Removed 39 names/foreign words, blocklist created |
| Arabic | ar | 2026-02-23 | Script | Char difficulty filter (3%): removed 212 words with rare chars, 1,788 daily words |
| Hebrew | he | 2026-02-23 | Script | Suffix dedup + wordfreq filter: 1,442 blocklist additions, 1,000 daily words (100% wordfreq-verified) |

## Curation Checklist

Before marking a language as curated, verify:

- [ ] Removed proper nouns (names, places, brands)
- [ ] Removed obscure/archaic words
- [ ] Removed offensive words
- [ ] Removed inflected forms (conjugations, declensions) unless they are common
- [ ] Removed borrowed words that don't fit the language
- [ ] Verified next 365 words are all reasonable daily words
- [ ] Tests pass: `pytest tests/test_word_lists.py -v -k "{lang}"`

## How to Safely Update a Curated Language

If you need to regenerate a curated language's word list:

1. **Export current curated list**:
```bash
python scripts/curate_words.py backup {lang}
```

2. **Run the pipeline** to get new words

3. **Apply blocklist** to remove known bad words:
```bash
python scripts/curate_words.py apply-blocklist {lang}
```

4. **Merge carefully**:
- Keep the curated word order for indices 0 to current_index + 365
- Append new words that weren't in the old list
- Remove words that were deliberately deleted

5. **Update this file** with the new curation date
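
The merge in step 4 could be sketched roughly as follows. This is only an illustration, not the logic of `scripts/curate_words.py`: the function name `mergeCuratedList` and its parameters are hypothetical. It assumes that curated words beyond the protected window may be replaced by whatever the pipeline regenerates.

```typescript
// Hypothetical merge helper; the real curation script may work differently.
function mergeCuratedList(
  curated: string[], // backed-up curated list, in curated order
  regenerated: string[], // fresh pipeline output
  removed: Set<string>, // words deliberately deleted during curation
  currentIndex: number, // index of today's word
  horizon = 365 // number of upcoming days whose order is protected
): string[] {
  // Keep the curated order for indices 0 .. currentIndex + horizon (inclusive).
  const protectedEnd = Math.min(curated.length, currentIndex + horizon + 1);
  const merged = curated.slice(0, protectedEnd);
  const seen = new Set<string>(merged);

  // Append regenerated words that are not already kept and were not
  // deliberately deleted.
  for (const word of regenerated) {
    if (seen.has(word) || removed.has(word)) continue;
    merged.push(word);
    seen.add(word);
  }
  return merged;
}

// Tiny usage example with a 1-day horizon instead of 365.
const merged = mergeCuratedList(
  ['a', 'b', 'c', 'd'],
  ['d', 'x', 'b', 'bad', 'y'],
  new Set(['bad']),
  1,
  1
);
// merged is ['a', 'b', 'c', 'd', 'x', 'y']
```

The protected prefix is copied verbatim, so the order of words that users have already seen, or will see within the horizon, never changes.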

## Blocklist Files

Each curated language can have a `{lang}_blocklist.txt` file containing words to automatically exclude.
Format: one word per line, comments start with `#`.
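
As an illustration of the format, a minimal parser might look like the sketch below. The helper names `parseBlocklist` and `applyBlocklist` are hypothetical. The sketch assumes that comments occupy whole lines and that blank lines are ignored; the actual script may handle these cases differently.

```typescript
// Hypothetical helper: parse the text of a `{lang}_blocklist.txt` file.
// Assumes one word per line; lines beginning with `#` are treated as comments.
function parseBlocklist(text: string): Set<string> {
  const words = new Set<string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.charAt(0) === '#') continue; // skip blanks and comments
    words.add(line);
  }
  return words;
}

// Hypothetical helper: drop every blocked word while preserving list order.
function applyBlocklist(wordList: string[], blocked: Set<string>): string[] {
  return wordList.filter((word) => !blocked.has(word));
}

const blocklist = parseBlocklist('# proper nouns\nsofia\n\nvarna\n');
const filtered = applyBlocklist(['apple', 'sofia', 'bread', 'varna'], blocklist);
// filtered is ['apple', 'bread']
```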

To apply all blocklists after regenerating:
```bash
python scripts/curate_words.py apply-all-blocklists
```

## Word Index Reference

Current word index (Jan 2026): ~1681
- Words 0-1680: Already shown to users
- Words 1681-2046: Next year (must be curated)
- Words 2047+: Future (can be regenerated)
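
The three ranges above follow directly from the current index. A minimal sketch, assuming the daily index advances by one per day and that the protected window runs up to and including `current_index + 365` (the function and field names are hypothetical):

```typescript
interface IndexRanges {
  shown: [number, number]; // inclusive range already shown to users
  mustCurate: [number, number]; // inclusive range that must be curated
  regenerableFrom: number; // first index that can safely be regenerated
}

// Hypothetical helper: derive the ranges from the current word index.
// When currentIndex is 0, `shown` is the empty range [0, -1].
function wordIndexRanges(currentIndex: number, horizon = 365): IndexRanges {
  return {
    shown: [0, currentIndex - 1],
    mustCurate: [currentIndex, currentIndex + horizon],
    regenerableFrom: currentIndex + horizon + 1,
  };
}

const ranges = wordIndexRanges(1681);
// ranges.shown is [0, 1680], ranges.mustCurate is [1681, 2046],
// ranges.regenerableFrom is 2047
```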
16 changes: 14 additions & 2 deletions frontend/src/definitions.ts
@@ -104,14 +104,26 @@ export function renderDefinitionCard(
* The image loads directly via GET — if it 404s or fails, the container is hidden.
* If the image isn't cached, the server generates it (may take 15-20s).
*/
- export function renderWordImage(word: string, lang: string, container: HTMLElement): void {
+ export function renderWordImage(
+ word: string,
+ lang: string,
+ container: HTMLElement,
+ linkUrl?: string
+ ): void {
const url = `/${lang}/api/word-image/${encodeURIComponent(word)}`;
const img = document.createElement('img');
img.className = 'w-full max-h-48 object-contain rounded-lg';
img.alt = word;
img.onload = () => {
container.innerHTML = '';
- container.appendChild(img);
+ if (linkUrl) {
+ const a = document.createElement('a');
+ a.href = linkUrl;
+ a.appendChild(img);
+ container.appendChild(a);
+ } else {
+ container.appendChild(img);
+ }
container.style.display = 'block';
};
img.onerror = () => {
26 changes: 24 additions & 2 deletions frontend/src/game.ts
@@ -116,6 +116,9 @@ interface GameData {
total_stats: TotalStats;
languages: Record<string, LanguageInfo>;
shareButtonState: 'idle' | 'success';
+ communityPercentile: number | null;
+ communityTotal: number;
+ communityStatsLink: string | null;
}

export const createGameApp = () => {
@@ -299,6 +302,9 @@ export const createGameApp = () => {
n_losses: 0,
},
languages: {},
+ communityPercentile: null,
+ communityTotal: 0,
+ communityStatsLink: null,
};
},

@@ -1283,7 +1289,8 @@ export const createGameApp = () => {
const imageContainer = document.getElementById('word-image-card');
if (imageContainer) {
showImageLoading(imageContainer);
- renderWordImage(this.todays_word, langCode, imageContainer);
+ const wordPageUrl = `/${langCode}/word/${this.todays_idx}`;
+ renderWordImage(this.todays_word, langCode, imageContainer, wordPageUrl);
}
}
},
@@ -1302,7 +1309,22 @@ export const createGameApp = () => {
attempts: typeof attempts === 'number' ? attempts : 0,
won,
}),
- }).catch(() => {}); // Fire and forget
+ })
+ .then((resp) => (resp.ok ? resp.json() : null))
+ .then((stats) => {
+ if (!stats || !stats.total || !won) return;
+ const playerAttempts = typeof attempts === 'number' ? attempts : 7;
+ let worsePlayers = stats.losses || 0;
+ for (let i = playerAttempts + 1; i <= 6; i++) {
+ worsePlayers += stats.distribution?.[String(i)] || 0;
+ }
+ this.communityPercentile = Math.round(
+ (worsePlayers / stats.total) * 100
+ );
+ this.communityTotal = stats.total;
+ this.communityStatsLink = `/${langCode}/word/${dayIdx}`;
+ })
+ .catch(() => {});
} catch {
// Ignore errors
}
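
The percentile logic added in this hunk can be read in isolation as the standalone sketch below. The `CommunityStats` interface and the function name are hypothetical; the field names `total`, `losses`, and `distribution` are taken from the diff, and the response shape is otherwise an assumption.

```typescript
// Assumed shape of the stats response; only these three fields are used.
interface CommunityStats {
  total: number;
  losses?: number;
  distribution?: Record<string, number>; // attempts ("1".."6") -> player count
}

// Share of players (in percent, rounded) who did worse than a winning player:
// everyone who lost plus everyone who needed more attempts.
function communityPercentile(stats: CommunityStats, attempts: number): number | null {
  if (!stats.total) return null; // no data yet
  let worsePlayers = stats.losses ?? 0;
  for (let i = attempts + 1; i <= 6; i++) {
    worsePlayers += stats.distribution?.[String(i)] ?? 0;
  }
  return Math.round((worsePlayers / stats.total) * 100);
}

const sample: CommunityStats = {
  total: 100,
  losses: 10,
  distribution: { '1': 5, '2': 10, '3': 25, '4': 30, '5': 15, '6': 5 },
};
const percentile = communityPercentile(sample, 3);
// worse players: 10 losses + 30 + 15 + 5 = 60, so percentile is 60
```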
172 changes: 12 additions & 160 deletions frontend/src/index-app.ts
@@ -26,31 +26,6 @@ interface GameResult {
date: string;
}

- interface LanguageStats {
- n_wins: number;
- n_losses: number;
- n_games: number;
- n_attempts: number;
- avg_attempts: number;
- win_percentage: number;
- longest_streak: number;
- current_streak: number;
- }
-
- interface TotalStats {
- total_games: number;
- game_stats: Record<string, LanguageStats>;
- languages_won: string[];
- total_win_percentage: number;
- longest_overall_streak: number;
- current_overall_streak: number;
- longest_language_streak: number;
- current_longest_language_streak: number;
- current_longest_language_streak_language: string;
- n_victories: number;
- n_losses: number;
- }
-
// Extend Window interface for homepage globals
declare global {
interface Window {
@@ -74,7 +49,6 @@ export default function createIndexApp(): App {
return {
showPopup: false,
showAboutModal: false,
- showStatsModal: false,
showSettingsModal: false,
clickedLanguage: '',
darkMode: document.documentElement.classList.contains('dark'),
@@ -90,9 +64,7 @@ export default function createIndexApp(): App {
languages_vis: [] as Language[],

search_text: '',
- total_stats: {} as TotalStats,
game_results: {} as Record<string, GameResult[]>,
- expandedLanguage: '' as string, // For stats modal expansion
detectedLanguage: null as Language | null,
};
},
@@ -121,7 +93,6 @@ export default function createIndexApp(): App {
// Load preferences
this.loadFeedbackPreference();

- this.total_stats = this.calculateTotalStats();
// Initialize languages with recently played first
this.languages_vis = this.getSortedLanguages();
// Detect browser language for hero CTA
@@ -139,7 +110,6 @@ export default function createIndexApp(): App {
methods: {
keyDown(event: KeyboardEvent): void {
if (event.key === 'Escape') {
- this.showStatsModal = false;
this.showAboutModal = false;
this.showSettingsModal = false;
}
@@ -319,21 +289,21 @@ export default function createIndexApp(): App {
},

getCurrentStreak(language_code: string): number {
- const stats = this.total_stats.game_stats?.[language_code];
- return stats?.current_streak ?? 0;
+ const results = this.game_results[language_code];
+ if (!results) return 0;
+ let streak = 0;
+ for (let i = results.length - 1; i >= 0; i--) {
+ if (results[i].won) streak++;
+ else break;
+ }
+ return streak;
},

getWinRate(language_code: string): number {
- const stats = this.total_stats.game_stats?.[language_code];
- return stats?.win_percentage ?? 0;
- },
-
- toggleLanguageExpansion(language_code: string): void {
- if (this.expandedLanguage === language_code) {
- this.expandedLanguage = '';
- } else {
- this.expandedLanguage = language_code;
- }
+ const results = this.game_results[language_code];
+ if (!results || results.length === 0) return 0;
+ const wins = results.filter((r: GameResult) => r.won).length;
+ return (wins / results.length) * 100;
},

filterWordles(search_text: string): void {
@@ -368,124 +338,6 @@ export default function createIndexApp(): App {
this.languages_vis = visible_languages;
}
},

- calculateStats(language_code: string): LanguageStats {
- const results = this.game_results[language_code];
- if (!results) {
- return {
- n_wins: 0,
- n_losses: 0,
- n_games: 0,
- n_attempts: 0,
- avg_attempts: 0,
- win_percentage: 0,
- longest_streak: 0,
- current_streak: 0,
- };
- }
-
- let n_wins = 0;
- let n_losses = 0;
- let n_attempts = 0;
- let current_streak = 0;
- let longest_streak = 0;
-
- for (const result of results) {
- if (result.won) {
- n_wins++;
- current_streak++;
- longest_streak = Math.max(longest_streak, current_streak);
- } else {
- n_losses++;
- current_streak = 0;
- }
- n_attempts += result.attempts;
- }
-
- const total = n_wins + n_losses;
- return {
- n_wins,
- n_losses,
- n_games: results.length,
- n_attempts,
- avg_attempts: results.length > 0 ? n_attempts / results.length : 0,
- win_percentage: total > 0 ? (n_wins / total) * 100 : 0,
- longest_streak,
- current_streak,
- };
- },
-
- calculateTotalStats(): TotalStats {
- let n_victories = 0;
- let n_losses = 0;
- let current_overall_streak = 0;
- let longest_overall_streak = 0;
- let longest_language_streak = 0;
- let current_longest_language_streak = 0;
- let current_longest_language_streak_language = '';
- const languages_won: string[] = [];
- const game_stats: Record<string, LanguageStats> = {};
-
- // Collect and sort all results by date
- const all_results: (GameResult & { language?: string })[] = [];
- for (const [language_code, results] of Object.entries(this.game_results) as [
- string,
- GameResult[],
- ][]) {
- for (const result of results) {
- all_results.push({ ...result, language: language_code });
- }
- }
- all_results.sort((a, b) => new Date(a.date).getTime() - new Date(b.date).getTime());
-
- // Calculate overall streaks
- for (const result of all_results) {
- if (result.won) {
- n_victories++;
- current_overall_streak++;
- longest_overall_streak = Math.max(
- longest_overall_streak,
- current_overall_streak
- );
- } else {
- n_losses++;
- current_overall_streak = 0;
- }
- }
-
- // Calculate per-language stats
- for (const language_code of Object.keys(this.game_results)) {
- const stats = this.calculateStats(language_code);
- game_stats[language_code] = stats;
-
- if (stats.n_wins > 0) {
- languages_won.push(language_code);
- }
- longest_language_streak = Math.max(
- longest_language_streak,
- stats.longest_streak
- );
- if (stats.current_streak > current_longest_language_streak) {
- current_longest_language_streak = stats.current_streak;
- current_longest_language_streak_language = language_code;
- }
- }
-
- const total_games = n_victories + n_losses;
- return {
- total_games,
- game_stats,
- languages_won,
- total_win_percentage: total_games > 0 ? (n_victories / total_games) * 100 : 0,
- longest_overall_streak,
- current_overall_streak,
- longest_language_streak,
- current_longest_language_streak,
- current_longest_language_streak_language,
- n_victories,
- n_losses,
- };
- },
},
});
}