Actual Esperanto#134

Merged
Hugo0 merged 9 commits into Hugo0:main from HaleyHalcyon:main
Feb 26, 2026

Contributor

@HaleyHalcyon HaleyHalcyon commented Feb 25, 2026

You seem to have fundamentally misunderstood Issue #102. The problem is not that just one word is wrong (webapp/data/languages/eo/eo_blocklist.txt), but that the entire dictionary is wrong!

Here are the first 20 words of the current dictionary for the Esperanto version. None of these are valid Esperanto words, as a dictionary search shows:

  • ildik: no results
  • fagot: no results; fagoto = bassoon
  • ĉeest: no results; ĉeesti = to attend an event
  • altdi: no results
  • konik: no results; koniko = cone
  • bigot: no results; bigoto = bigot
  • grunt: no results; grunti = to grunt, like a pig
  • enbat: no results; enbati = to beat into (e.g. to beat an idea into someone's head)
  • bemol: no results; bemolo = [music] flat
  • ĥalde: no results; ĥaldeo = [historical] Chaldean (person)
  • batak: no results
  • ŝafid: no results; ŝafido = lamb; sheep's offspring
  • brems: no results; bremso = brake; [computing] the break key
  • remiz: no results; remizo = depot; carriage house
  • ventr: no results; ventro = belly
  • minut: no results; minuto = minute (of time)
  • manat: no results; manato = manatee
  • hagen: no results
  • vinĉj: no results; probably a proper name, as -ĉjo is the male diminutive ending
  • elrab: no results

This pull request replaces the entire dictionary using the data I've collected and processed for my browser game Intervorto.

This pull request also replaces the on-screen keyboard with the de-facto standard ŜĜERTŬ keyboard layout (though the positions of Ĥ and Ĵ are not quite universal between users/systems in real life).

Summary by CodeRabbit

  • Documentation

    • Updated Esperanto language source documentation with consolidated references.
  • Improvements

    • Refreshed Esperanto daily word list for enhanced gameplay.
    • Reorganized Esperanto keyboard layout for better accessibility.
    • Refined Esperanto interface translations and text clarity.
    • Added official language code metadata for Esperanto support.
    • Updated word filtering system.


coderabbitai bot commented Feb 25, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 49f7693 and 70821d9.

📒 Files selected for processing (5)
  • webapp/data/languages/eo/SOURCES.md
  • webapp/data/languages/eo/eo_5words.txt
  • webapp/data/languages/eo/eo_5words_supplement.txt
  • webapp/data/languages/eo/eo_daily_words.txt
  • webapp/data/languages/eo/language_config.json
Walkthrough

This pull request updates Esperanto language support by replacing the daily word list with a frequency-based set, removing a blocklist entry, rearranging the keyboard layout, refactoring source documentation, and updating UI text with improved Esperanto phrasing and new language code identifiers.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Word data files: webapp/data/languages/eo/eo_daily_words.txt, webapp/data/languages/eo/eo_blocklist.txt | Complete replacement of daily word list with 988 new frequency-ranked words (removing 1988 entries); single blocklist entry ("arnik") removed. |
| Configuration and UI: webapp/data/languages/eo/language_config.json, webapp/data/languages/eo/eo_keyboard.json | UI text trimmed and improved with corrected Esperanto phrasing; language code fields added ("epo"); keyboard layout rows rearranged to reposition special characters and diacritics. |
| Documentation: webapp/data/languages/eo/SOURCES.md | Narrative restructured to consolidate word source and frequency data descriptions; author attribution added for Haley Wakamatsu and processing provenance documented; licensing unified under CC-BY-SA 4.0. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Esperanto hops with joy, words now ranked by frequency's choice!
New keyboard dances, layouts prance, while UI text finds brighter voice.
From arnik gone to sources shown, this lexicon's a work well-known! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Actual Esperanto' directly captures the main objective: replacing invalid Esperanto word data with correct, validated entries from actual Esperanto sources. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
webapp/data/languages/eo/SOURCES.md (1)

13-17: Consider keeping heading levels consistent under “Frequency data.”

## Modifications reads as a higher-level section than ### Frequency data; making it a subsection would improve doc structure.

🧹 Suggested doc-structure tweak
 ### Frequency data
 
-## Modifications
+#### Modifications
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@webapp/data/languages/eo/SOURCES.md` around lines 13 - 17, Change the heading
level for "Modifications" to match "Frequency data" by replacing the "##
Modifications" line with "### Modifications" so it becomes a subsection under
"### Frequency data"; update any adjacent headings if needed to keep the
document hierarchy consistent (refer to the headings "Frequency data" and
"Modifications" in the SOURCES.md content).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@webapp/data/languages/eo/eo_daily_words.txt`:
- Line 386: Remove the non-Esperanto and improper proper-name entries from the
Esperanto daily word list: delete "times" and the proper names "trump", "putin",
"david", and "blair" from eo_daily_words.txt; keep the validated Esperanto roots
like "labor", "sankt", and "milit" as-is. Ensure the file contains only valid
Esperanto words/roots (per ReVo) and update any tests or lookup logic that
assumed those removed entries are present.

In `@webapp/data/languages/eo/language_config.json`:
- Line 16: Replace the non-standard Esperanto clipboard term in the localization
value for the key "notification-copied": change the string from "Kopiita al
poŝo" to the standard "Kopiita al tondujo" so the UI uses the accepted term for
"clipboard" (update the value associated with "notification-copied" in the
language_config JSON).

In `@webapp/data/languages/eo/SOURCES.md`:
- Line 18: Fix the typo in the SOURCES.md entry for `eo_daily_words.txt`:
replace the word "frequest" with "frequent" in the line that reads
"`eo_daily_words.txt`: Top 999 most frequest words" so it correctly reads
"`eo_daily_words.txt`: Top 999 most frequent words"; update that exact string in
the file.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99ea41c and 49f7693.

📒 Files selected for processing (7)
  • webapp/data/languages/eo/SOURCES.md
  • webapp/data/languages/eo/eo_5words.txt
  • webapp/data/languages/eo/eo_5words_supplement.txt
  • webapp/data/languages/eo/eo_blocklist.txt
  • webapp/data/languages/eo/eo_daily_words.txt
  • webapp/data/languages/eo/eo_keyboard.json
  • webapp/data/languages/eo/language_config.json
💤 Files with no reviewable changes (1)
  • webapp/data/languages/eo/eo_blocklist.txt

- Remove 10 non-Esperanto words containing x/y/w (yahoo, index, world, etc.)
- Remove proper names from main and daily lists (trump, putin, david, blair, times, korea, china)
- Remove 2,772 duplicate words from supplement (was identical to main list)
- Remove 83 supplement words with invalid characters (x/y/w)
- Fix clipboard translation: poŝo → tondujo (standard Esperanto)
- Fix SOURCES.md: typo (frequest→frequent), heading hierarchy, updated counts
Owner

@Hugo0 Hugo0 left a comment

Review: Data cleanup needed to pass CI

Great work on the ReVo-based word list! The ŜĜERTŬ keyboard and improved translations are solid. There are a few data issues causing the 5 CI test failures — all straightforward to fix:

1. Remove 10 non-Esperanto words from eo_5words.txt

These contain x, y, or w which aren't in the Esperanto alphabet:

yahoo, index, world, watch, party, royal, power, woods, emysl, young
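
A check like this one can be scripted. The following is a minimal sketch, not the project's actual test code: the names `ESPERANTO_LETTERS`, `uses_only_esperanto_letters`, and `partition_by_alphabet` are hypothetical, and it assumes the word lists are plain lowercase strings, one word each.

```python
import unicodedata

# The 28 letters of the Esperanto alphabet; q, w, x and y are not among them.
ESPERANTO_LETTERS = frozenset("abcĉdefgĝhĥijĵklmnoprsŝtuŭvz")


def uses_only_esperanto_letters(word: str) -> bool:
    """Return True if every character in `word` is an Esperanto letter."""
    # NFC folds "s" + combining circumflex into the single code point "ŝ".
    normalized = unicodedata.normalize("NFC", word).lower()
    return bool(normalized) and all(ch in ESPERANTO_LETTERS for ch in normalized)


def partition_by_alphabet(words):
    """Split words into (kept, rejected) while preserving input order."""
    kept, rejected = [], []
    for word in words:
        (kept if uses_only_esperanto_letters(word) else rejected).append(word)
    return kept, rejected


# The ten words flagged in this review, plus three others from the thread.
flagged = ["yahoo", "index", "world", "watch", "party",
           "royal", "power", "woods", "emysl", "young"]
kept, rejected = partition_by_alphabet(flagged + ["labor", "milit", "trump"])
print(kept)      # ['labor', 'milit', 'trump']
print(rejected)  # the ten flagged words, in order
```

Note that `trump` passes this filter because it is spelled with Esperanto letters only, so a purely alphabetic check cannot catch proper names; those still need an explicit removal list.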

2. Remove proper names/English words from eo_5words.txt and eo_daily_words.txt

trump, putin, david, blair, times, korea, china

(china is only in main, the other 6 are in both)

3. Deduplicate supplement

All 2,772 main words appear in the supplement — supplement should only contain words not in the main list. Also remove 83 supplement words with x/y/w. After cleanup: 4,343 supplement words.
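
The supplement cleanup described here could be sketched roughly as follows. This is an illustration under assumptions rather than the script actually used: `clean_supplement` is a hypothetical helper, and the sample words are made up for the demo, not taken from the real files.

```python
def clean_supplement(main_words, supplement_words, banned_letters="xyw"):
    """Return supplement words that are not in the main list, contain no
    banned letters, and are not repeated, preserving their original order."""
    main = set(main_words)
    banned = set(banned_letters)
    seen = set()
    cleaned = []
    for word in supplement_words:
        if word in main or word in seen or banned & set(word):
            continue
        seen.add(word)
        cleaned.append(word)
    return cleaned


# Made-up sample data for illustration only.
main = ["labor", "milit", "sankt"]
supplement = ["labor", "tondi", "milit", "taxon", "tondi", "poŝoj"]
print(clean_supplement(main, supplement))  # ['tondi', 'poŝoj']
```

Keeping the original order (instead of using a plain set difference) makes the resulting diff of the supplement file easier to review.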

4. Minor fixes

  • language_config.json: "Kopiita al poŝo" → "Kopiita al tondujo" (standard Esperanto for clipboard)
  • SOURCES.md: frequest → frequent, fix heading hierarchy (### Frequency data is empty and orphaned; move content under it)
  • SOURCES.md: Update word counts to match cleaned files (2,755 main, 993 daily, 4,343 supplement)
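
The main and daily counts can be cross-checked against the removals listed in this review. The sketch below is only arithmetic on numbers quoted in the thread; it assumes the daily list started at the 999 words mentioned in the SOURCES.md line quoted by the bot review, and that none of the ten x/y/w words were in the daily list.

```python
# Figures quoted in the review thread.
main_before = 2772          # main words (all of them also appeared in the supplement)
xyw_removed_main = 10       # yahoo, index, world, ...
names_removed_main = 7      # trump, putin, david, blair, times, korea, china
daily_before = 999          # assumption: "Top 999" from the quoted SOURCES.md line
names_removed_daily = 6     # the same names minus china, which was only in main

main_after = main_before - xyw_removed_main - names_removed_main
daily_after = daily_before - names_removed_daily

print(main_after, daily_after)  # 2755 993
```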

After these fixes

All 17 Esperanto tests pass ✅ (946 total tests pass, 0 failures)

I've prepared these fixes on branch HaleyHalcyon/main in the upstream repo if you'd like to cherry-pick: 70821d9

Owner

@Hugo0 Hugo0 left a comment

Update: I've pushed the fixes directly to this PR branch (commit 70821d9). CI should re-run and pass now. No action needed from you!

@Hugo0 Hugo0 merged commit 3bc02d7 into Hugo0:main Feb 26, 2026
4 checks passed
Owner

Hugo0 commented Feb 26, 2026

@HaleyHalcyon awesome pr. fixed it up a bit and merged.
