Skip to content

Cards Against Humanity: Judging Improvements, Multiple Scoring Systems, Game Tests#242

Open
jage9 wants to merge 14 commits intoXGDevGroup:mainfrom
jage9:cah
Open

Cards Against Humanity: Judging Improvements, Multiple Scoring Systems, Game Tests#242
jage9 wants to merge 14 commits intoXGDevGroup:mainfrom
jage9:cah

Conversation

@jage9
Copy link
Copy Markdown
Collaborator

@jage9 jage9 commented Apr 19, 2026

This PR is a complete overhaul of the Humanity Cards implementation. It fixes a range of bugs found during code
review, brings the game into line with conventions used across the rest of the codebase, adds the judging method
option from the design spec, and ships a clean test suite.


Bug Fixes

  • Localization errors — raw speak() calls replaced with speak_l() throughout; a broken ("key", {dict}) tuple removed
    from _is_submit_enabled (the framework only accepts str | None)
  • Duplicate view_scores action — removed; unified under the global check_scores keybind
  • Orphaned FTL keys — cleaned up keys that no longer had corresponding code
  • Submission progress broadcast — removed the hc-submission-progress speech broadcast; the submit sound is sufficient
    feedback
  • Multi-judge waiting — judging phase now waits for all judges to vote before resolving; judge_picks: dict[str, str]
    tracks each judge's vote independently
  • "Judge voted" broadcast — removed hc-judge-voted speech; sound only
  • "Other submissions" heading — only broadcast when there are actual losing submissions to reveal
  • Double period — winner announcement strips a trailing . from filled card text before the FTL sentence period is
    appended
  • Winner announcement — removed the redundant Score: X suffix
  • Judge announcement grammar — _format_names helper now produces "A and B" or "A, B, and C" with Oxford comma;
    replaced the $player/$others FTL variables with a single $names

Multi-Judge & All-Judge Mode

  • _select_judges cap raised from active - 1 to active, so num_judges == num_players now works correctly
  • When all players are judges, everyone submits and judges in the same round — all submission, toggle, and view action
    guards updated accordingly
  • Self-voting blocked at three levels: _is_judge_pick_hidden, _is_judge_pick_enabled, and the _judge_pick handler
  • Bot judges skip their own submission in all-judge mode
  • Czar announcements suppressed when all players are judges

Judging Methods (cah_options.md Part 4)

┌─────────────┬──────────────────────────────────────────────────────────────────────────────────┐
│ Method │ Behaviour │
├─────────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ Independent │ Each judge's vote awards 1 point; multiple players can score the same round │
├─────────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ Jury │ Majority wins 1 point; tied players each receive 1 point │
├─────────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ Random │ Resolved once at judging start so the method stays consistent for the full round │
└─────────────┴──────────────────────────────────────────────────────────────────────────────────┘

Single-judge games always use Independent regardless of the setting.


whose_turn Fixes

  • Submitting phase — lists players who have not yet submitted
  • Judging phase — lists judges who have not yet voted (previously showed stale submission state)

Codebase Consistency

  • Scoring — replaced custom _action_check_scores / _action_check_scores_detailed overrides with
    team_manager.setup_teams("individual") + add_to_team_score, exactly as Farkle, Crazy Eights, and every other scoring
    game in the repo. S and Shift+S now go through the standard base class path.
  • Bot logic — extracted to bot.py (bot_think / _think_submitting / _think_judging), matching the structure used by
    Farkle and Crazy Eights. _process_judging_bots now uses BotHelper.process_bot_action instead of manual
    tick/pending-action management.

Tests

52 outcome-focused tests covering: startup, judge selection (all 3 czar modes), deck reshuffle, card toggling,
submission validation, judging resolution, win condition, score display, round transitions, multi-judge voting, all
three judging methods, all-judge mode, self-vote prevention, grammar formatting, and two full bot game runs.

jage9 and others added 14 commits April 19, 2026 00:45
Replace raw user.speak() with localized speak_l("hc-score-line") in
both _action_view_scores and _action_check_scores. Wire up the existing
but unused hc-you-are-judge FTL key so each judge hears a personal
announcement alongside the table broadcast.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers: metadata, startup, judge selection (rotating/random/winner),
deck reshuffle, card toggling, submission validation, judging phase,
win condition, score display (speak_l fix), judge personal announcement
(hc-you-are-judge fix), round transitions, and bot game completion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted custom view_scores action, its keybind, and handler — inherited
check_scores from the base game does the same thing. Renamed the shared
enabled-check to _is_playing_enabled. Updated tests to use check_scores.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…aned FTL keys

_action_check_scores_detailed was using a hardcoded English f-string;
replaced with Localization.get("hc-score-line"). Restored the enabled
callbacks (_is_check_scores_enabled, _is_check_scores_detailed_enabled)
to bypass the base class team_manager check — CAH uses player.score
directly. Removed orphaned FTL keys: hc-view-scores, hc-no-scores,
hc-waiting-for-submissions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…udge voting

- Remove parametrized tuple from _is_submit_enabled — framework doesn't
  support it; action handler already validates and speaks hc-wrong-card-count
- Remove hc-submission-progress broadcast — sound-only feedback sufficient
- Rewrite _judge_pick for multi-judge: record each judge's vote in
  judge_picks, wait for all judges, then tally votes (most wins, tiebreak
  by earliest submission_order position). Judges can no longer re-vote.
- Add hc-judge-voted FTL key for intermediate broadcast when waiting
- Add 8 new tests covering multi-judge flow and fixed bugs (60 total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Independent: each judge's vote awards 1 point to their chosen submission;
all vote-getters score, announced highest-first with "gets N points for X".
Jury: majority wins; tied submissions each receive 1 point.
Random: method resolved randomly at judging start each round.
Enforces Independent when num_judges <= 1 per spec.

active_judging_method stored on game state so Random resolves consistently
mid-round. Refactored _judge_pick resolution into _resolve_independent,
_resolve_jury, _announce_and_score_winners, _announce_losing_submissions,
_finish_round. Updated hc-winner-announcement FTL to include points and
submission text. Adds 10 new tests (66 total).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…uction

- Remove hc-judge-voted broadcast (keep sound only)
- Wrap hc-all-submissions heading in guard — only shown when losers exist
- Allow num_judges == num_players (all-judge mode): everyone submits AND judges
- Self-voting blocked in _judge_pick, _is_judge_pick_enabled, _is_judge_pick_hidden
- Update all judge-check guards in submit/toggle/view actions for all-judge bypass
- _start_judging, submit completion check, _start_round use _get_submitters()
- Bot submission and judging logic handle all-judge mode correctly
- Add 8 new tests covering all-judge mode, self-vote prevention, no-losers heading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…turn

- hc-judge-is now uses $names (pre-formatted list) instead of $player/$others
- _format_names helper: proper Oxford comma for 3+ names, "A and B" for 2
- Skip czar broadcast and personal "you are judge" messages in all-judge mode
- _action_whose_turn during judging: list pending judges, not submission state
- Add hc-waiting-for-judges FTL key for judging phase whose_turn response
- 4 new tests: grammar (2-judge, 3-judge), all-judge silence, judging whose_turn

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eriod

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…yeights

- New bot.py: bot_think, _think_submitting, _think_judging
- game.py: bot_think delegates to _bot_think(self, player)
- _process_judging_bots now uses BotHelper.process_bot_action (same as submission)
- Removed inline duplicate of judging selection logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- setup_teams("individual") in on_start — one team per player
- add_to_team_score when awarding points in _announce_and_score_winners
- Remove custom _is_check_scores_enabled, _is_check_scores_detailed_enabled,
  _action_check_scores, _action_check_scores_detailed — base class handles all
- Update score display tests to match base class output format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove: options defaults, duplicate utility tests, sound-only tests,
absence-of-removed-feature tests, redundant score display tests,
duplicate judge announcement checks, internal callback tests,
tests fully covered by bot game. Collapse related single-case tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zarvox32
Copy link
Copy Markdown
Contributor

Review by Claude (Opus 4.7), driven by the PR author's collaborator. Posted on their behalf.

Test results

  • 52/52 new CAH tests pass in test_humanitycards.py.
  • 64/64 regression tests pass across test_actions_menu, test_game_utils_lobby_actions, test_farkle, test_crazyeights (areas touched by the team_manager / bot_helper consolidation).
  • CLI simulations completed cleanly in 6 configurations:
    • 4 bots default (Independent, 1 judge): 15 rounds, 30/30 keybind smoke tests passed.
    • 5 bots, 3 judges, Jury: 24 rounds.
    • 3 bots, 3 judges (all-judge mode): 6 rounds.
    • 4 bots, Random method, --test-serialization: 16 rounds (state survives JSON round-trip).
    • 5 bots, 2 judges, Jury, Random czar, --test-serialization: completed.
  • Multi-judge vote distribution confirmed correct in output: e.g. Charlie gets 2 points + Bob gets 1 point in the same round (Independent), and Jury tie behavior visible in another run.

What's good

  • Clean separation into bot.py (matches Farkle / Crazy Eights structure).
  • Standard team_manager pattern for scoring (S / Shift+S now work via base class).
  • Multi-judge state model is sound: judge_picks: dict[judge_id → picked_player_id] is the right shape.
  • Random method correctly resolves once at judging start (active_judging_method) so it stays consistent through the round.
  • Self-vote prevention is layered in three places (hidden, enabled, handler) — defensive.
  • Oxford-comma _format_names uses recipient locale for list joining.
  • Single-judge games force "Independent" regardless of setting — matches the spec.
  • Serialization survives round-trips for judge_picks, submission_order, active_judging_method.

Issues

1. Hardcoded English in user-visible label

game.py:779-780, in _get_judge_prompt_header_label:

return f"Choose the best card that matches: {prompt_text}"
return "Choose the best card"

This is the prompt header rendered at the top of the judging menu — confirmed in CLI output:

Charlie [turn_menu]: Choose the best card that matches: Mr. and Mrs. Diaz, ...

Per CLAUDE.md: "No hardcoded English strings reaching players." Add a Fluent key like hc-judge-prompt-header = Choose the best card that matches: { $prompt } and use it.

2. Orphan FTL keys still in en/humanitycards.ftl

The PR description claims orphaned keys were cleaned up, but 9 still have no Python references:

  • hc-submission-progress
  • hc-judge-voted
  • hc-player-submitted
  • hc-select-winner-prompt
  • hc-submission-option
  • hc-round-scores
  • hc-score-line
  • hc-view-hand
  • hc-you-are-not-judge

(The first three are explicitly named as removed in the PR description — they're still here.)

3. Hardcoded keybind labels (pre-existing pattern)

setup_keybinds adds "Submit cards", "View prompt", "View submission", "Who is judging", "Toggle card N" as English string args to define_keybind. Same pattern PRs #240 and #243 had — define_keybind doesn't currently take Fluent keys. Project-wide cleanup, but worth noting that the new keybinds keep the smell.

4. Duplicate transcript recording

game.py:1331-1339:

if hasattr(self, "record_transcript_event"):
    self.record_transcript_event(player, localized, "table")
if user:
    user.speak_l("hc-judge-is", "table", ...)

Most other games use broadcast_l(...) which records the transcript automatically. The manual record_transcript_event followed by speak_l(..., "table", ...) looks like it could double-record for active players. Confirm by checking whether speak_l already writes to transcript when buffer="table" — if so, drop the explicit record_transcript_event.

5. num_judges UI cap is 3 (note for later)

The IntOption is min_val=1, max_val=3. With 3 active players you can reach all-judge mode; with 4+ you cannot select it from the lobby. The internal cap (min(num_judges, len(active))) handles arbitrary values, but users can't set them. All-judge mode may need to be revised later once the rest of the options spec is being implemented — this is fine to ship as-is for now.

6. Minor

  • _get_judge_pick_label and _get_toggle_card_label have English fallbacks (f"Submission {i+1}", f"Card {i+1}") for out-of-range indices. Defensive code; should be unreachable, but worth localizing for completeness.
  • round_end_ticks = 100 is a magic number; consider naming it.
  • 35 lines of mechanical _action_toggle_card_X / _action_judge_pick_X boilerplate. Functional, but a setattr-loop pattern (like _make_score_* in Yahtzee) would shrink it 10×.

Recommendation

Three things gate merge:

  1. Fix the hardcoded English in _get_judge_prompt_header_label — that's the only real user-facing localization regression.
  2. Actually delete the 9 orphan FTL keys (the PR description says this was done).
  3. Decide on the duplicate-transcript question — either confirm it's intended or remove the redundant call.

Everything else is polish that doesn't have to block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants