deterministic discards #362

Merged
bcollazo merged 11 commits into bcollazo:main from SY3141:Deterministic_Discards
Mar 31, 2026

Contributor

@SY3141 SY3141 commented Mar 20, 2026

Implemented deterministic discards and updated the web UI so that the resources to discard can be selected one by one.

#361

netlify bot commented Mar 20, 2026

👷 Deploy request for catanatron-staging pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit d0f47b1

@SY3141 SY3141 force-pushed the Deterministic_Discards branch 4 times, most recently from d6c285f to 1284789 Compare March 21, 2026 04:26
@SY3141 SY3141 force-pushed the Deterministic_Discards branch from ba3b429 to 9ee99a7 Compare March 21, 2026 04:40
Owner

@bcollazo bcollazo left a comment

Ok, pretty cool! Thanks for opening this PR!

Please don't be discouraged if we go back and forth a bit on the PR. I want to make sure I understand it fully and it goes in the direction I want for the codebase.

Could you include a video demo or similar of the feature at work? Also, could you explain a bit more about how it works? Maybe add a couple more examples like test_discard_possibilities_are_per_resource. It's a good example, but a simple one. I'd like to see a couple more to fully understand the picture. Thanks!

Comment thread catanatron/catanatron/models/enums.py Outdated

# TODO: None for now to avoid complexity, but should be Resource[].
DISCARD = "DISCARD" # value is None
DISCARD = "DISCARD" # value is Resource
Owner

Can you rename the action altogether to DISCARD_RESOURCE? I think it's going to help distinguish it from a regular DISCARD.

Contributor Author

@SY3141 SY3141 Mar 23, 2026

Did a grep search for every file where this action type is found and made the change.

Comment thread catanatron/catanatron/web/models.py Outdated
abort(404)
db.session.commit()
game = pickle.loads(result.pickle_data) # type: ignore
game.state._state_index = result.state_index
Owner

What is this for?

Comment thread catanatron/catanatron/game.py Outdated
self.state = State(players, catan_map, discard_limit=discard_limit)
self.playable_actions = generate_playable_actions(self.state)

def __setstate__(self, state):
Owner

What is this for?

Comment thread catanatron/catanatron/state.py Outdated

def __setstate__(self, state):
    self.__dict__ = state
    if not hasattr(self, "action_records"):
Owner

I think I see what we are trying to do here. Save old games? I'd rather have the simplicity in the code and have users treat games ephemerally (or tied to a version of the codebase).

Contributor Author

@SY3141 SY3141 Mar 23, 2026

This was a patch for a database issue with the --step-db CLI flag that has since been resolved. I'm going to remove the three legacy functions above.

Comment thread tests/test_json.py
assert action.value == (SHEEP,)


def test_action_from_json_discard():
Owner

Thank you.

@bcollazo
Owner

Also, be sure to rebase and address any and all CI issues. I really want this to get in; I think it's a strict improvement over what we have in place, and it's pretty much needed for the UI to be usable.

@SY3141 SY3141 force-pushed the Deterministic_Discards branch from 69f97b0 to 68722f0 Compare March 23, 2026 03:14
@bcollazo
Owner

bcollazo commented Mar 23, 2026

Also, rebase or repoint the PR against bcollazo:main! master no longer! 👍 Thanks.

@SY3141
Contributor Author

SY3141 commented Mar 23, 2026

Also, rebase or repoint the PR against bcollazo:main! master no longer! 👍 Thanks.

It should be pointing to main with this rebase: 99657b3. Let me know if I'm mistaken.

@bcollazo
Owner

Hey, I think something is still off. I still see it's against "bcollazo:master", and the diff seems to suggest you'll introduce those changes in the snapshot (not related at all to deterministic discards). Feel free to close this one and re-open another if it's easier! 👍

Screenshot 2026-03-24 202642

@SY3141 SY3141 changed the base branch from master to main March 25, 2026 17:56
Owner

@bcollazo bcollazo left a comment

Hey, thanks for the work here. I have some changes I'd like to make before we merge. Please take a look! Thanks!

# Preserve historical DISCARD ordering so the rename does not reshuffle
# integer action ids for gym consumers.
return str(action).replace("DISCARD_RESOURCE", "DISCARD")

Owner

Is this for backwards compatibility as well? I wouldn't invest in it in the repo. I'd rather keep it simple and treat it as a breaking change.

Contributor Author

@SY3141 SY3141 Mar 27, 2026

Yup, this was so that a deep learning bot in another repo, trained on the 290-sized action space, could work with the new 294-sized action space. I can remove this for the main repo, though.
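For context on the 290 and 294 figures: one plausible reading, assuming the old action space held a single DISCARD entry and the new one holds one DISCARD_RESOURCE entry per resource type, is that the space grows by four. The sketch below only checks that arithmetic. The resource names and the helper `new_action_space_size` are illustrative assumptions, not code from the gym environment.

```python
# Hypothetical reconstruction of the action-space size change; this is not
# the repository's actual action array.
RESOURCES = ["WOOD", "BRICK", "SHEEP", "WHEAT", "ORE"]  # assumed names

OLD_DISCARD_ENTRIES = 1  # assumption: one ("DISCARD", None) entry
NEW_DISCARD_ENTRIES = len(RESOURCES)  # assumption: one entry per resource


def new_action_space_size(old_size: int) -> int:
    """Swap the single DISCARD slot for per-resource DISCARD_RESOURCE slots."""
    return old_size - OLD_DISCARD_ENTRIES + NEW_DISCARD_ENTRIES


# 290 is the size quoted in the comment above.
print(new_action_space_size(290))
```

Under these assumptions the result matches the 294 quoted above, which is why a model trained on the old integer ids would need a remapping layer outside the main repo.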

Comment on lines +281 to +289
def discard_possibilities(state: State, color) -> List[Action]:
    if state.discard_counts[color] <= 0:
        return []

    return [
        Action(color, ActionType.DISCARD_RESOURCE, resource)
        for resource in RESOURCES
        if player_num_resource_cards(state, color, resource) > 0
    ]
Owner

Nice. 👍
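To make the behavior of the function above easier to follow outside the codebase, here is a minimal self-contained sketch. `ToyState`, the string-based `Action`, and the resource ordering are stand-ins assumed for illustration; the real `State`, `Action`, and `RESOURCES` live in catanatron and may differ.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional

RESOURCES = ["WOOD", "BRICK", "SHEEP", "WHEAT", "ORE"]  # assumed ordering


class Action(NamedTuple):
    color: str
    action_type: str
    value: Optional[str]


class ToyState:
    """Hypothetical stand-in for State: hands plus pending discard counts."""

    def __init__(self, hands: Dict[str, Counter], discard_counts: Dict[str, int]):
        self.hands = hands
        self.discard_counts = discard_counts


def discard_possibilities(state: ToyState, color: str) -> List[Action]:
    # No pending discards means no discard actions are offered.
    if state.discard_counts[color] <= 0:
        return []
    # One action per resource type held, regardless of how many copies.
    return [
        Action(color, "DISCARD_RESOURCE", resource)
        for resource in RESOURCES
        if state.hands[color][resource] > 0
    ]


state = ToyState({"RED": Counter(WHEAT=2, BRICK=1)}, {"RED": 2})
options = discard_possibilities(state, "RED")
print([action.value for action in options])
```

The number of options is bounded by the number of resource types, so the branching factor stays small even for large hands.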

Comment thread catanatron/catanatron/models/enums.py Outdated
The "result" field is polymorphic depending on the action_type.
- ROLL: result is (int, int) 2 dice rolled
- DISCARD: result is List[Resource] discarded
- DISCARD_RESOURCE: result is List[Resource] discarded in this action
Owner

result is a Resource* correct?

Comment thread catanatron/catanatron/apply_action.py Outdated
Comment on lines +307 to +319
def normalize_discarded_cards(state: State, action: Action, action_record=None):
    if action.value is not None:
        if isinstance(action.value, (list, tuple)):
            return list(action.value)
        return [action.value]

    if action_record is not None and action_record.result is not None:
        if isinstance(action_record.result, (list, tuple)):
            return list(action_record.result)
        return [action_record.result]

    hand = player_deck_to_array(state, action.color)
    num_to_discard = len(hand) // 2
    if action_record is None:
        # TODO: Forcefully discard randomly so that decision tree doesnt explode in possibilities.
        discarded = random.sample(hand, k=num_to_discard)
    else:
        discarded = action_record.result  # for replay functionality
    return [random.choice(hand)]
Owner

What's the purpose of this function? I'm not following. Can we simplify and have the .value of DISCARD_RESOURCE be a resource and that's it? Not a list.

Owner

Also, why would we have to random.choice(hand) here? I think with this solution of discarding one at a time, we wouldn't need to choose randomly here, no?
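A rough sketch of what the simplification suggested in the two comments above could look like, assuming the action value is always a single resource. The function name `apply_discard_resource` and the dictionary-based hands are hypothetical stand-ins, not the code in apply_action.py.

```python
from collections import Counter
from typing import Dict


def apply_discard_resource(
    hands: Dict[str, Counter],
    discard_counts: Dict[str, int],
    color: str,
    resource: str,
) -> str:
    """Discard exactly one card of `resource`: no list handling, no randomness."""
    if discard_counts[color] <= 0:
        raise ValueError(f"{color} has no pending discards")
    if hands[color][resource] <= 0:
        raise ValueError(f"{color} holds no {resource}")
    hands[color][resource] -= 1
    discard_counts[color] -= 1
    return resource  # could be stored as the action record's result


hands = {"RED": Counter(WHEAT=5, BRICK=3)}
# Half the hand, rounded down: 8 cards means 4 discards.
discard_counts = {"RED": sum(hands["RED"].values()) // 2}
for choice in ["WHEAT", "WHEAT", "BRICK", "WHEAT"]:
    apply_discard_resource(hands, discard_counts, "RED", choice)
print(hands["RED"], discard_counts["RED"])
```

Because each action names the card to drop, the step is deterministic and a replay only needs the sequence of actions.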

Comment thread catanatron/catanatron/json.py Outdated
Comment on lines +35 to +40
if isinstance(value, list):
    if len(value) != 1:
        raise ValueError(
            "Discard action must have 1 resource when encoded as a list"
        )
    value = value[0]
Owner

Same here. Sounds like simplifying the .value to always be a resource would simplify this code too!
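As an illustration of that simplification, decoding could reject anything that is not a single resource instead of unwrapping one-element lists. This is only a sketch: the `[color, action_type, value]` payload shape, the resource names, and `discard_action_from_json` are assumptions made for the example, not the actual json.py API.

```python
import json
from typing import List, Tuple

RESOURCES = ("WOOD", "BRICK", "SHEEP", "WHEAT", "ORE")  # assumed names


def discard_action_from_json(data: List) -> Tuple[str, str, str]:
    """Decode a discard action whose value must be exactly one resource."""
    color, action_type, value = data
    if action_type != "DISCARD_RESOURCE":
        raise ValueError(f"Unexpected action type: {action_type}")
    # A list such as ["WHEAT"] is not a member of RESOURCES, so it is rejected.
    if value not in RESOURCES:
        raise ValueError(f"Discard value must be a single resource, got {value!r}")
    return (color, action_type, value)


payload = json.loads('["RED", "DISCARD_RESOURCE", "WHEAT"]')
print(discard_action_from_json(payload))
```

With a single accepted encoding there is no branch for list-wrapped values to keep in sync between the server and the UI.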

Comment thread catanatron/catanatron/state.py Outdated
Comment on lines +136 to +137
self.discard_counts: Dict[Color, int] = {color: 0 for color in self.colors}
self.discard_counts: Dict[Color, int] = {color: 0 for color in self.colors}
Owner

Am I seeing double? hehe

Comment on lines +60 to +71
def test_discard_possibilities_are_per_resource():
    player = SimplePlayer(Color.RED)
    state = State([player])
    state.discard_counts[player.color] = 2

    player_deck_replenish(state, player.color, WHEAT, 2)
    player_deck_replenish(state, player.color, BRICK, 1)

    assert discard_possibilities(state, player.color) == [
        Action(player.color, ActionType.DISCARD_RESOURCE, BRICK),
        Action(player.color, ActionType.DISCARD_RESOURCE, WHEAT),
    ]
Owner

Thank you! Can you add a couple more tests of this nature?

Comment thread ui/src/utils/promptUtils.ts Outdated
} from "./api.types";
import type { GameState } from "./api.types";

export function humanizeAction(gameState: GameState, action: GameAction) {
Owner

Hmm.. we shouldn't need this function. Everything in the log should be ActionRecords.

@bcollazo
Owner

Ahh, finally the checks were able to run. I think it may have been a transient error on GitHub's part? Anyway, let me know if you have any questions about the CI checks and how to make them all green. 👍

@coveralls

coveralls commented Mar 31, 2026

Pull Request Test Coverage Report for Build 23776692959

Details

  • 42 of 42 (100.0%) changed or added relevant lines in 5 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.08%) to 93.98%

Files with Coverage Reduction | New Missed Lines | %
catanatron/catanatron/players/tree_search_utils.py | 1 | 94.59%
Totals Coverage Status
Change from base Build 23773265000: 0.08%
Covered Lines: 3294
Relevant Lines: 3505

💛 - Coveralls

@bcollazo
Owner

Awesome. Thank you so much for taking on this work! Makes it a lot more usable and representative. 👍

@bcollazo bcollazo merged commit 41ba0db into bcollazo:main Mar 31, 2026
11 checks passed