Add MaskablePPOPlayer by zarns · Pull Request #297 · bcollazo/catanatron

zarns · 2024-11-16T20:58:07Z

Supersedes #287

I added SubprocVecEnv so that multiple games run at once, which captures training data about 5x faster. I trained for 4.96 days straight (100,000,000 timesteps) with this configuration, and the resulting model.zip is 1525 MB (too big to upload to git, unfortunately). After those 5 days of training, PPOPlayer has an 8% win rate against the AB-pruning player and an 11% win rate against ValueFunctionPlayer. Attached is the wandb graph output. You can see that episode_reward_mean is not slowing down, but training is simply not fast enough on my RTX 4070 to realistically surpass the AB-pruning player. Perhaps the model has too many layers, which slows down training, but I've played around quite a bit with different hyperparameters and model sizes, and this is the best I've come up with.
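For a rough sense of scale, the figures above can be turned into a throughput estimate. This is only back-of-the-envelope arithmetic on the numbers quoted in this comment; the helper names are made up for illustration, and the single-environment figure assumes the "about 5x" speedup applies uniformly over the whole run.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def steps_per_second(total_timesteps: int, days: float) -> float:
    """Average environment steps per second over a whole training run."""
    return total_timesteps / (days * SECONDS_PER_DAY)


def days_needed(total_timesteps: int, rate: float) -> float:
    """Wall-clock days needed to collect total_timesteps at a given rate."""
    return total_timesteps / rate / SECONDS_PER_DAY


# Numbers quoted in the comment: 100,000,000 timesteps in 4.96 days.
vectorized_rate = steps_per_second(100_000_000, 4.96)

# Assumption: without SubprocVecEnv the rate would be about 5x lower.
single_env_rate = vectorized_rate / 5
single_env_days = days_needed(100_000_000, single_env_rate)

print(f"vectorized: {vectorized_rate:.0f} steps/s")
print(f"single env (assumed 5x slower): {single_env_days:.1f} days")
```

At roughly 233 steps per second, each additional 100M timesteps costs about another five days on this setup, which is the bottleneck described above.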

The features_extractor CNN doesn't seem to help much on shorter training runs, even with much smaller model sizes. I'm starting to think Stable-Baselines3 isn't the best way to go. AlphaZero combines MCTS with this kind of actor/critic neural net, and maybe we need to pursue recreating that for Catan.
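To make the AlphaZero idea a bit more concrete, here is a minimal sketch of a PUCT-style selection rule, which is how the policy prior (actor) and value estimate (critic) typically feed into the tree search. This is not code from this PR or from catanatron; the function names, the dictionary layout, and the `c_puct` value are all assumptions for illustration.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Value estimate plus an exploration bonus weighted by the policy prior."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_action(children, c_puct=1.5):
    """Pick the child action with the highest PUCT score.

    children maps a (hypothetical) action name to a dict holding:
      q      - mean value from the critic / backed-up rollouts
      prior  - probability assigned by the actor
      visits - how many times the search has visited this child
    """
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            children[a]["q"],
            children[a]["prior"],
            parent_visits,
            children[a]["visits"],
            c_puct,
        ),
    )


# Hypothetical node with two legal actions.
children = {
    "build_road": {"q": 0.1, "prior": 0.6, "visits": 10},
    "end_turn": {"q": 0.1, "prior": 0.3, "visits": 0},
}
chosen = select_action(children)
```

With equal value estimates, the unvisited action wins here despite its lower prior, because the exploration bonus shrinks as visits accumulate.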

Note that if you want to pull the branch and play around with it, you'll have to delete the model.zip before each run to reset the architecture.
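If it helps, the reset step can be scripted rather than done by hand. A small sketch, assuming the checkpoint lives at `model.zip` in the working directory (adjust the path to wherever the training script actually writes it); `reset_checkpoint` is a hypothetical helper, not something in this branch.

```python
from pathlib import Path


def reset_checkpoint(path="model.zip"):
    """Delete a stale checkpoint so the next run starts from a fresh architecture.

    Returns True if a file was removed, False if nothing was there.
    """
    checkpoint = Path(path)
    if checkpoint.exists():
        checkpoint.unlink()
        return True
    return False
```

Calling `reset_checkpoint()` at the top of the training script before the model is constructed avoids loading weights saved under a different architecture.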

netlify · 2024-11-16T20:58:11Z

‼️ Deploy request for catanatron-staging rejected.

Name	Link
🔨 Latest commit	`bc2ec10`

zarns · 2024-11-16T21:00:54Z

Looks like the build fails anyway because the sb3_contrib requirements aren't met. We could just leave this as an open pull request too, I guess.

Add MaskablePPOPlayer

bc2ec10

zarns mentioned this pull request Jan 16, 2025

Vectorizable? Spawn Multiple Environments? #299

Closed

bcollazo mentioned this pull request Dec 20, 2025

[WIP] PPO Example and RL Action Space Refactor #341

Closed


zarns wants to merge 1 commit into bcollazo:master from zarns:feature/ppo
