Briscola-py

Reinforcement learning (RL) for briscola, using Proximal Policy Optimization (PPO).

Play against the 4-player model at https://briscola.io!

To export the model for other platforms, see https://stable-baselines3.readthedocs.io/en/master/guide/export.html#export-to-onnx

Achievements

Best model

The best model trained here plays the 1v1 variant. Played 1v1, briscola becomes a zero-sum game, which makes it better suited to PPO. The model achieved a clear advantage over random play.
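To make the zero-sum point concrete: a briscola deck carries 120 card points in total, so in a two-player game every point one player captures is a point the other does not. The sketch below uses the standard card values; the names (`CARD_POINTS`, `zero_sum_payoffs`, the English rank and suit labels) are illustrative assumptions and are not taken from this repository's `lib` package.

```python
# Standard briscola card values by rank; ranks not listed are worth 0 points.
CARD_POINTS = {"ace": 11, "three": 10, "king": 4, "knight": 3, "jack": 2}
RANKS = ["ace", "two", "three", "four", "five", "six", "seven",
         "jack", "knight", "king"]
SUITS = ["coins", "cups", "swords", "batons"]

# Sum the value of all 40 cards in the deck.
TOTAL_POINTS = sum(CARD_POINTS.get(rank, 0) for _ in SUITS for rank in RANKS)


def zero_sum_payoffs(my_points: int) -> tuple[int, int]:
    """Return (my payoff, opponent payoff) as point differentials in a 1v1 game."""
    opponent_points = TOTAL_POINTS - my_points
    return my_points - opponent_points, opponent_points - my_points
```

Whatever the split of points, the two payoffs always sum to zero, which is the property the 1v1 setting relies on.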

The 4-player model slightly outperforms random play from each starting position, but not by a significant margin. I have observed it throwing away points in the early stages of the game and consistently holding batons until the end of the game, regardless of the briscola suit.
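One way to check whether a margin over random play is significant is a two-proportion z-test on win counts from the same starting position. This is only a sketch of that idea, not the evaluation used in this repository; the function name and the counts in the usage note are hypothetical.

```python
import math


def two_proportion_z_test(wins_a: int, games_a: int,
                          wins_b: int, games_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in win rates of A and B.

    Uses the pooled normal approximation, so it assumes many games per agent
    and a pooled win rate strictly between 0 and 1.
    """
    rate_a = wins_a / games_a
    rate_b = wins_b / games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    z = (rate_a - rate_b) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)
```

For example, with made-up counts of 270 wins for the model and 250 for a random agent over 1000 games each, `two_proportion_z_test(270, 1000, 250, 1000)` gives a p-value well above 0.05, which would match the "slightly better, but not significantly" description.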

Takeaways

Basing the reward function on the number of points gained per turn was much more effective than a simple 1st/2nd/3rd place reward. Providing a negative reward when giving up points unnecessarily also made the model much more successful.
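A minimal sketch of what such a per-turn reward could look like is shown below. It is not the reward function from this repository: the `TrickOutcome` fields, the `penalty_scale` parameter, and the definition of "unnecessarily" (playing a card worth more than the cheapest card in hand into a lost trick) are all assumptions made for illustration.

```python
from dataclasses import dataclass

TOTAL_POINTS = 120  # total card points in a briscola deck


@dataclass
class TrickOutcome:
    won: bool              # did the agent take the trick?
    trick_points: int      # total card points in the trick
    played_points: int     # point value of the card the agent played
    cheapest_in_hand: int  # lowest point value the agent could have played


def turn_reward(outcome: TrickOutcome, penalty_scale: float = 1.0) -> float:
    """Reward points captured this turn; penalize points given up needlessly."""
    if outcome.won:
        return outcome.trick_points / TOTAL_POINTS
    # On a lost trick, only the points above the cheapest available card
    # count as wasted (hypothetical definition of "unnecessary").
    wasted = max(0, outcome.played_points - outcome.cheapest_in_hand)
    return -penalty_scale * wasted / TOTAL_POINTS
```

Compared with a placement reward that arrives once at the end of a game, a signal like this is available on every turn, which is one plausible reason it trained better.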

Counterfactual regret minimization (CFR) may perform much better in the 4-player case, given the game's similarity to poker.

