From 6fa4c6e233475d1b05c14175428faa12ab814165 Mon Sep 17 00:00:00 2001 From: soft-ke Date: Tue, 18 Nov 2025 15:49:48 -0800 Subject: [PATCH 1/7] Update README.md part 1 - more incoming --- README.md | 47 ++++++++++++++++++++++++----------------------- 1 file changed, 24 insertions(+), 23 deletions(-) diff --git a/README.md b/README.md index 7fbd2e9d6..d09b009a5 100644 --- a/README.md +++ b/README.md @@ -1,34 +1,34 @@ -# CoGames: Cogs vs Clips Multi-Agent RL Environment +# CoGames: A Game Environment for the Alignment League Benchmark -CoGames is a collection of multi-agent cooperative and competitive environments designed for reinforcement learning -research. +CoGames is the game environment for Softmax’s [Alignment League Benchmark (ALB)](https://www.softmax.com/alignmentleague) — a suite of multi-agent games designed to measure how well AI agents align, coordinate, and collaborate with others (both AIs and humans). -## The game: Cogs vs Clips +The first ALB game, Cogs vs Clips, is implemented entirely within the CoGames environment. -Multiple "Cog" agents, controlled by user-provided policies, must cooperate to extract Hearts from the environment. -Doing so requires gathering resources, operating machinery, and assembling components. Many steps will require -interacting with a "station". Many such interactions will require multiple cogs working in tandem. +## The game: Cogs vs Clips -Your Cogs' efforts may be thwarted by Clips: NPC agents that disable stations or otherwise impede progress. +Cogs vs Clips is a cooperative production-and-survival game where teams of AI agents (“Cogs”) work together on the asteroid Machina VII. Their mission: Produce and protect **HEARTs** (Holon Enabled Agent Replication Templates) by gathering resources, operating machinery, and assembling components. Success is impossible alone! Completing these missions requires multiple cogs working in tandem.

Example Cogs vs Clips video
-There are many mission configurations available, with different map sizes, resource and station layouts, and game rules. -Overall, Cogs vs Clips aims to present rich environments with: +There are many mission configurations available, with different map sizes, resource and station layouts, and game rules. Cogs should refer to their [MISSION.md](MISSION.md) for a thorough description of the game mechanics. Overall, Cogs vs Clips aims to present rich environments with: - **Resource management**: Energy, materials (carbon, oxygen, germanium, silicon), and crafted components - **Station-based interactions**: Different stations provide unique capabilities (extractors, assemblers, chargers, chests) - **Sparse rewards**: Agents receive rewards only upon successfully crafting target items (hearts) - **Partial observability**: Agents have limited visibility of the environment -- **Required multi-agent cooperation**: Agents must coordinate to efficiently use shared resources and stations +- **Required multi-agent cooperation**: Agents must coordinate to efficiently use shared resources and stations, while communicating only through movement and emotes (❤️, 🔄, 💯, etc.) + +Once your policy is successfully assembling hearts, submit it to our Alignment League Benchmark. ALB evaluates how your policy plays with other policies in the pool by running multi-policy, multi-agent games. Our focal metric is VORP (Value Over Replacement Policy), an estimate of how much your agent improves team performance in scoring hearts. -Cogs should refer to their [MISSION.md](MISSION.md) for a thorough description of the game mechanics. +You will need to link a GitHub account. After submission, you will be able to view results on how your policy performed in various evals with other players by logging in on the [ALB page](https://www.softmax.com/alignmentleague). ## Quick Start +Upon installation, try playing cogames with our default starter policies as Cogs. 
Use `cogames policies` to see a full list of default policies. + ```bash # Install uv pip install cogames @@ -42,16 +42,19 @@ cogames play -m training_facility_1 -p random # Train a policy in that environment using an out-of-the-box, stateless network architecture cogames train -m training_facility_1 -p stateless -# Watch or play along side your trained policy +# Watch or play alongside your trained policy cogames play -m training_facility_1 -p stateless:train_dir/policy.pt # Evaluate how your policy performs on a different mission cogames eval -m machina_1 -p stateless:./train_dir/policy.pt + +# Submit your trained policy to see how it plays with other AI agents +cogames submit -p myPolicy ``` ## Commands -Most commands are of the form `cogames -p [MISSION] -p [POLICY] [OPTIONS]` +Most commands are of the form `cogames -m [MISSION] -p [POLICY] [OPTIONS]` To specify a `MISSION`, you can: @@ -69,16 +72,15 @@ To specify a `POLICY`, provide an argument with up to three parts `CLASS[:DATA][ Lists all missions and their high-level specs. -If a mission is provided, it describe a specific mission in detail. +If a mission is provided, describes a specific mission in detail. ### `cogames play -m [MISSION] -p [POLICY]` Play an episode of the specified mission. -**Policy** Cogs' actions are determined by the provided policy, except if you take over their actions manually. +Cogs' actions are determined by the provided policy, except if you take over their actions manually. -If not specified, this command will use the `noop`-policy agent -- do not be surprised if when you play you don't see -other agents moving around! Just provide a different policy, like `random`. +If not specified, this command will use the `noop`-policy agent -- do not be surprised if when you play you don't see other agents moving around! Just provide a different policy, like `random`. **Options:** @@ -93,7 +95,7 @@ and manually play alongside them. Train a policy on a mission. 
-**Policy** By default, our `stateless` policy architecture will be used. But as is explained above, you can select a +By default, our `stateless` policy architecture will be used. But as is explained above, you can select a different policy architecture we support out of the box (like `lstm`), or can define your own and supply a path to it. Any policy provided must implement the `TrainablePolicy` interface, which you can find in @@ -105,7 +107,7 @@ You can continue training an already-initialized policy by also supplying a path cogames train -m [MISSION] -p path/to/policy.py:train_dir/my_checkpoint.pt ``` -**Mission** Note that you can supply repeated `-m` missions. This yields a training curriculum that rotates through +Note that you can supply repeated `-m` missions. This yields a training curriculum that rotates through those environments: ``` @@ -189,10 +191,9 @@ for step in range(1000): ### `cogames eval -m [MISSION] [-m MISSION...] -p POLICY [-p POLICY...]` -Evaluate one or more policies on one more more missions +Evaluate one or more policies on one or more missions. -**Policy** Note that here, you can provide multiple `-p POLICY` arguments if you want to run evaluations on mixed-policy -populations. +You can provide multiple `-p POLICY` arguments if you want to run evaluations on mixed-policy populations. **Examples:** From f632618368c3b5b8672d00d865480cd1ce370dbf Mon Sep 17 00:00:00 2001 From: soft-ke Date: Wed, 19 Nov 2025 11:28:44 -0800 Subject: [PATCH 2/7] Update README.md --- README.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index d09b009a5..8368cc5ad 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ CoGames is the game environment for Softmax’s [Alignment League Benchmark (ALB)](https://www.softmax.com/alignmentleague) — a suite of multi-agent games designed to measure how well AI agents align, coordinate, and collaborate with others (both AIs and humans). 
-The first ALB game, Cogs vs Clips, is implemented entirely within the CoGames environment. +The first ALB game, Cogs vs Clips, is implemented entirely within the CoGames environment. You can create your own policy and submit it to our benchmark/pool. ## The game: Cogs vs Clips @@ -30,7 +30,12 @@ You will need to link a Github account. After submission, you will be able to vi Upon installation, try playing cogames with our default starter policies as Cogs. Use `cogames policies` to see a full list of default policies. ```bash -# Install +# We recommend using a virtual env +brew install uv +uv venv .venv +source .venv/bin/activate + +# Install cogames uv pip install cogames # List available missions @@ -49,7 +54,7 @@ cogames play -m training_facility_1 -p stateless:train_dir/policy.pt cogames eval -m machina_1 -p stateless:./train_dir/policy.pt # Submit your trained policy to see how it plays with other AI agents -cogames submit -p myPolicy +cogames submit -p cogames.MyPolicy --name my-first-policy ``` ## Commands From 39d05f3581891d34b32ae948525c8e60eb9407a7 Mon Sep 17 00:00:00 2001 From: soft-ke Date: Wed, 19 Nov 2025 11:30:51 -0800 Subject: [PATCH 3/7] Update README.md --- README.md | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 8368cc5ad..e2c88b458 100644 --- a/README.md +++ b/README.md @@ -79,6 +79,22 @@ Lists all missions and their high-level specs. If a mission is provided, describes a specific mission in detail. +### `cogames variants` + +Lists available variants for modifying missions + +### `cogames evals` + +Lists all missions used as evals for analyzing the behaviour of agents + +### `cogames policies` + +Shows a list of default policies available to you, and the shorthands with which you can use them. + +### `cogames version` + +Show version info for mettagrid, pufferlib-core, and cogames. + ### `cogames play -m [MISSION] -p [POLICY]` Play an episode of the specified mission. 
@@ -234,13 +250,6 @@ modifications. You will be able to provide your specified `--output` path as the `MISSION` argument to other `cogames` commmands. -### `cogames version` - -Show version info for mettagrid, pufferlib-core, and cogames. - -### `cogames policies` - -Shows a list of default policies available to you, and the shorthands with which you can use them. ## Citation From 35aee583c77140b6e4dbd29f125c63a8febf4111 Mon Sep 17 00:00:00 2001 From: soft-ke Date: Wed, 19 Nov 2025 12:02:53 -0800 Subject: [PATCH 4/7] Update README.md --- README.md | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index e2c88b458..ca1c96978 100644 --- a/README.md +++ b/README.md @@ -222,6 +222,9 @@ You can provide multiple `-p POLICY` arguments if you want to run evaluations on # Evaluate a single trained policy checkpoint cogames eval -m machina_1 -p stateless:train_dir/model.pt +# Evaluate a single trained policy across a mission set with multiple agents +cogames eval -set integrated_evals -p stateless:train_dir/model.pt + # Mix two policies: 3 parts your policy, 5 parts random policy cogames eval -m machina_1 -p stateless:train_dir/model.pt:3 -p random::5 ``` @@ -250,16 +253,30 @@ modifications. You will be able to provide your specified `--output` path as the `MISSION` argument to other `cogames` commmands. +## Policy Submission +### `cogames login` + +Make sure you have authenticated before submitting a policy. + +### `cogames submit -p [POLICY] -n [NAME]` + +**Options:** +- `--include-files`: Can be specified multiple times, such as `--include-files file1.py --include-files dir1/` +- `--dry-run`: Validates that the policy works for submission without uploading it + +When a new policy is submitted, it is queued up for evals with other policies, some randomly selected and some designated for the Alignment League Benchmark. 
+ +Visit the [ALB](https://www.softmax.com/alignmentleague) page and log in to see how your policies perform! ## Citation If you use CoGames in your research, please cite: ```bibtex -@software{cogames2024, +@software{cogames2025, title={CoGames: Multi-Agent Cooperative Game Environments}, author={Metta AI}, - year={2024}, + year={2025}, url={https://github.com/metta-ai/metta} } ``` From 4944b5d4be043917f02ec9c3ef186a1ed33dcc80 Mon Sep 17 00:00:00 2001 From: soft-ke Date: Wed, 19 Nov 2025 12:39:55 -0800 Subject: [PATCH 5/7] Update README.md --- README.md | 50 ++++++++++++++------------------------------------ 1 file changed, 14 insertions(+), 36 deletions(-) diff --git a/README.md b/README.md index ca1c96978..9d648a01f 100644 --- a/README.md +++ b/README.md @@ -41,29 +41,29 @@ uv pip install cogames # List available missions cogames missions -# Play an episode of the training_facility_1 mission -cogames play -m training_facility_1 -p random +# Describe a specific mission in detail +cogames missions -m [MISSION] -# Train a policy in that environment using an out-of-the-box, stateless network architecture -cogames train -m training_facility_1 -p stateless +# List available variants for modifying missions +cogames variants -# Watch or play alongside your trained policy -cogames play -m training_facility_1 -p stateless:train_dir/policy.pt +# List all missions used as evals for analyzing the behaviour of agents +cogames evals -# Evaluate how your policy performs on a different mission -cogames eval -m machina_1 -p stateless:./train_dir/policy.pt +# Shows all policies available and their shorthands +cogames policies -# Submit your trained policy to see how it plays with other AI agents -cogames submit -p cogames.MyPolicy --name my-first-policy +# Show version info +cogames version ``` -## Commands +## Play, Train, and Eval Most commands are of the form `cogames -m [MISSION] -p [POLICY] [OPTIONS]` To specify a `MISSION`, you can: -- Use a mission name from the default 
registry emitted by `cogames missions`, e.g. `training_facility_1` +- Use a mission name from the registry given by `cogames missions`, e.g. `training_facility_1` - Use a path to a mission configuration file, e.g. path/to/mission.yaml" To specify a `POLICY`, provide an argument with up to three parts `CLASS[:DATA][:PROPORTION]`: @@ -73,28 +73,6 @@ To specify a `POLICY`, provide an argument with up to three parts `CLASS[:DATA][ - `DATA`: Optional path to a weights file or directory. When omitted, defaults to the policy's built-in weights. - `PROPORTION`: Optional positive float specifying the relative share of agents that use this policy (default: 1.0). -### `cogames missions -m [MISSION]` - -Lists all missions and their high-level specs. - -If a mission is provided, describes a specific mission in detail. - -### `cogames variants` - -Lists available variants for modifying missions - -### `cogames evals` - -Lists all missions used as evals for analyzing the behaviour of agents - -### `cogames policies` - -Shows a list of default policies available to you, and the shorthands with which you can use them. - -### `cogames version` - -Show version info for mettagrid, pufferlib-core, and cogames. - ### `cogames play -m [MISSION] -p [POLICY]` Play an episode of the specified mission. @@ -241,7 +219,7 @@ their assignments each episode. ### `cogames make-mission -m [BASE_MISSION]` -Create custom mission configuration. In this case, the mission provided is the template mission to which you'll apply +Create a custom mission configuration. In this case, the mission provided is the template mission to which you'll apply modifications. **Options:** @@ -251,7 +229,7 @@ modifications. - `--height H`: Map height (default: 10) - `--output PATH`: Save to file -You will be able to provide your specified `--output` path as the `MISSION` argument to other `cogames` commmands. 
+You will be able to provide your specified `--output` path as the `MISSION` argument to other `cogames` commands. ## Policy Submission ### `cogames login` From 7e851398d01febdf9441cd3b5c77489bac2cac4e Mon Sep 17 00:00:00 2001 From: soft-ke Date: Wed, 19 Nov 2025 13:07:46 -0800 Subject: [PATCH 6/7] Update README.md --- README.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 9d648a01f..6d5f8ac9b 100644 --- a/README.md +++ b/README.md @@ -63,13 +63,13 @@ Most commands are of the form `cogames -m [MISSION] -p [POLICY] [OPTIO To specify a `MISSION`, you can: -- Use a mission name from the registry given by `cogames missions`, e.g. `training_facility_1` -- Use a path to a mission configuration file, e.g. path/to/mission.yaml" +- Use a mission name from the registry given by `cogames missions`, e.g. `training_facility_1`. +- Use a path to a mission configuration file, e.g. `path/to/mission.yaml`. +- Alternatively, specify a set of missions with `-set` or `-S`. To specify a `POLICY`, provide an argument with up to three parts `CLASS[:DATA][:PROPORTION]`: -- `CLASS`: Policy shorthand (`noop`, `random`, `lstm`, `stateless`) or fully qualified class path like - `cogames.policy.random.RandomPolicy`. Use `cogames policies` to see a full list of default policies. +- `CLASS`: Use a policy shorthand or full path from the registry given by `cogames policies`, e.g. `lstm` or `cogames.policy.random.RandomPolicy`. - `DATA`: Optional path to a weights file or directory. When omitted, defaults to the policy's built-in weights. - `PROPORTION`: Optional positive float specifying the relative share of agents that use this policy (default: 1.0). @@ -190,7 +190,9 @@ for step in range(1000): ### `cogames eval -m [MISSION] [-m MISSION...] -p POLICY [-p POLICY...]` -Evaluate one or more policies on one or more missions. +Evaluate one or more policies on one or more missions. 
+ +We provide sets of eval missions that you can use in place of individual `-m` missions. Pass one of the following to `-set` or `-S`: `eval_missions`, `integrated_evals`, `spanning_evals`, `diagnostic_evals`, `all`. You can provide multiple `-p POLICY` arguments if you want to run evaluations on mixed-policy populations. From 9b1ee3b21675d2978fa7f4e2f8867c5f8999563d Mon Sep 17 00:00:00 2001 From: Rafael Irgolic Date: Mon, 1 Dec 2025 10:24:17 +0900 Subject: [PATCH 7/7] README: fix typo in code example description --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6d5f8ac9b..b4d07c80d 100644 --- a/README.md +++ b/README.md @@ -129,7 +129,7 @@ You can also specify multiple missions with `*` wildcards: ### Custom Policy Architectures To get started, `cogames` supports some torch-nn-based policy architectures out of the box (such as StatelessPolicy). To -supply your own, you will want to extend `cogames.policy.Policy`. +supply your own, you will want to extend `mettagrid.policy.policy.MultiAgentPolicy`. ```python from mettagrid.policy.policy import MultiAgentPolicy as Policy
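
Note on the `CLASS[:DATA][:PROPORTION]` policy argument documented in this series: the sketch below illustrates how such a string could be split into its three parts and how proportions such as `3` and `5` translate into shares of the agent population. It is a minimal illustration only. `PolicySpec`, `parse_policy_spec`, and `agent_shares` are hypothetical names made up for this note, not part of the cogames API, and the real CLI may parse the argument differently (for example, for paths that themselves contain a colon).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PolicySpec:
    """Hypothetical container for one parsed -p argument (not a cogames type)."""

    class_name: str
    data_path: Optional[str] = None
    proportion: float = 1.0


def parse_policy_spec(arg: str) -> PolicySpec:
    """Split a CLASS[:DATA][:PROPORTION] string into its parts.

    Assumption: fields are separated by plain colons, so paths that
    contain a colon are not handled by this sketch.
    """
    parts = arg.split(":")
    if not 1 <= len(parts) <= 3 or not parts[0]:
        raise ValueError(f"expected CLASS[:DATA][:PROPORTION], got {arg!r}")
    class_name = parts[0]
    # An empty DATA field (as in "random::5") means "no weights path given".
    data_path = parts[1] if len(parts) > 1 and parts[1] else None
    # PROPORTION defaults to 1.0 when omitted, as the README describes.
    proportion = float(parts[2]) if len(parts) > 2 and parts[2] else 1.0
    if not proportion > 0:
        raise ValueError("PROPORTION must be a positive number")
    return PolicySpec(class_name, data_path, proportion)


def agent_shares(specs: List[PolicySpec]) -> List[float]:
    """Normalize relative proportions into fractions of the agent population."""
    total = sum(spec.proportion for spec in specs)
    return [spec.proportion / total for spec in specs]


if __name__ == "__main__":
    # Mirrors the README example: 3 parts trained policy, 5 parts random policy.
    specs = [
        parse_policy_spec("stateless:train_dir/model.pt:3"),
        parse_policy_spec("random::5"),
    ]
    for spec, share in zip(specs, agent_shares(specs)):
        print(spec.class_name, spec.data_path, share)
```

Under these assumptions, the two `-p` arguments from the mixed-policy eval example correspond to 3/8 and 5/8 of the agents. How cogames rounds those fractions to whole agents in a given mission is not stated in the README and is left out of the sketch.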