This repository accompanies the paper “Assisted Governance with LLM-based Algorithmic Committees: Reliability, Systematic Divergence, and a Tiered-Trust Workflow.” It contains the full simulation corpus, preprocessing scripts, and analysis code that generate every table, figure, and robustness check reported in the manuscript and supplementary information.
Large language models (LLMs) are instantiated as seven governance personas (P1–P7) and evaluated across six frontier models on 508 Nouns DAO proposals, yielding 21,336 structured judgments (7 personas × 6 models × 508 proposals) plus a 17k-run repeatability audit. The analyses show:
- Instruction-dominant behavior – persona prompts reliably steer model behavior, creating programmable cognitive diversity.
- Systematic divergence from humans – algorithmic committees expose principled counter-perspectives instead of imitating historical DAO outcomes.
- Tiered trustworthiness – textual reasons and multi-criteria scores are stable, while final votes are sensitive to sampling noise, motivating human escalation for low-trust outputs.
The repository lets readers recreate the full pipeline: preprocessing, Stages 1–4 analyses, committee synthesis, divergence diagnostics, and stability checks.
```
.
├── README.md
├── LICENSE                       # MIT
├── paper/
│   └── Governance.pdf
├── data/
│   ├── raw/
│   │   ├── simulation_results_*.jsonl
│   │   └── proposal_final_v1_newcategory.json
│   └── processed/
│       ├── analysis_dataset_full_new.parquet
│       ├── analysis_dataset_processed.parquet
│       └── analysis_dataset_processed.jsonl    # optional textual mirror
├── outputs/
│   └── paper_run/                # single canonical set per stage
├── src/
│   ├── preprocess_main_dataset.py
│   ├── stage1_analyse_stage1.py
│   ├── stage2_analyse_stage2.py
│   ├── stage3_1_comparison_metrics.py
│   ├── stage3_2_case_prep.py
│   ├── stage3_3_feature_impact.py
│   ├── stage3_4_environmental.py
│   ├── stage3_5_persona_p7.py
│   ├── stage4_1_hypothesis_H1_H3.py
│   ├── stage4_1_hypothesis_H4_H6.py
│   ├── stage4_2_4_3_predictive_modeling.py
│   ├── analyse_committee_decision.py
│   ├── analyse_divergence_drivers.py
│   └── analyse_stability_check.py
├── requirements.txt
└── .gitignore                    # ignore outputs/*, dao_analysis_results/*, .nltk_data/
```
| File | Description |
|---|---|
| `data/raw/simulation_results_*.jsonl` | Raw LLM transcripts covering seven personas × six models × 508 proposals (primary corpus + repeatability audit) |
| `data/raw/proposal_final_v1_newcategory.json` | Curated metadata for all proposals, including category assignments |
| `data/processed/analysis_dataset_full_new.parquet` | Consolidated table prior to cleaning |
| `data/processed/analysis_dataset_processed.parquet` | Final dataset used throughout the paper (21,336 rows × 85 columns) |
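As a quick sanity check on a local copy, the minimal sketch below loads the final dataset and verifies the shape reported above (it assumes `pandas` plus a parquet engine such as `pyarrow`, which `requirements.txt` is expected to provide):

```python
import pandas as pd

# Load the final analysis dataset used throughout the paper.
df = pd.read_parquet("data/processed/analysis_dataset_processed.parquet")

# 7 personas x 6 models x 508 proposals = 21,336 judgments, 85 columns.
assert df.shape == (21336, 85), f"unexpected shape: {df.shape}"
print(df.shape)
```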
Privacy note: Ethereum addresses and proposal metadata are public on-chain records; remove any additional private annotations before redistributing the data.
```
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Stage 2 sentiment analysis needs these NLTK resources
python -m nltk.downloader vader_lexicon stopwords punkt
```

Additional dependency: `sentence-transformers` (used in `analyse_stability_check.py`; model weights are cached locally on first run).
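If you prefer to fetch the same resources from inside Python (for example, in a setup script), `nltk.download` provides an equivalent:

```python
import nltk

# Same resources as the CLI invocation above.
for resource in ("vader_lexicon", "stopwords", "punkt"):
    nltk.download(resource)
```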
The scripts are intentionally modular so each stage can be rerun independently. The canonical order mirrors the paper:
- Preprocessing – construct the processed dataset used everywhere.

  ```
  python src/preprocess_main_dataset.py
  ```

- Stage 1 – exploratory data analysis and integrity checks.

  ```
  python src/stage1_analyse_stage1.py
  ```

- Stage 2 – multidimensional agreement analysis, PCA, sentiment diagnostics.

  ```
  python src/stage2_analyse_stage2.py
  ```

- Stage 3 – proposal-level comparisons, case extraction, feature & environmental impact, persona deep dive.

  ```
  python src/stage3_1_comparison_metrics.py
  python src/stage3_2_case_prep.py
  python src/stage3_3_feature_impact.py
  python src/stage3_4_environmental.py
  python src/stage3_5_persona_p7.py
  ```

- Stage 4 – hypothesis testing (H1–H6) and predictive modeling.

  ```
  python src/stage4_1_hypothesis_H1_H3.py
  python src/stage4_1_hypothesis_H4_H6.py
  python src/stage4_2_4_3_predictive_modeling.py
  ```

- Supplementary analyses – algorithmic committee synthesis, divergence drivers, and the stability check (an embedding-similarity sketch follows this list).

  ```
  python src/analyse_committee_decision.py
  python src/analyse_divergence_drivers.py
  python src/analyse_stability_check.py
  ```
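The exact stability procedure lives in `analyse_stability_check.py`; purely as an illustration of the kind of embedding comparison that `sentence-transformers` enables, the sketch below scores two paraphrased reasons for similarity (the model name and sample texts here are assumptions for illustration, not taken from the paper):

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical example: two resampled textual reasons for the same proposal.
reasons = [
    "The requested budget is disproportionate to the proposal's expected impact.",
    "The expected impact does not justify the requested budget.",
]

# Model choice is an assumption; any sentence-embedding model would do.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(reasons, convert_to_tensor=True)

# High cosine similarity across resamples indicates stable textual reasoning.
print(float(util.cos_sim(embeddings[0], embeddings[1])))
```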
Optionally, build an orchestration script (e.g., scripts/run_all.sh) chaining the commands above for one-click reproduction.
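If you would rather stay in Python than shell, a minimal sketch of such an orchestrator (a hypothetical `scripts/run_all.py`; the repository does not ship one) could chain the stages in the canonical order and stop at the first failure:

```python
import subprocess
import sys

# Canonical stage order mirroring the paper.
STAGES = [
    "src/preprocess_main_dataset.py",
    "src/stage1_analyse_stage1.py",
    "src/stage2_analyse_stage2.py",
    "src/stage3_1_comparison_metrics.py",
    "src/stage3_2_case_prep.py",
    "src/stage3_3_feature_impact.py",
    "src/stage3_4_environmental.py",
    "src/stage3_5_persona_p7.py",
    "src/stage4_1_hypothesis_H1_H3.py",
    "src/stage4_1_hypothesis_H4_H6.py",
    "src/stage4_2_4_3_predictive_modeling.py",
    "src/analyse_committee_decision.py",
    "src/analyse_divergence_drivers.py",
    "src/analyse_stability_check.py",
]

for script in STAGES:
    print(f"=== {script} ===")
    # check=True aborts the pipeline if any stage exits non-zero.
    subprocess.run([sys.executable, script], check=True)
```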
- Please open GitHub issues for questions about the code or data.
- Pull requests are welcome; include reproduction notes or tests when touching analysis scripts.
- Respect the privacy note above when distributing raw simulation transcripts.
For inquiries regarding the research study, refer to the corresponding author listed in paper/Governance.pdf. For technical questions about the code release, open an issue once the repository is public.