This repository contains the source code for the paper "Genetic Algorithms for Tractable Bayesian Network Consensus: A Unified Treewidth-Constrained Framework".
Bayesian network structural fusion combines locally learned networks into a consensus model, which is central to federated and distributed learning. However, the unrestricted fusion creates dense graphs whose high treewidth makes exact inference intractable.
This repository implements a unified framework that tackles two complementary problems under a treewidth constraint:
- Fusion approximation (P1): Approximating the unrestricted fusion under a treewidth bound.
- Structural consensus (P2): Finding the consensus network most representative of the inputs under the same constraint.
Four genetic algorithm variants parameterize the search space differently:
- GA-$\mathcal{E}_a$: Pruning arcs from the post-fusion graph.
- GA-$\mathcal{E}_c$: Pruning unique arcs from the input networks before fusion.
- GA-$\mathcal{E}_b$: Pruning arcs with an independent decision per network.
- GA-$\mathcal{E}_b^{MC}$: A hybrid variant that uses a min-cut-based greedy method as a warm start to seed the GA-$\mathcal{E}_b$ population.
- Java 17
- Maven
To compile the project and build the executable JAR with dependencies, run:
mvn clean packageThis will generate a JAR file in the target/ directory (e.g., bnfusion-1.0-jar-with-dependencies.jar).
The main entry point for running the experiments is the consensusBN.Experiments.ExperimentsJournal class. This class runs all the algorithms (greedy and genetic variants) for a given network and parameters at different treewidth values.
Run the compiled JAR:
java -cp target/bnfusion-1.0-jar-with-dependencies.jar consensusBN.Experiments.ExperimentsJournal [index] [params_file]Or just run without arguments to execute a quick local test (by default it will run alarm with 10 clients and output parameters):
java -cp target/bnfusion-1.0-jar-with-dependencies.jar consensusBN.Experiments.ExperimentsJournalThe program reads a text file line by line using the index (0-indexed) to select the specific run's configuration. Each line should be space-separated with the following parameters:
net: Name of the network (e.g.,alarm.0)nClients: Number of clients/input DAGs (e.g.,10)popSize: Population size for the GA (e.g.,100)nIterations: Number of generations for the GA (e.g.,100)twLimit: Minimum Treewidth limit setting (e.g.,2)seed: Random seed for reproducibility (e.g.,0)useGES(Optional):trueto learn input DAGs using GES from data,falseto generate perturbations (default:false).optimizeSMHD(Optional):trueto use SMHD fitness,falseto use FSim fitness (default:true).optimizeAgainstOriginals(Optional):trueto measure against input DAGs (P2),falseto measure against G+ (P1) (default:true).useGreedyWarmstart(Optional):trueto inject greedy result into population,falsefor random init (default:true).
The results are saved as .csv files inside the results/Server/ folder, and the convergence curves for the genetic algorithms are stored in the results/Server/convergence/ subfolder.