A research-grade sandbox for studying the two-machine load-balancing problem with both quantum approximate optimization (QAOA) and classical baselines. The codebase now follows a reproducible workflow: deterministic dataset generation, configurable QUBO construction, experiment orchestration via YAML, and automated table/figure synthesis for conference submissions.
```
├── configs/                      # YAML experiment sweeps (e.g., baseline.yaml)
├── data/datasets/                # CSV inputs regenerated via scripts
├── docs/report.md                # Manuscript-style project report
├── results/                      # JSON artifacts + CSV summaries (gitignored)
├── scripts/
│   ├── generate_datasets.py      # Deterministic synthetic workloads
│   ├── run_experiments.py        # Single runs or config-driven sweeps
│   ├── analyze_results.py        # Markdown / LaTeX tables for publications
│   └── prove_advantage.py        # Turnkey script to showcase QAOA benefits
└── src/quantum_scheduler/        # Package with QUBO builder, solvers, utilities
```
- **Environment**

```bash
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .
```
- **Datasets (deterministic seeds)**

```bash
python scripts/generate_datasets.py --sizes 10 20 25 --seed 123
```

This repopulates `data/datasets/` with CSVs keyed by job count. Larger datasets (≥20 tasks) use the ILP classical solver, since brute force becomes too slow.
- **Single experiment**

```bash
python scripts/run_experiments.py \
    --dataset dataset_10.csv --reps 2 --maxiter 80 \
    --balance-penalty 10.0 --makespan-penalty 30.0 --priority-bias 0.1 --tag demo
```

Produces `results/dataset_10_r2_shots4096_demo.json` containing QAOA energies, decoded schedules, and classical baselines.

Important QUBO parameters:

- `--balance-penalty`: weight for the load-imbalance penalty (default: 10.0)
- `--makespan-penalty`: weight for the direct makespan penalty (default: 2× balance_penalty); higher values (30–50) better align the QUBO with makespan minimization
- `--priority-bias`: weight for the priority term (default: 0.1); set to 0.0 to focus purely on makespan minimization
- **Batch sweep (recommended for papers)**

```bash
python scripts/run_experiments.py --config configs/baseline.yaml
```

Generates per-run JSON files and an aggregated CSV (`results/baseline_summary.csv`).
- **Analysis & Tables**

```bash
python scripts/analyze_results.py --input-csv results/baseline_summary.csv
```

Emits `analysis/summary.md` + `analysis/summary.tex`, ready for inclusion in the manuscript.
- **Demonstrate QAOA advantages**

```bash
python scripts/prove_advantage.py
```

Runs three preconfigured experiments (quality trap, search efficiency, diversity) that contrast QAOA with greedy and brute-force classical methods.
- **Generate diagnostic plots**

```bash
python scripts/plot_results.py --results-dir results --plots-dir results/plots
```

Produces a full visualization suite per dataset:
- QUBO landscape line plot (energy spread of sampled bitstrings)
- Optimizer convergence curves (energy vs. evaluation)
- Makespan comparison bar chart (QAOA vs. classical baselines)
- Sample distribution histogram (top measured bitstrings)
- Heatmap of QUBO coefficients (structure of the quadratic form)
- **Render the QAOA circuit diagram**

```bash
python scripts/render_circuit_diagram.py --dataset dataset_10.csv --reps 2 --measure
```

Saves a publication-ready circuit figure to `results/circuit_diagrams/dataset_10_p2.png` (use `--drawer text` for ASCII output).
- **Run the tests**

```bash
pytest
```

Exercises the plotting helpers to ensure regressions are caught early.
- `quantum_scheduler.utils.build_qubo`: Constructs a dense QUBO that penalizes load imbalance and rewards high-priority jobs on the designated primary machine.
- `quantum_scheduler.qaoa_solver.solve_qaoa_local`: Runs AerSimulator-based QAOA, optimizes parameters via COBYLA, samples the final circuit, and decodes the most promising bitstring into a two-machine schedule with makespan metrics.
- `quantum_scheduler.classical_solver.solve_classical`: Exact brute-force baseline (sufficient for ≤12 jobs) returning assignments, machine loads, and optimal makespans.
- `quantum_scheduler.classical_solver.solve_greedy`: Lightweight heuristic assigning each task to the currently lightest machine (used to showcase QAOA's quality advantage).
- `scripts/run_experiments.py`: CLI with support for config sweeps, per-run metadata, and relative performance-gap computation.
- `scripts/analyze_results.py`: Converts JSON/CSV artifacts into publication-ready Markdown/LaTeX tables and prints summary statistics.
- `scripts/prove_advantage.py`: Reproducible storytelling harness that runs the quality/efficiency/diversity tests contrasting QAOA with greedy/brute-force methods.
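The greedy baseline is simple enough to sketch in full. The following is an illustrative stand-in for the list-scheduling idea described above, not the package's actual `solve_greedy` (whose signature is not shown here):

```python
def greedy_schedule(durations):
    """Assign each task, in input order, to the currently lightest of two machines.

    Returns (assignment, loads, makespan); assignment[i] is 0 or 1.
    """
    loads = [0, 0]
    assignment = []
    for d in durations:
        m = 0 if loads[0] <= loads[1] else 1   # pick the lighter machine
        assignment.append(m)
        loads[m] += d
    return assignment, loads, max(loads)
```

On `[3, 1, 4, 1, 5]` this greedy rule yields loads (9, 5), i.e. makespan 9, while the optimal split ({3, 4} vs. {1, 1, 5}) achieves 7 — exactly the kind of quality gap the advantage experiments exploit.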
- `docs/report.md` – Abstract, problem statement, methodology, and reproducibility checklist.
- `configs/baseline.yaml` – Default sweep referenced in the report.
- `analysis/summary.md|tex` – Auto-generated once the analysis script is executed.
For confidence intervals and statistical tests:
```bash
# Run multiple independent experiments
python scripts/run_experiments.py --config configs/statistical_baseline.yaml

# Generate statistical analysis report
python scripts/statistical_analysis.py --results-dir results
```

This will produce:
- Multiple independent runs per configuration (default: 10 runs)
- Statistical summaries with mean, std, confidence intervals
- Approximation ratio analysis
- Normality tests and descriptive statistics
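The per-configuration summaries can be sketched with NumPy alone. This is a minimal normal-approximation 95% interval, not necessarily what `statistical_analysis.py` computes (it may use exact t-quantiles):

```python
import numpy as np

def summarize_runs(values, z=1.96):
    """Mean, sample std, and normal-approximation 95% CI for repeated runs."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    std = v.std(ddof=1)                  # sample standard deviation
    half = z * std / np.sqrt(len(v))     # half-width of the confidence interval
    return {"mean": mean, "std": std, "ci": (mean - half, mean + half)}
```

For example, five hypothetical makespans `[10, 12, 11, 13, 9]` give a mean of 11.0 with a CI half-width of about 1.39.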
```bash
# Validate QUBO formulation (check energy-makespan correlation)
python scripts/validate_qubo.py --dataset data/datasets/dataset_10.csv

# Test with different penalty weights
python scripts/validate_qubo.py --dataset data/datasets/dataset_10.csv \
    --balance-penalty 10.0 --makespan-penalty 30.0
```

Target: correlation > 0.7 (strong positive correlation). If the correlation is low (< 0.5), the QUBO is not properly encoding makespan minimization. See `docs/CRITICAL_ISSUE_QUBO.md` for details.
Show QAOA behavior across different instance types (uniform, heavy-tailed, clustered):
```bash
# Generate instance families (n=20)
python scripts/generate_instance_families.py --size 20 --num-instances 3

# Run experiments
python scripts/run_experiments.py --config configs/instance_families.yaml

# Analyze results
python scripts/analyze_instance_families.py --input-csv results/instance_families_summary.csv
```

Quantify QUBO energy-makespan correlation across problem sizes:
```bash
python scripts/analyze_correlation_scaling.py --sizes 10 20 25 --num-samples 2000
```

Outputs: Pearson/Spearman correlations, top-k overlap metrics.
Test impact of QUBO components (penalty weights, balance terms, circuit depth):
```bash
# Run ablation experiments
python scripts/run_experiments.py --config configs/ablation_study.yaml

# Analyze results
python scripts/analyze_ablation.py --input-csv results/ablation_study_summary.csv
```

The framework now includes three classical baseline methods:
- **Brute Force** (optimal for $n \leq 12$): exhaustive enumeration
- **ILP Solver** (PuLP-based): integer linear programming formulation
- **Greedy Heuristic**: fast greedy assignment

Choose the solver via the `--classical-method` flag:

```bash
python scripts/run_experiments.py --dataset dataset_10.csv --classical-method ilp
```

- The current QUBO/decoder targets the two-machine partitioning variant; extending to `M > 2` is a planned upgrade.
- ILP solver available via PuLP for larger instances
- Statistical analysis tools for publication-quality results
- For hardware validation, swap `AerSimulator` with provider-specific backends or noise models inside `solve_qaoa_local`.
With these additions, the repository satisfies the usual reproducibility requirements for quantum optimization workshops: sealed dependencies, deterministic data, declarative experiment configs, and scripted result synthesis.
A QUBO minimizes an energy of the general form

$$ E(x) = x^\top Q x + c^\top x $$

where:

$x$: vector of binary variables $(x_1, x_2, \ldots, x_n)$

- in our case this is the flattened vector of $x_{i,m}$ job–machine assignments

$Q$: symmetric matrix of quadratic coefficients

- captures pairwise interactions between binary variables
- e.g., $Q_{i,j} x_i x_j$ could mean "if two jobs are on the same machine, add a penalty"

$c$: vector of linear coefficients

- captures individual variable contributions (cost or bias for each assignment)

$E(x)$: total energy / cost to minimize
We want to assign a set of jobs

$$ J = \{1, 2, \ldots, N\} $$

to a set of identical machines

$$ \mathcal{M} = \{1, 2, \ldots, M\} $$

so that:

- every job goes to exactly one machine
- all machines have similar total workloads
- optionally, the total finish time (makespan) is small
We create binary decision variables

$$ x_{i,m} = \begin{cases} 1 & \text{if job } i \text{ runs on machine } m \\ 0 & \text{otherwise} \end{cases} $$

These are the only unknowns the solver will decide.
If each job $i$ has processing time $p_i$, the load on machine $m$ is

$$ L_m = \sum_i p_i x_{i,m} $$

The average load is

$$ \bar{L} = \frac{1}{M} \sum_i p_i $$

Minimize how far each load is from the average:

$$ \sum_m \left( L_m - \bar{L} \right)^2 $$

- since $\bar{L}$ is constant, subtracting it only adds a constant shift, so it suffices to minimize $\sum_m L_m^2$

Substituting $L_m = \sum_i p_i x_{i,m}$:

$$ \sum_m L_m^2 = \sum_m \sum_{i,j} p_i p_j \, x_{i,m} x_{j,m} $$

- this is our first quadratic term
- we scale it with weight $\lambda_1$
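The constant-shift claim can be checked numerically. A minimal sketch with made-up processing times (5 jobs, 2 machines):

```python
import random

# Hypothetical instance: 5 jobs, 2 machines
p = [3, 1, 4, 1, 5]
M = 2
L_bar = sum(p) / M                     # average load (constant for the instance)

random.seed(0)
shifts = []
for _ in range(10):
    # Random one-hot assignment: job i goes to machine a[i]
    a = [random.randrange(M) for _ in p]
    loads = [sum(p[i] for i in range(len(p)) if a[i] == m) for m in range(M)]
    imbalance = sum((L - L_bar) ** 2 for L in loads)   # sum_m (L_m - L_bar)^2
    sum_sq = sum(L ** 2 for L in loads)                # sum_m L_m^2
    shifts.append(sum_sq - imbalance)

# The two objectives differ by the same constant (M * L_bar^2) for every
# feasible assignment, so minimizing one minimizes the other.
print(shifts)  # every entry is 98.0
```

Here $M \bar{L}^2 = 2 \cdot 7^2 = 98$, which is exactly the constant gap observed.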
Each job must go to exactly one machine:

$$ \sum_m x_{i,m} = 1 \quad \text{for every job } i $$

Convert to a quadratic penalty:

$$ \sum_i \left( \sum_m x_{i,m} - 1 \right)^2 $$

- this enforces one-hot assignment
- scaled by $\lambda_2$
Total QUBO objective:

$$ E(x) = \lambda_1 \sum_m \sum_{i,j} p_i p_j \, x_{i,m} x_{j,m} + \lambda_2 \sum_i \left( \sum_m x_{i,m} - 1 \right)^2 $$

Quadratic form:

$$ E(x) = x^\top Q x + c^\top x + \text{const} $$

General expanded Q form:

$$ E(x) = \sum_{i,j,m,m'} Q_{(i,m),(j,m')} \, x_{i,m} x_{j,m'} + \sum_{i,m} c_{i,m} x_{i,m} + \text{const} $$
Interpretation:

- each pair $(i,m)$ is one binary variable
- $Q$ encodes interactions
- $c$ encodes linear preferences
Diagonal:

$$ Q_{(i,m),(i,m)} = \lambda_1 p_i^2 $$

Because the variables are binary ($x^2 = x$), these diagonal entries ultimately act as the linear coefficients $c_{i,m} = \lambda_1 p_i^2$ in $c^\top x$, even though we place them on the diagonal of $Q$ for compactness.
Off-diagonal (within same machine):

$$ Q_{(i,m),(j,m)} = \lambda_1 p_i p_j \quad (i \neq j) $$

- this penalizes putting too many long jobs on the same machine
Diagonal: expanding $\left( \sum_m x_{i,m} - 1 \right)^2$ contributes

$$ -\lambda_2 \, x_{i,m} $$

for each variable. Again, due to $x_{i,m}^2 = x_{i,m}$, this diagonal term corresponds to the linear bias $c_{i,m} = -\lambda_2$.
Off-diagonal (within same job):

$$ Q_{(i,m),(i,m')} = \lambda_2 \quad (m \neq m') $$

- this penalizes assigning one job to two machines at once

The full Q matrix is the sum of these effects (with the understanding that diagonal entries double as the linear coefficient vector).
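The assembly rules above can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not the repository's `build_qubo` (variable order and default weights are assumptions):

```python
import numpy as np

def build_qubo_sketch(p, M, lam1=1.0, lam2=100.0):
    """Assemble the symmetric Q matrix for the balance + one-hot QUBO.

    Variable (i, m) is flattened to index i * M + m; diagonal entries
    double as the linear coefficients (since x^2 = x for binaries).
    The constant offset (+lam2 per job) is dropped.
    """
    n = len(p)
    Q = np.zeros((n * M, n * M))
    idx = lambda i, m: i * M + m
    # Balance term: lam1 * sum_m sum_{i,j} p_i p_j x_{i,m} x_{j,m}
    for m in range(M):
        for i in range(n):
            for j in range(n):
                Q[idx(i, m), idx(j, m)] += lam1 * p[i] * p[j]
    # One-hot penalty: lam2 * sum_i (sum_m x_{i,m} - 1)^2
    for i in range(n):
        for m in range(M):
            Q[idx(i, m), idx(i, m)] -= lam2           # linear bias -lam2
            for mp in range(M):
                if mp != m:
                    Q[idx(i, m), idx(i, mp)] += lam2  # same-job coupling
    return Q

def qubo_energy(Q, x):
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)
```

With `p = [3, 1, 2]` and `M = 2`, a feasible one-hot assignment such as `x = [1,0, 0,1, 0,1]` scores strictly lower than an infeasible one that places job 0 on both machines, which is the behavior the penalty weights are meant to enforce.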
Binary → spin mapping:

$$ x_i = \frac{1 - z_i}{2}, \qquad z_i \in \{+1, -1\} $$

Plugging into the energy yields the Ising Hamiltonian:

$$ E(z) = \text{constant} + \sum_i h_i z_i + \sum_{i<j} J_{i,j} z_i z_j $$

- this is the Ising Hamiltonian form
- linear coefficients $h_i$ come from diagonal parts of $Q$
- coupling terms $J_{i,j}$ come from off-diagonal $Q$ entries
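The substitution can be automated and brute-force checked on small instances. A sketch assuming the dense-Q, diagonal-as-linear convention used above:

```python
import itertools
import numpy as np

def qubo_to_ising(Q):
    """Map E(x) = x^T Q x (diagonal = linear part) to
    E(z) = const + sum_i h_i z_i + sum_{i<j} J_ij z_i z_j  via  x = (1 - z)/2.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    lin = np.diag(Q).copy()          # diagonal acts as the linear coefficients
    quad = Q - np.diag(lin)          # off-diagonal couplings
    const = lin.sum() / 2 + quad.sum() / 4
    h = -lin / 2 - (quad.sum(axis=0) + quad.sum(axis=1)) / 4
    J = np.zeros((n, n))             # strictly upper-triangular couplings
    for i in range(n):
        for j in range(i + 1, n):
            J[i, j] = (quad[i, j] + quad[j, i]) / 4
    return h, J, const

def ising_energy(h, J, const, z):
    z = np.asarray(z, dtype=float)
    return float(const + h @ z + z @ J @ z)
```

Exhaustively comparing all bitstrings of a small random Q against the spin form is a quick sanity test for any hand-derived $h$ and $J$.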
QAOA alternates between Cost and Mixer unitaries.
$$ |\psi(\boldsymbol{\gamma}, \boldsymbol{\beta})\rangle = U_M(\beta_p) U_C(\gamma_p) \cdots U_M(\beta_1) U_C(\gamma_1) \, |+\rangle^{\otimes n} $$
- qubits represent binary decision variables $x_{i,m}$
- initial state: $|+\rangle^{\otimes n}$
- cost unitaries implement $e^{-i\gamma J_{i,j} Z_i Z_j}$ and $e^{-i\gamma h_i Z_i}$
- mixing unitaries are $R_X(2\beta)$
- measure in the Z basis to obtain assignments
The initial state is the uniform superposition:

$$ |\psi_0\rangle = |+\rangle^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x\rangle $$

One QAOA layer prepares

$$ |\psi(\gamma, \beta)\rangle = U_M(\beta) U_C(\gamma) \, |\psi_0\rangle $$

and measuring in the computational basis returns bitstring $x$ with probability

$$ P(x) = \left| \langle x \mid \psi(\gamma, \beta) \rangle \right|^2 $$
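For small instances, one such layer can be simulated directly with dense linear algebra, which is a useful cross-check against the Qiskit pipeline. A sketch with made-up couplings (not the repository's solver):

```python
import numpy as np

def qaoa_layer_probs(h, J, gamma, beta):
    """Depth-1 QAOA on n qubits, simulated as a dense statevector:
    |+>^n -> U_C(gamma) -> U_M(beta), then return measurement probabilities."""
    n = len(h)
    dim = 2 ** n
    # Ising energy of every computational basis state (bit b -> spin z = 1 - 2b)
    energies = np.zeros(dim)
    for s in range(dim):
        z = np.array([1 - 2 * ((s >> (n - 1 - q)) & 1) for q in range(n)])
        energies[s] = h @ z + z @ J @ z
    psi = np.full(dim, 1 / np.sqrt(dim), dtype=complex)   # uniform |+>^n
    psi = np.exp(-1j * gamma * energies) * psi            # diagonal cost unitary
    # Mixer: RX(2*beta) applied to every qubit
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    U = rx
    for _ in range(n - 1):
        U = np.kron(U, rx)
    psi = U @ psi
    return np.abs(psi) ** 2                               # P(x) = |<x|psi>|^2
```

At $\gamma = 0$ the cost unitary is the identity and the mixer only adds a global phase to $|+\rangle^{\otimes n}$, so the measurement distribution stays uniform; a nonzero $\gamma$ starts biasing it toward low-energy bitstrings.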