Version 3.0.1
This framework is the result of a collaboration between the Blekinge Institute of Technology, BTH (Karlskrona, Sweden) and the University of the Italian Chambers of Commerce, Universitas Mercatorum (Rome, Italy).
BLEKFL2 is a web-based research and educational platform for the systematic exploration of statistical heterogeneity in Federated Learning environments. The platform enables controlled experiments across eight distinct heterogeneity categories, implements over 17 federated learning algorithms, and provides interactive visualizations for result analysis. It is designed for researchers and students investigating how non-IID data distributions affect the convergence, accuracy, and fairness of distributed machine learning systems.
- Overview
- System Architecture
- Heterogeneity Modules
- Federated Learning Algorithms
- 3D Federated Learning Visualization
- Datasets
- Metrics and Evaluation
- Installation and Setup
- Project Structure
- API Reference
- Contributing
- Publications and Citations
- License
- Acknowledgments
- Contact
Federated Learning (FL) addresses the challenge of training machine learning models across decentralized data sources while preserving data privacy. However, the performance of FL systems is significantly affected by statistical heterogeneity -- the non-uniform distribution of data across participating clients.
BLEKFL2 provides a unified experimental platform to:
- Simulate controlled FL scenarios with configurable data heterogeneity across eight distinct categories
- Compare 17+ FL algorithms under identical experimental conditions
- Analyze convergence behavior, per-class accuracy, client divergence, and fairness metrics
- Visualize results through interactive charts, distribution plots, and a 3D educational visualization
- Generate scientific reports with comprehensive metrics including ROC curves and confusion matrices
| Component | Technology |
|---|---|
| Backend | Flask (Python 3.11), WSGI-compatible |
| ML Framework | PyTorch (primary), TensorFlow (secondary) |
| FL Framework | Flower (flwr 1.8.0) |
| Frontend | HTML5, Bootstrap 5, JavaScript ES6 |
| Visualization | Chart.js, Three.js, Matplotlib, Plotly |
| Datasets | MNIST, Fashion MNIST, CIFAR-10, CIFAR-100, SVHN |
BLEKFL2 follows a modular architecture based on Flask Blueprints, where each heterogeneity type is implemented as an independent sub-application with its own controller, algorithms, data generators, metrics, templates, and static assets.
```
              +----------------------------------+
              |       FLASK APP (app.py)         |
              |          Port: 5020              |
              +----------------------------------+
                              |
     +------------------------+------------------------+
     |                        |                        |
+--------v--------+  +---------v---------+  +---------v---------+
|   Home Routes   |  |   Legacy FL API   |  |    Blueprint:     |
|   /             |  |   /api/datasets   |  |   /statistical-   |
|   /documentation|  |   /api/federated  |  |   heterogeneity/  |
|   /tutorials    |  |   /api/analytics  |  |                   |
+------------------+ +-------------------+  +---------+----------+
                                                      |
+----------+----------+----------+----------+----------+----------+----------+
|          |          |          |          |          |          |          |
+---v---+ +---v---+ +---v---+ +---v---+ +---v---+ +---v---+ +---v---+ +---v---+
| Label | | Quan. | | Qual. | | Feat. | | Conc. | | Conc. | | Cata. | | Hete. |
| Distr.| | Distr.| | Distr.| | Distr.| | Drift | | Shift | | Forg. | | Synth.|
+-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
```
Each module follows a consistent internal structure:
```
+-------------------------------------------+
|               Sub-Blueprint               |
|   routes.py      |   controller.py        |
|   algorithms/    |   data/                |
|   templates/     |   static/              |
|   metrics/       |                        |
+-------------------------------------------+
```
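The Blueprint-plus-factory layout described above can be sketched in a few lines of Flask. This is a minimal illustration, not the platform's actual code: the names `create_app` and `label_distribution_bp` and the returned JSON are assumptions for the example.

```python
from flask import Blueprint, Flask

# Hypothetical sketch of one autonomous heterogeneity sub-blueprint.
label_distribution_bp = Blueprint(
    "label_distribution",
    __name__,
    url_prefix="/statistical-heterogeneity/label-distribution",
)

@label_distribution_bp.route("/")
def index():
    # Flask serializes a returned dict to JSON automatically.
    return {"module": "label_distribution", "status": "ok"}

def create_app():
    """Factory: assemble the app from independent sub-blueprints."""
    app = Flask(__name__)
    app.register_blueprint(label_distribution_bp)
    return app

# To serve locally:
# create_app().run(port=5020)
```

Because each module registers itself under its own `url_prefix`, the eight heterogeneity modules can evolve independently while sharing one application instance.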
| Pattern | Application |
|---|---|
| Blueprint | Each heterogeneity module is an autonomous Flask Blueprint |
| MVC | Controllers separated from routing and view logic |
| Strategy | Interchangeable FL algorithms per heterogeneity type |
| Observer | Metric tracking with callbacks for ROC, confusion matrices |
| Factory | create_app() for main blueprint instantiation |
| Template Method | Base algorithms with customizable pre/post update hooks |
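The Template Method row can be made concrete with a small sketch: the base class fixes the round skeleton, and a variant overrides only a hook. Class and method names here are illustrative (the platform's actual base classes live under `statistical_heterogeneity/common/algorithms/`), and a single float stands in for a full model.

```python
class BaseFLAlgorithm:
    """Template Method sketch: run_round() is the fixed skeleton;
    subclasses customize behavior via pre/post-update hooks."""

    def run_round(self, global_w, client_updates):
        w = self.pre_update_hook(global_w)
        locals_ = [self.apply_client(w, u) for u in client_updates]
        new_w = sum(locals_) / len(locals_)          # plain averaging
        return self.post_update_hook(global_w, new_w)

    def pre_update_hook(self, w):                    # default: no-op
        return w

    def apply_client(self, w, update):               # default: local step
        return w + update

    def post_update_hook(self, old_w, new_w):        # default: accept as-is
        return new_w


class ServerMomentumFL(BaseFLAlgorithm):
    """FedAvgM-style variant: overrides only the post-update hook to
    apply server-side momentum to the aggregated update."""

    def __init__(self, beta=0.9):
        self.beta, self.velocity = beta, 0.0

    def post_update_hook(self, old_w, new_w):
        self.velocity = self.beta * self.velocity + (new_w - old_w)
        return old_w + self.velocity
```

The same skeleton also supports the Strategy pattern from the table: the training loop is written against the base interface, so algorithms can be swapped without touching server code.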
The platform implements eight categories of statistical heterogeneity, each accessible as an independent experimental module.
| Module | Description | Algorithms | Status |
|---|---|---|---|
| Label Distribution | Non-IID label distributions via Dirichlet and pathological partitioning | FedAvg, FedProx, FedLaS, FedAvgM | Production |
| Quantity Distribution | Imbalanced data volume across clients | FedAvg, FedProx, q-FFL | Functional |
| Quality Distribution | Varying data quality (noise, corruption) across clients | FedAvg, FedProx, NoiseAwareFL, RobustAggregation | Functional |
| Feature Distribution | Covariate shift across client feature spaces | FedAvg, FedProx, FedBN | Functional |
| Conceptual Drift | Temporal drift in data distributions | FedAvg, FedProx, AdaptiveFL, ContinualFL, EWC | Functional |
| Conceptual Shift | Domain shift across client populations | FedAvg, FedProx, MetaLearningFL, DomainGeneralizationFL, TransferFL | Functional |
| Catastrophic Forgetting | Knowledge loss during sequential task learning | FedAvg, FedProx, RegularizationFL, ReplayBasedFL, ArchitecturalFL | Functional |
| Heterogeneity Synthesis | Multi-type heterogeneity analysis and cross-module comparison | Analytical | Functional |
Additional module categories (Model Heterogeneity, Communication Heterogeneity, Hardware Heterogeneity) are planned for future releases.
| Category | Algorithm | Mathematical Approach | Key Parameters |
|---|---|---|---|
| Baseline | FedAvg (McMahan et al., 2017) | Weighted parameter averaging: $w_{t+1} = \sum_k \frac{n_k}{n} w_{t+1}^k$ | num_rounds, client_epochs, client_lr, client_fraction |
| Baseline | FedProx (Li et al., 2020) | Proximal regularization: $\min_w F_k(w) + \frac{\mu}{2}\lVert w - w^t\rVert^2$ | mu (default: 0.01) |
| Baseline | FedAvg-Optimized | SGD + momentum, cosine annealing, AMP | momentum (0.9), weight_decay (5e-4), early stopping |
| Label Skew | FedLaS | Class-weighted loss + knowledge distillation | reg_lambda (0.01), distill_temp (2.0), distill_alpha (0.5) |
| Label Skew | FedAvgM | Server-side momentum on aggregated client updates | momentum (0.9), server_lr (1.0) |
| Quantity | q-FFL (Li et al., 2020) | Fairness-aware weighting: $\min_w \sum_k \frac{p_k}{q+1} F_k^{q+1}(w)$ | q (1.0) |
| Quality | NoiseAwareFL | Client filtering by data quality score | quality_threshold (0.5), credibility_decay (0.9) |
| Quality | RobustAggregation | Outlier-robust aggregation | -- |
| Feature | FedBN (Li et al., 2021) | Local batch normalization, shared remaining parameters | Per-client BN layers |
| Drift | AdaptiveFL | Drift detection + adaptive learning rate adjustment | drift_threshold (0.1), window_size (10) |
| Drift | ContinualFL | Continual learning with task boundaries | -- |
| Drift | EWC (Kirkpatrick et al., 2017) | Fisher information regularization | lambda_ewc, gamma (0.95) |
| Shift | MetaLearningFL | MAML-based inner/outer optimization loop | meta_lr (0.01), inner_lr (0.1), inner_steps (5) |
| Shift | DomainGeneralizationFL | Domain-invariant feature representations | feature_dim (128), invariance_penalty (1.0) |
| Shift | TransferFL | Transfer learning across domains | -- |
| Forgetting | RegularizationFL | SI/MAS/L2 regularization methods | Selectable method |
| Forgetting | ReplayBasedFL | Experience replay + generative replay | memory_size (2000), replay_batch (32) |
| Forgetting | ArchitecturalFL | Progressive networks / PackNet / Piggyback | Expansion method |
The global model update is computed as:

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k}$$

where $K$ is the number of participating clients, $n_k$ is the number of samples on client $k$, $n = \sum_k n_k$, and $w_{t+1}^{k}$ is the locally updated model of client $k$.

Each client solves the following local objective:

$$\min_{w} F_k(w) = \frac{1}{n_k} \sum_{(x_i, y_i) \in \mathcal{D}_k} \ell(w; x_i, y_i)$$

where $\mathcal{D}_k$ is the local dataset of client $k$ and $\ell$ is the loss function (cross-entropy for the classification tasks in this platform).
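As a concrete sketch of the weighted global update, the following plain-Python function treats each client model as a flat list of floats; the function name is illustrative and real implementations operate on framework state dicts instead.

```python
def fedavg_aggregate(client_weights, client_sizes):
    """Compute the sample-weighted average of client parameter vectors,
    i.e. each client contributes proportionally to its local data size.

    client_weights: list of parameter vectors (lists of floats), one per client.
    client_sizes:   n_k, the number of local training samples per client.
    """
    n = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n_k for w, n_k in zip(client_weights, client_sizes)) / n
        for j in range(dim)
    ]

# Two clients: one with 100 samples, one with 300 -> the second client's
# parameters carry three times the weight in every coordinate.
w_new = fedavg_aggregate([[0.0, 4.0], [4.0, 0.0]], [100, 300])
# w_new == [3.0, 1.0]
```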
The platform includes an interactive 3D visualization of the Federated Learning process, implemented with Three.js (r128) and TWEEN.js (18.6.4). This component serves as an educational tool for understanding FL communication rounds.
Route: /federated-learning-3d
The 3D scene consists of a central server node and four client devices (Smartphone, Laptop, Hospital, Smart Home), each rendered with local neural network structures (input/hidden/output layers). Communication channels between server and clients are visualized with animated particle flows.
| Phase | Title | 3D Animation |
|---|---|---|
| 1. Initialization | What is Federated Learning? | Static overview of the FL topology |
| 2. Distribution | Global Model Distribution | Particles flowing from server to clients |
| 3. Local Training | Local Model Training | Neural network node activations on clients |
| 4. Aggregation | Model Aggregation | Particles flowing from clients to server, server pulsation |
| 5. Evaluation | Model Evaluation | Server flash, metrics update |
- Model Architecture: Global Accuracy, Communication Rounds, Participating Clients, Model Parameters
- Weight Updates: Average Update Size, Compression Rate, Weight Divergence
- Client Training: Local Epochs, Batch Size, Learning Rate, Client Resources
- Aggregation: Algorithm Selection, Aggregation Time, Privacy Protection, Noise Level
Note: The 3D visualization currently operates as a standalone educational tool with simulated metrics. Integration with live experiment data is planned for a future release.
| Dataset | Dimensions | Classes | Train/Test Split | Input Shape |
|---|---|---|---|---|
| MNIST | 70,000 grayscale images | 10 (digits 0-9) | 60,000 / 10,000 | 28 x 28 x 1 |
| Fashion MNIST | 70,000 grayscale images | 10 (clothing items) | 60,000 / 10,000 | 28 x 28 x 1 |
| CIFAR-10 | 60,000 color images | 10 (object categories) | 50,000 / 10,000 | 32 x 32 x 3 |
| CIFAR-100 | 60,000 color images | 100 (fine-grained categories) | 50,000 / 10,000 | 32 x 32 x 3 |
| SVHN | 99,289 color images | 10 (house numbers) | 73,257 / 26,032 | 32 x 32 x 3 |
Datasets are downloaded automatically at first use via `torchvision.datasets`.
IID Distribution: Data is uniformly shuffled and divided equally among clients.
Non-IID via Dirichlet Distribution: Each client receives data drawn from a Dirichlet distribution with concentration parameter $\alpha$:

- $\alpha \to 0$: extreme heterogeneity (each client holds data from very few classes)
- $\alpha \to \infty$: approaches an IID distribution
Pathological Non-IID: Each client receives data from a fixed number of classes, following the protocol of McMahan et al. (2017).
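The Dirichlet scheme can be sketched with the standard library alone: for each class, client proportions are sampled from $\mathrm{Dir}(\alpha)$ (via normalized Gamma draws) and that class's sample indices are split accordingly. The function name and signature are illustrative, not the platform's exact API.

```python
import random
from collections import defaultdict

def dirichlet_label_partition(labels, num_clients, alpha, seed=0):
    """Partition sample indices across clients with Dirichlet label skew.
    Small alpha concentrates each class on few clients; large alpha
    approaches a uniform (IID-like) split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for idxs in by_class.values():
        # Dirichlet(alpha) sample as normalized Gamma(alpha, 1) draws.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(g)
        props = [x / total for x in g]
        # Convert proportions to contiguous slices of this class's indices.
        start = 0
        for c in range(num_clients):
            end = len(idxs) if c == num_clients - 1 else start + round(props[c] * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients
```

Every index is assigned to exactly one client, so the partition is a true split of the dataset; decreasing `alpha` makes the per-client class histograms increasingly skewed.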
| Metric | Scope | Description |
|---|---|---|
| Global Accuracy | Per round | Classification accuracy of the aggregated global model on the test set |
| Training Loss | Per round | Cross-entropy loss during local training |
| Per-class Accuracy | Per round/class | Accuracy disaggregated by class label |
| Confusion Matrix | Per round | Full confusion matrix for classification evaluation |
| ROC Curves | Multi-class | Receiver Operating Characteristic curves with AUC |
| Client Divergence | Per round | L2 distance between client model parameters and the global model |
| Feature Drift | Per round | Parameter differences in convolutional and linear layers |
| Class Performance Gap | Per round | Difference between best and worst per-class accuracy |
| Communication Cost | Per round | Estimated communication overhead |
| Training Time | Per round | Wall-clock time for each training round |
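The Client Divergence metric in the table above is a plain Euclidean distance between parameter vectors. As a minimal sketch (flat float lists stand in for real model state dicts; the function name is illustrative):

```python
import math

def client_divergence(client_params, global_params):
    """L2 distance between one client's parameters and the global model."""
    return math.sqrt(
        sum((c - g) ** 2 for c, g in zip(client_params, global_params))
    )

# A client that drifted by 3 along one coordinate and 4 along another
# sits at Euclidean distance 5 from the global model.
d = client_divergence([3.0, 4.0], [0.0, 0.0])
# d == 5.0
```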
| Tool | Description |
|---|---|
| ADWIN Detector | Adaptive windowing for online drift detection |
| Page-Hinkley Test | Sequential change-point detection |
| Temporal Evolution Tracker | Tracking metric evolution across training rounds |
| SharedStateManager | Cross-method state coordination for enhanced algorithms |
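Of the drift tools above, the Page-Hinkley test is compact enough to sketch directly: it accumulates deviations of each observation from the running mean and signals a change when the cumulative sum rises too far above its historical minimum. Parameter names `delta` and `threshold` are conventional for this test, not necessarily the platform's exact API.

```python
class PageHinkley:
    """Minimal Page-Hinkley change-point detector (one-sided, detects
    upward shifts, e.g. a sudden increase in training loss)."""

    def __init__(self, delta=0.005, threshold=1.0):
        self.delta = delta          # tolerated per-step magnitude
        self.threshold = threshold  # alarm level
        self.mean, self.n = 0.0, 0
        self.cum, self.min_cum = 0.0, 0.0

    def update(self, x):
        """Feed one observation; return True when drift is signaled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n        # running mean
        self.cum += x - self.mean - self.delta       # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold
```

Usage: feeding a loss stream that is flat at 0.1 for 30 steps and then jumps to 2.0 triggers the detector at the jump.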
- Python 3.11+ (tested), 3.7+ (minimum)
- pip or conda
- Git
- Approximately 5 GB disk space (including datasets and dependencies)
- NVIDIA GPU with CUDA (optional, for accelerated training)
```bash
git clone https://github.com/FabioLiberti/BLEKFL2_.git
cd BLEKFL2_
```

```bash
# Using venv
python -m venv blekfl2_env
source blekfl2_env/bin/activate   # Linux/macOS
# blekfl2_env\Scripts\activate    # Windows

# Or using conda
conda create -n blekfl2 python=3.11
conda activate blekfl2
```

```bash
# Core dependencies
pip install flask werkzeug jinja2
pip install torch torchvision scikit-learn
pip install numpy scipy pandas
pip install matplotlib plotly seaborn
pip install flwr==1.8.0 flwr-datasets==0.1.0
pip install python-dotenv pyyaml pillow imbalanced-learn

# Optional
pip install tensorflow      # Secondary ML framework
pip install celery flower   # Async task queue
pip install pytest          # Testing
```

```bash
python app.py
```

The application will be available at http://localhost:5020.
| URL | Description |
|---|---|
| `/` | Home page with 3D visualization |
| `/statistical-heterogeneity/` | Statistical heterogeneity hub |
| `/statistical-heterogeneity/label-distribution/` | Label skew experiments |
| `/statistical-heterogeneity/quantity-distribution/` | Quantity skew experiments |
| `/statistical-heterogeneity/quality-distribution/` | Quality skew experiments |
| `/statistical-heterogeneity/feature-distribution/` | Feature skew experiments |
| `/statistical-heterogeneity/conceptual-drift/` | Concept drift experiments |
| `/statistical-heterogeneity/conceptual-shift/` | Concept shift experiments |
| `/statistical-heterogeneity/catastrophic-forgetting/` | Catastrophic forgetting experiments |
| `/statistical-heterogeneity/heterogeneity-synthesis/` | Multi-heterogeneity synthesis |
| `/federated-learning-3d` | 3D FL visualization |
| `/documentation` | Framework documentation |
| `/tutorials` | Tutorials |
| `/research-papers` | Related research papers |
```bash
export ARUBA_SERVER=1
python wsgi.py
```

In production mode, the application uses absolute paths and robust error handling for hosted environments.
```
BLEKFL2/
+-- app.py                          # Main Flask application entry point
+-- wsgi.py                         # WSGI entry point for production
+-- requirements.txt                # Python dependencies
+-- core/                           # Core modules
|   +-- federated_learning.py       # Core FL training loop and aggregation
|   +-- dataset_manager.py          # Dataset loading, partitioning, distribution
|   +-- advanced_analytics.py       # Advanced analytics and centralized baseline
|   +-- federated_simulator.py      # FL simulation utilities
|   +-- app_extensions.py           # API extensions registration
|   +-- research_papers.py          # Research papers metadata
|   +-- web_batch_testing.py        # Batch testing interface
+-- statistical_heterogeneity/      # Heterogeneity modules (8 sub-blueprints)
|   +-- common/                     # Shared utilities and base algorithms
|   |   +-- algorithms/             # FedAvg, FedProx, FedAvg-Optimized, Enhanced variants
|   |   +-- models/                 # SimpleCNN and shared model architectures
|   |   +-- orchestration/          # ML and Mitigation Orchestrators
|   |   +-- state/                  # SharedStateManager
|   |   +-- statistical/            # ADWIN, Page-Hinkley, FL Statistical Framework
|   |   +-- tracking/               # Temporal Evolution Tracker
|   |   +-- validation/             # Empirical Validation Framework
|   +-- label_distribution/         # Label skew module (most complete)
|   +-- quantity_distribution/      # Quantity skew module
|   +-- quality_distribution/       # Quality skew module
|   +-- feature_distribution/       # Feature skew module
|   +-- conceptual_drift/           # Concept drift module
|   +-- conceptual_shift/           # Concept shift module
|   +-- catastrophic_forgetting/    # Catastrophic forgetting module
|   +-- heterogeneity_synthesis/    # Multi-heterogeneity synthesis
+-- templates/                      # Root-level HTML templates
+-- static/                         # Static assets (CSS, JS, images)
|   +-- js/fl_visualization_3d.js   # Three.js 3D visualization engine
|   +-- css/fl_visualization_3d.css
+-- tests/                          # Test suites
+-- benchmarks/                     # Benchmark scripts
+-- tools/                          # Monitoring and utility scripts
+-- docs/                           # Additional documentation
+-- Documentazione/                 # Research documentation and papers
```
The platform exposes over 100 REST API endpoints. The main categories are summarized below.
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Home page |
| GET | `/statistical-heterogeneity/` | Heterogeneity hub |
| GET | `/federated-learning-3d` | 3D visualization |
| GET | `/documentation` | Documentation |
| GET | `/research-papers` | Research papers |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/datasets` | List available datasets |
| POST | `/api/dataset/load` | Load a full dataset |
| POST | `/api/dataset/subset` | Load a dataset subset |
| POST | `/api/federated/initialize` | Start FL simulation |
| GET | `/api/federated/status` | Get simulation progress |
| GET | `/api/federated/results` | Get simulation results |
| POST | `/api/federated/stop` | Stop running simulation |
| GET | `/api/analytics/advanced` | Advanced analytics results |
| GET/POST | `/api/analytics/centralized` | Centralized training baseline |
Each heterogeneity module exposes a consistent set of endpoints:
| Method | Endpoint Pattern | Description |
|---|---|---|
| GET | `/api/experiments` | List experiments |
| POST | `/api/run-experiment` | Run a new experiment |
| GET | `/api/experiments/<id>` | Experiment details and results |
| GET | `/api/experiments/<id>/visualizations` | Generated visualizations |
| GET | `/api/experiments/<id>/report` | Downloadable PDF report |
| POST | `/api/compare-experiments` | Multi-experiment comparison |
| POST | `/api/visualize-distribution` | Generate distribution visualization |
The Label Distribution module additionally provides Alpha Benchmark endpoints for systematic evaluation across Dirichlet concentration parameters.
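A typical experiment launch against a module's `POST /api/run-experiment` endpoint can be sketched with the standard library. The payload field names (`dataset`, `algorithm`, `num_clients`, `num_rounds`, `alpha`) are illustrative assumptions; consult the module documentation for the exact request schema.

```python
import json
from urllib import request

BASE = "http://localhost:5020/statistical-heterogeneity/label-distribution"

# Hypothetical experiment configuration for a label-skew run.
payload = {
    "dataset": "mnist",
    "algorithm": "fedavg",
    "num_clients": 10,
    "num_rounds": 20,
    "alpha": 0.5,   # Dirichlet concentration: lower = more label skew
}

req = request.Request(
    f"{BASE}/api/run-experiment",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the server running on port 5020:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```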
Contributions are welcome. Potential areas include:
- Implementation of additional FL algorithms
- Support for new datasets
- Expansion of planned modules (Model, Communication, Hardware Heterogeneity)
- Enhancement of visualization components
- Documentation improvements
- Fork the repository
- Create a feature branch: `git checkout -b feature/new-algorithm`
- Commit your changes: `git commit -m 'Add new federated algorithm XYZ'`
- Push to the branch: `git push origin feature/new-algorithm`
- Submit a pull request
If you use BLEKFL2 in your research, please cite:
```bibtex
@software{liberti2024blekfl2,
  title={BLEKFL2: A Platform for Studying Statistical Heterogeneity in Federated Learning},
  author={Liberti, Fabio},
  year={2024},
  url={https://github.com/FabioLiberti/BLEKFL2_}
}
```

- McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS).
- Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020). Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems (MLSys).
- Li, X., Jiang, M., Zhang, X., Kamp, M., & Dou, Q. (2021). FedBN: Federated learning on non-IID features via local batch normalization. Proceedings of the 9th International Conference on Learning Representations (ICLR).
- Kirkpatrick, J., Pascanu, R., Rabinowitz, N., et al. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13), 3521-3526.
- Kairouz, P., McMahan, H. B., et al. (2021). Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1-2), 1-210.
This project is licensed under the MIT License. See the LICENSE file for details.
- This research was supported by Blekinge Institute of Technology (BTH), Karlskrona, Sweden, and Universitas Mercatorum, Rome, Italy.
- The implementation relies on PyTorch, Flask, Flower, Three.js, and Chart.js.
- We acknowledge the broader federated learning research community whose work informed the algorithm implementations in this platform.
- Fabio Liberti -- Project Lead -- fabioliberti.fl@gmail.com
- Repository: https://github.com/FabioLiberti/BLEKFL2_
A collaborative research platform by BTH (Karlskrona, Sweden) and Universitas Mercatorum (Rome, Italy)