BLEKFL2 - Federated Learning Heterogeneity Explorer

Version 3.0.1


This framework is the result of a collaboration between the Blekinge Institute of Technology, BTH (Karlskrona, Sweden) and the University of the Italian Chambers of Commerce, Universitas Mercatorum (Rome, Italy).

BLEKFL2 is a web-based research and educational platform for the systematic exploration of statistical heterogeneity in Federated Learning environments. The platform enables controlled experiments across eight distinct heterogeneity categories, implements over 17 federated learning algorithms, and provides interactive visualizations for result analysis. It is designed for researchers and students investigating how non-IID data distributions affect the convergence, accuracy, and fairness of distributed machine learning systems.

Home screenshot


Overview

Federated Learning (FL) addresses the challenge of training machine learning models across decentralized data sources while preserving data privacy. However, the performance of FL systems is significantly affected by statistical heterogeneity -- the non-uniform distribution of data across participating clients.

BLEKFL2 provides a unified experimental platform to:

  • Simulate controlled FL scenarios with configurable data heterogeneity across eight distinct categories
  • Compare 17+ FL algorithms under identical experimental conditions
  • Analyze convergence behavior, per-class accuracy, client divergence, and fairness metrics
  • Visualize results through interactive charts, distribution plots, and a 3D educational visualization
  • Generate scientific reports with comprehensive metrics including ROC curves and confusion matrices

Technology Stack

| Component | Technology |
|---|---|
| Backend | Flask (Python 3.11), WSGI-compatible |
| ML Framework | PyTorch (primary), TensorFlow (secondary) |
| FL Framework | Flower (flwr 1.8.0) |
| Frontend | HTML5, Bootstrap 5, JavaScript ES6 |
| Visualization | Chart.js, Three.js, Matplotlib, Plotly |
| Datasets | MNIST, Fashion MNIST, CIFAR-10, CIFAR-100, SVHN |

System Architecture

BLEKFL2 follows a modular architecture based on Flask Blueprints, where each heterogeneity type is implemented as an independent sub-application with its own controller, algorithms, data generators, metrics, templates, and static assets.

                        +----------------------------------+
                        |       FLASK APP (app.py)         |
                        |       Port: 5020                 |
                        +----------------------------------+
                                       |
              +------------------------+------------------------+
              |                        |                        |
      +--------v---------+    +---------v---------+    +---------v----------+
      |   Home Routes    |    |   Legacy FL API   |    |   Blueprint:       |
      |   /              |    |   /api/datasets   |    |   /statistical-    |
      |   /documentation |    |   /api/federated  |    |    heterogeneity/  |
      |   /tutorials     |    |   /api/analytics  |    |                    |
      +------------------+    +-------------------+    +---------+----------+
                                                                |
         +----------+----------+----------+----------+----------+----------+----------+
         |          |          |          |          |          |          |          |
      +---v---+  +---v---+  +---v---+  +---v---+  +---v---+  +---v---+  +---v---+  +---v---+
      | Label |  | Quan. |  | Qual. |  | Feat. |  | Conc. |  | Conc. |  | Cata. |  | Hete. |
      | Distr.|  | Distr.|  | Distr.|  | Distr.|  | Drift |  | Shift |  | Forg. |  | Synth.|
      +-------+  +-------+  +-------+  +-------+  +-------+  +-------+  +-------+  +-------+

Each module follows a consistent internal structure:
+-------------------------------------------+
|  Sub-Blueprint                            |
|  routes.py     | controller.py            |
|  algorithms/   | data/                    |
|  templates/    | static/                  |
|  metrics/      |                          |
+-------------------------------------------+

Design Patterns

| Pattern | Application |
|---|---|
| Blueprint | Each heterogeneity module is an autonomous Flask Blueprint |
| MVC | Controllers separated from routing and view logic |
| Strategy | Interchangeable FL algorithms per heterogeneity type |
| Observer | Metric tracking with callbacks for ROC, confusion matrices |
| Factory | create_app() for main blueprint instantiation |
| Template Method | Base algorithms with customizable pre/post update hooks |
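The Blueprint and Factory patterns above can be sketched as follows. This is a minimal, hypothetical illustration: only the `create_app()` name and the URL prefix come from this document; the route body and blueprint internals are placeholders for the real module structure (controllers, templates, static assets).

```python
from flask import Flask, Blueprint

# Hypothetical, simplified stand-in for one heterogeneity sub-blueprint
label_distribution_bp = Blueprint(
    "label_distribution",
    __name__,
    url_prefix="/statistical-heterogeneity/label-distribution",
)

@label_distribution_bp.route("/")
def index():
    # Real modules delegate to a controller; this is just a placeholder
    return "Label skew experiments"

def create_app():
    """Factory: assemble the application from independent blueprints."""
    app = Flask(__name__)
    app.register_blueprint(label_distribution_bp)
    return app
```

Because each module owns its routes and assets behind a single `register_blueprint` call, modules can be developed and tested in isolation.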

Heterogeneity Modules

The platform implements eight categories of statistical heterogeneity, each accessible as an independent experimental module.

| Module | Description | Algorithms | Status |
|---|---|---|---|
| Label Distribution | Non-IID label distributions via Dirichlet and pathological partitioning | FedAvg, FedProx, FedLaS, FedAvgM | Production |
| Quantity Distribution | Imbalanced data volume across clients | FedAvg, FedProx, q-FFL | Functional |
| Quality Distribution | Varying data quality (noise, corruption) across clients | FedAvg, FedProx, NoiseAwareFL, RobustAggregation | Functional |
| Feature Distribution | Covariate shift across client feature spaces | FedAvg, FedProx, FedBN | Functional |
| Conceptual Drift | Temporal drift in data distributions | FedAvg, FedProx, AdaptiveFL, ContinualFL, EWC | Functional |
| Conceptual Shift | Domain shift across client populations | FedAvg, FedProx, MetaLearningFL, DomainGeneralizationFL, TransferFL | Functional |
| Catastrophic Forgetting | Knowledge loss during sequential task learning | FedAvg, FedProx, RegularizationFL, ReplayBasedFL, ArchitecturalFL | Functional |
| Heterogeneity Synthesis | Multi-type heterogeneity analysis and cross-module comparison | Analytical | Functional |

Additional module categories (Model Heterogeneity, Communication Heterogeneity, Hardware Heterogeneity) are planned for future releases.


Federated Learning Algorithms

Algorithm Reference

| Category | Algorithm | Mathematical Approach | Key Parameters |
|---|---|---|---|
| Baseline | FedAvg (McMahan et al., 2017) | Weighted parameter averaging: $w_{t+1} = \sum_k \frac{n_k}{n} w_k^t$ | num_rounds, client_epochs, client_lr, client_fraction |
| Baseline | FedProx (Li et al., 2020) | Proximal regularization: $\min_w F_k(w) + \frac{\mu}{2} \lVert w - w^t \rVert^2$ | mu (default: 0.01) |
| Baseline | FedAvg-Optimized | SGD + momentum, cosine annealing, AMP | momentum (0.9), weight_decay (5e-4), early stopping |
| Label Skew | FedLaS | Class-weighted loss + knowledge distillation | reg_lambda (0.01), distill_temp (2.0), distill_alpha (0.5) |
| Label Skew | FedAvgM | Server-side momentum: $v_{t+1} = \beta v_t + \Delta w$ | momentum (0.9), server_lr (1.0) |
| Quantity | q-FFL (Li et al., 2020) | Fairness-aware weighting: $w_i = \ell_i^q$ | q (1.0) |
| Quality | NoiseAwareFL | Client filtering by data quality score | quality_threshold (0.5), credibility_decay (0.9) |
| Quality | RobustAggregation | Outlier-robust aggregation | -- |
| Feature | FedBN (Li et al., 2021) | Local batch normalization, shared remaining parameters | Per-client BN layers |
| Drift | AdaptiveFL | Drift detection + adaptive learning rate adjustment | drift_threshold (0.1), window_size (10) |
| Drift | ContinualFL | Continual learning with task boundaries | -- |
| Drift | EWC (Kirkpatrick et al., 2017) | Fisher information regularization | lambda_ewc, gamma (0.95) |
| Shift | MetaLearningFL | MAML-based inner/outer optimization loop | meta_lr (0.01), inner_lr (0.1), inner_steps (5) |
| Shift | DomainGeneralizationFL | Domain-invariant feature representations | feature_dim (128), invariance_penalty (1.0) |
| Shift | TransferFL | Transfer learning across domains | -- |
| Forgetting | RegularizationFL | SI/MAS/L2 regularization methods | Selectable method |
| Forgetting | ReplayBasedFL | Experience replay + generative replay | memory_size (2000), replay_batch (32) |
| Forgetting | ArchitecturalFL | Progressive networks / PackNet / Piggyback | Expansion method |

Core Aggregation: FedAvg

The global model update is computed as:

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} w_k^t$$

where $w_k^t$ denotes the parameters of client $k$ after local training at round $t$, $n_k$ is the local dataset size, and $n = \sum_k n_k$.
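A minimal sketch of this weighted average over PyTorch `state_dict`s (illustrative only; the platform's actual implementation lives in `core/federated_learning.py` and may differ):

```python
import torch

def fedavg_aggregate(client_states, client_sizes):
    """Weighted average of client state_dicts: w_{t+1} = sum_k (n_k / n) * w_k^t."""
    n = sum(client_sizes)
    agg = {}
    for key in client_states[0]:
        agg[key] = sum(
            (n_k / n) * state[key].float()
            for state, n_k in zip(client_states, client_sizes)
        )
    return agg

# Toy example: two "clients" holding a single scalar parameter
c1 = {"w": torch.tensor([1.0])}
c2 = {"w": torch.tensor([3.0])}
global_state = fedavg_aggregate([c1, c2], client_sizes=[10, 30])
print(global_state["w"])  # tensor([2.5000])  = (10/40)*1.0 + (30/40)*3.0
```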

Proximal Regularization: FedProx

Each client solves the following local objective:

$$\min_{w} F_k(w) + \frac{\mu}{2} \|w - w^t\|^2$$

where $F_k(w)$ is the local empirical loss and $\mu$ controls the strength of the proximal constraint to the global model $w^t$.
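As an illustration, the proximal term can be folded into a standard local training step as below. This is a sketch, not the platform's code: the function name, signature, and toy model are invented for the example.

```python
import torch
import torch.nn as nn

def fedprox_local_step(model, global_params, batch, loss_fn, optimizer, mu=0.01):
    """One local step on the FedProx objective F_k(w) + (mu/2)||w - w^t||^2."""
    x, y = batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    # Proximal term: penalize drift from the frozen global weights w^t
    prox = sum(((p - g) ** 2).sum()
               for p, g in zip(model.parameters(), global_params))
    (loss + 0.5 * mu * prox).backward()
    optimizer.step()
    return loss.item()

# Toy usage on random data
model = nn.Linear(4, 2)
global_params = [p.detach().clone() for p in model.parameters()]
batch = (torch.randn(8, 4), torch.randint(0, 2, (8,)))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
local_loss = fedprox_local_step(model, global_params, batch,
                                nn.CrossEntropyLoss(), opt, mu=0.01)
```

With `mu=0`, the step reduces to plain local SGD, i.e. the FedAvg client update.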


3D Federated Learning Visualization

The platform includes an interactive 3D visualization of the Federated Learning process, implemented with Three.js (r128) and TWEEN.js (18.6.4). This component serves as an educational tool for understanding FL communication rounds.

Route: /federated-learning-3d

Scene Composition

The 3D scene consists of a central server node and four client devices (Smartphone, Laptop, Hospital, Smart Home), each rendered with local neural network structures (input/hidden/output layers). Communication channels between server and clients are visualized with animated particle flows.

Interactive Tutorial (5 Phases)

| Phase | Title | 3D Animation |
|---|---|---|
| 1. Initialization | What is Federated Learning? | Static overview of the FL topology |
| 2. Distribution | Global Model Distribution | Particles flowing from server to clients |
| 3. Local Training | Local Model Training | Neural network node activations on clients |
| 4. Aggregation | Model Aggregation | Particles flowing from clients to server, server pulsation |
| 5. Evaluation | Model Evaluation | Server flash, metrics update |

Metrics Dashboard (4 Tabs)

  • Model Architecture: Global Accuracy, Communication Rounds, Participating Clients, Model Parameters
  • Weight Updates: Average Update Size, Compression Rate, Weight Divergence
  • Client Training: Local Epochs, Batch Size, Learning Rate, Client Resources
  • Aggregation: Algorithm Selection, Aggregation Time, Privacy Protection, Noise Level

Note: The 3D visualization currently operates as a standalone educational tool with simulated metrics. Integration with live experiment data is planned for a future release.


Datasets

| Dataset | Size | Classes | Train/Test Split | Input Shape |
|---|---|---|---|---|
| MNIST | 70,000 grayscale images | 10 (digits 0-9) | 60,000 / 10,000 | 28 x 28 x 1 |
| Fashion MNIST | 70,000 grayscale images | 10 (clothing items) | 60,000 / 10,000 | 28 x 28 x 1 |
| CIFAR-10 | 60,000 color images | 10 (object categories) | 50,000 / 10,000 | 32 x 32 x 3 |
| CIFAR-100 | 60,000 color images | 100 (fine-grained categories) | 50,000 / 10,000 | 32 x 32 x 3 |
| SVHN | 99,289 color images | 10 (house numbers) | 73,257 / 26,032 | 32 x 32 x 3 |

Datasets are downloaded automatically at first use via torchvision.datasets.

Data Partitioning Strategies

IID Distribution: Data is uniformly shuffled and divided equally among clients.

Non-IID via Dirichlet Distribution: Each client receives data drawn from a Dirichlet distribution with concentration parameter $\alpha$:

$$p_k \sim \text{Dir}(\alpha)$$

  • $\alpha \to 0$: extreme heterogeneity (each client holds data from very few classes)
  • $\alpha \to \infty$: approaches IID distribution

Pathological Non-IID: Each client receives data from a fixed number of classes, following the protocol of McMahan et al. (2017).
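A common way to implement the Dirichlet partitioning described above is to draw, for each class, per-client proportions from $\text{Dir}(\alpha)$ and split that class's indices accordingly. This is a sketch under those assumptions; the platform's `dataset_manager.py` may differ in details.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Partition sample indices so each class is split across clients
    according to proportions drawn from Dir(alpha)."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # One Dirichlet draw per class: how much of class c each client gets
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices

labels = np.array([0] * 50 + [1] * 50)
parts = dirichlet_partition(labels, num_clients=4, alpha=0.1)
# With small alpha, each client's shard tends to be dominated by few classes
```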


Metrics and Evaluation

| Metric | Scope | Description |
|---|---|---|
| Global Accuracy | Per round | Classification accuracy of the aggregated global model on the test set |
| Training Loss | Per round | Cross-entropy loss during local training |
| Per-class Accuracy | Per round/class | Accuracy disaggregated by class label |
| Confusion Matrix | Per round | Full confusion matrix for classification evaluation |
| ROC Curves | Multi-class | Receiver Operating Characteristic curves with AUC |
| Client Divergence | Per round | L2 distance between client model parameters and the global model |
| Feature Drift | Per round | Parameter differences in convolutional and linear layers |
| Class Performance Gap | Per round | Difference between best and worst per-class accuracy |
| Communication Cost | Per round | Estimated communication overhead |
| Training Time | Per round | Wall-clock time for each training round |
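For example, the Client Divergence metric (L2 distance between a client's parameters and the global model) can be computed over `state_dict`s like this (an illustrative sketch, not the platform's exact implementation):

```python
import torch

def client_divergence(client_state, global_state):
    """L2 distance between a client's parameters and the global model."""
    sq = 0.0
    for key in global_state:
        diff = client_state[key].float() - global_state[key].float()
        sq += (diff ** 2).sum().item()
    return sq ** 0.5

g = {"w": torch.tensor([0.0, 0.0])}
c = {"w": torch.tensor([3.0, 4.0])}
print(client_divergence(c, g))  # 5.0
```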

Statistical Analysis Tools

| Tool | Description |
|---|---|
| ADWIN Detector | Adaptive windowing for online drift detection |
| Page-Hinkley Test | Sequential change-point detection |
| Temporal Evolution Tracker | Tracking metric evolution across training rounds |
| SharedStateManager | Cross-method state coordination for enhanced algorithms |

Installation and Setup

Prerequisites

  • Python 3.11+ (tested), 3.7+ (minimum)
  • pip or conda
  • Git
  • Approximately 5 GB disk space (including datasets and dependencies)
  • NVIDIA GPU with CUDA (optional, for accelerated training)

Step 1: Clone the Repository

git clone https://github.com/FabioLiberti/BLEKFL2_.git
cd BLEKFL2_

Step 2: Create a Virtual Environment

# Using venv
python -m venv blekfl2_env
source blekfl2_env/bin/activate   # Linux/macOS
# blekfl2_env\Scripts\activate    # Windows

# Or using conda
conda create -n blekfl2 python=3.11
conda activate blekfl2

Step 3: Install Dependencies

# Core dependencies
pip install flask werkzeug jinja2
pip install torch torchvision scikit-learn
pip install numpy scipy pandas
pip install matplotlib plotly seaborn
pip install flwr==1.8.0 flwr-datasets==0.1.0
pip install python-dotenv pyyaml pillow imbalanced-learn

# Optional
pip install tensorflow          # Secondary ML framework
pip install celery flower       # Async task queue (Celery monitoring UI; unrelated to the flwr FL framework)
pip install pytest              # Testing

Step 4: Launch the Application

python app.py

The application will be available at http://localhost:5020.

Step 5: Navigate the Interface

| URL | Description |
|---|---|
| / | Home page with 3D visualization |
| /statistical-heterogeneity/ | Statistical heterogeneity hub |
| /statistical-heterogeneity/label-distribution/ | Label skew experiments |
| /statistical-heterogeneity/quantity-distribution/ | Quantity skew experiments |
| /statistical-heterogeneity/quality-distribution/ | Quality skew experiments |
| /statistical-heterogeneity/feature-distribution/ | Feature skew experiments |
| /statistical-heterogeneity/conceptual-drift/ | Concept drift experiments |
| /statistical-heterogeneity/conceptual-shift/ | Concept shift experiments |
| /statistical-heterogeneity/catastrophic-forgetting/ | Catastrophic forgetting experiments |
| /statistical-heterogeneity/heterogeneity-synthesis/ | Multi-heterogeneity synthesis |
| /federated-learning-3d | 3D FL visualization |
| /documentation | Framework documentation |
| /tutorials | Tutorials |
| /research-papers | Related research papers |

Production Deployment

export ARUBA_SERVER=1
python wsgi.py

In production mode, the application uses absolute paths and robust error handling for hosted environments.


Project Structure

BLEKFL2/
+-- app.py                         # Main Flask application entry point
+-- wsgi.py                        # WSGI entry point for production
+-- requirements.txt               # Python dependencies
+-- core/                          # Core modules
|   +-- federated_learning.py      # Core FL training loop and aggregation
|   +-- dataset_manager.py         # Dataset loading, partitioning, distribution
|   +-- advanced_analytics.py      # Advanced analytics and centralized baseline
|   +-- federated_simulator.py     # FL simulation utilities
|   +-- app_extensions.py          # API extensions registration
|   +-- research_papers.py         # Research papers metadata
|   +-- web_batch_testing.py       # Batch testing interface
+-- statistical_heterogeneity/     # Heterogeneity modules (8 sub-blueprints)
|   +-- common/                    # Shared utilities and base algorithms
|   |   +-- algorithms/            # FedAvg, FedProx, FedAvg-Optimized, Enhanced variants
|   |   +-- models/                # SimpleCNN and shared model architectures
|   |   +-- orchestration/         # ML and Mitigation Orchestrators
|   |   +-- state/                 # SharedStateManager
|   |   +-- statistical/           # ADWIN, Page-Hinkley, FL Statistical Framework
|   |   +-- tracking/              # Temporal Evolution Tracker
|   |   +-- validation/            # Empirical Validation Framework
|   +-- label_distribution/        # Label skew module (most complete)
|   +-- quantity_distribution/     # Quantity skew module
|   +-- quality_distribution/      # Quality skew module
|   +-- feature_distribution/      # Feature skew module
|   +-- conceptual_drift/          # Concept drift module
|   +-- conceptual_shift/          # Concept shift module
|   +-- catastrophic_forgetting/   # Catastrophic forgetting module
|   +-- heterogeneity_synthesis/   # Multi-heterogeneity synthesis
+-- templates/                     # Root-level HTML templates
+-- static/                        # Static assets (CSS, JS, images)
|   +-- js/fl_visualization_3d.js  # Three.js 3D visualization engine
|   +-- css/fl_visualization_3d.css
+-- tests/                         # Test suites
+-- benchmarks/                    # Benchmark scripts
+-- tools/                         # Monitoring and utility scripts
+-- docs/                          # Additional documentation
+-- Documentazione/                # Research documentation and papers

API Reference

The platform exposes over 100 REST API endpoints. The main categories are summarized below.

Root Application Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | / | Home page |
| GET | /statistical-heterogeneity/ | Heterogeneity hub |
| GET | /federated-learning-3d | 3D visualization |
| GET | /documentation | Documentation |
| GET | /research-papers | Research papers |

Legacy FL Simulation API

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/datasets | List available datasets |
| POST | /api/dataset/load | Load a full dataset |
| POST | /api/dataset/subset | Load a dataset subset |
| POST | /api/federated/initialize | Start FL simulation |
| GET | /api/federated/status | Get simulation progress |
| GET | /api/federated/results | Get simulation results |
| POST | /api/federated/stop | Stop running simulation |
| GET | /api/analytics/advanced | Advanced analytics results |
| GET/POST | /api/analytics/centralized | Centralized training baseline |

Per-Module API Pattern

Each heterogeneity module exposes a consistent set of endpoints:

| Method | Endpoint Pattern | Description |
|---|---|---|
| GET | /api/experiments | List experiments |
| POST | /api/run-experiment | Run a new experiment |
| GET | /api/experiments/&lt;id&gt; | Experiment details and results |
| GET | /api/experiments/&lt;id&gt;/visualizations | Generated visualizations |
| GET | /api/experiments/&lt;id&gt;/report | Downloadable PDF report |
| POST | /api/compare-experiments | Multi-experiment comparison |
| POST | /api/visualize-distribution | Generate distribution visualization |
The Label Distribution module additionally provides Alpha Benchmark endpoints for systematic evaluation across Dirichlet concentration parameters.
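A hypothetical client-side call against this per-module pattern, using only the Python standard library. The payload fields (`dataset`, `algorithm`, `num_rounds`, `alpha`) are assumptions for illustration; consult each module's documentation for the actual request schema.

```python
import json
from urllib import request, error

BASE = "http://localhost:5020/statistical-heterogeneity/label-distribution"

def run_experiment(payload):
    """POST a new experiment to the module's API; returns the decoded JSON response."""
    req = request.Request(
        f"{BASE}/api/run-experiment",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

payload = {"dataset": "mnist", "algorithm": "fedavg", "num_rounds": 10, "alpha": 0.5}
try:
    print(run_experiment(payload))
except error.URLError:
    print("Server not reachable; start it with `python app.py` first.")
```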


Contributing

Contributions are welcome. Potential areas include:

  1. Implementation of additional FL algorithms
  2. Support for new datasets
  3. Expansion of planned modules (Model, Communication, Hardware Heterogeneity)
  4. Enhancement of visualization components
  5. Documentation improvements

Contribution Guidelines

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/new-algorithm
  3. Commit your changes: git commit -m 'Add new federated algorithm XYZ'
  4. Push to the branch: git push origin feature/new-algorithm
  5. Submit a pull request

Publications and Citations

If you use BLEKFL2 in your research, please cite:

@software{liberti2024blekfl2,
  title={BLEKFL2: A Platform for Studying Statistical Heterogeneity in Federated Learning},
  author={Liberti, Fabio},
  year={2024},
  url={https://github.com/FabioLiberti/BLEKFL2_}
}

Key References

  1. McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS).

  2. Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020). Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems (MLSys).

  3. Li, X., Jiang, M., Zhang, X., Kamp, M., & Dou, Q. (2021). FedBN: Federated learning on non-IID features via local batch normalization. Proceedings of the 9th International Conference on Learning Representations (ICLR).

  4. Kirkpatrick, J., Pascanu, R., Rabinowitz, N., et al. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13), 3521-3526.

  5. Kairouz, P., McMahan, H. B., et al. (2021). Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1-2), 1-210.


License

This project is licensed under the MIT License. See the LICENSE file for details.


Acknowledgments

  • This research was supported by Blekinge Institute of Technology (BTH), Karlskrona, Sweden, and Universitas Mercatorum, Rome, Italy.
  • The implementation relies on PyTorch, Flask, Flower, Three.js, and Chart.js.
  • We acknowledge the broader federated learning research community whose work informed the algorithm implementations in this platform.

Contact


A collaborative research platform by BTH (Karlskrona, Sweden) and Universitas Mercatorum (Rome, Italy)
