This project implements a comprehensive multi-user authentication system using neural networks for analyzing accelerometer-based behavioral biometrics. The system processes acceleration data from 10 different users and employs both pre-optimized and post-optimized neural network architectures to achieve robust user authentication.
- Develop a robust multi-user authentication system using accelerometer data
- Compare performance between pre-optimized and post-optimized neural network architectures
- Analyze behavioral biometric patterns across different data domains (time, frequency, and combined)
- Implement comprehensive performance evaluation metrics for authentication systems
PUSL3123-Coursework/
├── README.md # This file
├── PUSL3123-Coursework-2024.pdf # Project specification document
├── (01)-Data-Analysis/ # Initial data analysis scripts
│ ├── load_data.m # Data loading utility
│ ├── analyze_inter_variance.m # Inter-user variance analysis
│ └── analyze_intra_variance.m # Intra-user variance analysis
├── (02)-Pre-optimized-Neural-Network/ # Pre-optimization implementation
│ ├── main.m # Main execution script
│ ├── 1_data_analysis/ # Data analysis modules
│ ├── 2_neural_network/ # Neural network implementation
│ │ ├── prepare_data/ # Data preparation utilities
│ │ ├── train/ # Training modules
│ │ ├── evaluate/ # Evaluation utilities
│ │ └── results/ # Output results
│ └── CW-Data/ # Dataset (60 .mat files)
└── (03)-Post-optimized-Neural-Network/ # Post-optimization implementation
├── main.m # Main execution script
├── 1_data_analysis/ # Enhanced data analysis
├── 2_neural_network/ # Optimized neural network
│ ├── prepare_data/ # Enhanced data preparation
│ ├── train/ # Optimized training algorithms
│ ├── evaluate/ # Advanced evaluation metrics
│ └── results/ # Comparative results
└── CW-Data/ # Dataset (60 .mat files)
The dataset consists of accelerometer-based behavioral biometric data from 10 users (U01-U10). Each user has 6 different data files representing different feature extraction methods and collection scenarios:
- Time Domain (TimeD): Raw accelerometer signals in time domain
- Frequency Domain (FreqD): Frequency-transformed features
- Combined (TimeD_FreqD): Hybrid time-frequency features
- Same Day (FDay): Training and testing data collected on the same day
- Cross Day (MDay): Training and testing data collected on different days
U[XX]_Acc_[Domain]_[Scenario].mat
- XX: User ID (01-10)
- Domain: TimeD, FreqD, or TimeD_FreqD
- Scenario: FDay or MDay
Total Dataset: 60 .mat files (10 users × 6 data variations each)
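As an illustration of this naming convention, the hedged MATLAB sketch below discovers the files and loads one of them. The CW-Data path follows the project tree above; the variable names stored inside each .mat file are not assumed and are simply inspected.

```matlab
% Discover all dataset files following U[XX]_Acc_[Domain]_[Scenario].mat
files = dir(fullfile('CW-Data', 'U*_Acc_*.mat'));
fprintf('Found %d data files\n', numel(files));

% Parse user ID, domain, and scenario out of a filename
tok = regexp(files(1).name, '^U(\d{2})_Acc_(\w+)_(FDay|MDay)\.mat$', 'tokens', 'once');
fprintf('User %s, domain %s, scenario %s\n', tok{1}, tok{2}, tok{3});

% Load one example file and inspect the variable(s) it contains
S = load(fullfile('CW-Data', 'U01_Acc_TimeD_FDay.mat'));
disp(fieldnames(S));
```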
- Architecture: Standard feedforward neural network
- Hidden Layers: Fixed architecture
- Activation Functions: Default MATLAB configurations
- Training Algorithm: Standard backpropagation
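For comparison with the optimized version described next, a minimal sketch of such a fixed baseline is shown below. The hidden-layer size of 10, the dummy data dimensions, and the use of patternnet are illustrative assumptions rather than the project's exact configuration.

```matlab
% Illustrative dummy data: 60 features x 300 samples, 10 users, one-hot targets
X = randn(60, 300);
T = full(ind2vec(randi(10, 1, 300), 10));

% Baseline: fixed hidden-layer size, MATLAB defaults for everything else
net = patternnet(10);                 % default 'trainscg' training and data division
[net, tr] = train(net, X, T);         % standard backpropagation-based training
Y = net(X);                           % per-user output scores
```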
- Dynamic Architecture: Adaptive hidden layer sizing using multiple heuristics
- Optimized Training: Enhanced training algorithms with early stopping
- Advanced Metrics: Comprehensive authentication performance evaluation
- Regularization: Improved generalization techniques
- Dynamic Neuron Calculation (sketched in code after this list):
  - Geometric pyramid rule
  - Input size-based rule
  - Sample size-based rule
  - Median-based final selection
- Training Enhancements:
  - Monitoring and early stopping
  - Adaptive learning rates
  - Cross-validation
- Performance Metrics:
  - False Acceptance Rate (FAR)
  - False Rejection Rate (FRR)
  - Equal Error Rate (EER)
  - Area Under Curve (AUC)
  - F1-Score
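A hedged sketch of the sizing and training techniques follows, reusing the X (features-by-samples) and T (one-hot targets) conventions from the baseline sketch above. The heuristic constants (the 2/3 factor, alpha = 5), the split ratios, and the max_fail value are illustrative assumptions, not the project's tuned values.

```matlab
% Dynamic hidden-layer sizing: several heuristics, then a median-based selection
nIn      = size(X, 1);                            % input features
nOut     = size(T, 1);                            % output classes (users)
nSamples = size(X, 2);

h_pyramid = round(sqrt(nIn * nOut));              % geometric pyramid rule
h_input   = round((2/3) * nIn + nOut);            % input size-based rule
h_sample  = round(nSamples / (5 * (nIn + nOut))); % sample size-based rule (alpha = 5)
hiddenSize = max(1, median([h_pyramid, h_input, h_sample]));

% Training enhancements: validation-driven early stopping and progress monitoring
net = patternnet(hiddenSize);
net.divideParam.trainRatio = 0.70;                % train / validation / test split
net.divideParam.valRatio   = 0.15;                % validation error drives early stopping
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail    = 10;                  % stop after 10 validation failures
net.trainParam.epochs      = 500;
% net.trainFcn = 'traingdx';                      % optional: adaptive learning rate variant

[net, tr] = train(net, X, T);                     % tr logs per-epoch training/validation error
```

Cross-validation, as listed above, would wrap this block in a loop over folds, for example using cvpartition from the Statistics and Machine Learning Toolbox.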
- MATLAB R2019b or later
- Deep Learning Toolbox (formerly Neural Network Toolbox)
- Statistics and Machine Learning Toolbox
- Clone or download the project repository
- Open MATLAB and navigate to the project directory
- Ensure all required toolboxes are installed
Run the pre-optimized pipeline:

cd '(02)-Pre-optimized-Neural-Network'
main

Run the post-optimized pipeline:

cd '(03)-Post-optimized-Neural-Network'
main

Run the standalone data analysis:

cd '(01)-Data-Analysis'
analyze_inter_variance
analyze_intra_variance

When running the main scripts, you'll be prompted to select:
- Time domain - Analysis using temporal features
- Frequency domain - Analysis using spectral features
- Combined - Analysis using both time and frequency features
The system evaluates authentication performance using industry-standard metrics (a computation sketch follows the lists below):
- Accuracy: Overall classification accuracy
- FAR (False Acceptance Rate): Rate of incorrectly accepting impostors
- FRR (False Rejection Rate): Rate of incorrectly rejecting genuine users
- EER (Equal Error Rate): Point where FAR equals FRR
- AUC (Area Under Curve): ROC curve area measurement
- F1-Score: Harmonic mean of precision and recall
- Confusion matrices for each user
- ROC curves
- Performance comparison charts
- Training progress monitoring
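To make the metric definitions above concrete, here is a hedged sketch of how FAR, FRR, EER, AUC, and F1 can be computed for a single target user from classifier scores. The dummy score vectors and the 0.5 decision threshold are assumptions for illustration.

```matlab
% Illustrative dummy scores for one target user: 50 genuine and 150 impostor samples
isGenuine = [ones(1, 50), zeros(1, 150)];
scores    = [0.55 + 0.4*rand(1, 50), 0.5*rand(1, 150)];

% ROC sweep: x-axis is the false positive rate (FAR), y-axis the true positive rate
[far, tpr, ~, auc] = perfcurve(isGenuine, scores, 1);
frr = 1 - tpr;                                    % false rejection rate

[~, idx] = min(abs(far - frr));                   % threshold where FAR and FRR cross
eer = (far(idx) + frr(idx)) / 2;                  % equal error rate estimate

% Precision, recall, and F1 at a fixed 0.5 decision threshold
pred      = scores >= 0.5;
precision = sum(pred & isGenuine) / max(sum(pred), 1);
recall    = sum(pred & isGenuine) / max(sum(isGenuine), 1);
f1        = 2 * precision * recall / max(precision + recall, eps);

fprintf('AUC = %.3f, EER = %.3f, F1 = %.3f\n', auc, eer, f1);
```

Plotting far against tpr (e.g., plot(far, tpr)) gives the ROC curves listed among the generated outputs.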
- Inter-variance Analysis: Examines differences between users
- Intra-variance Analysis: Examines consistency within users (both are sketched in code after this feature list)
- Statistical Profiling: Comprehensive feature statistics
- Adaptive Architecture: Dynamic hidden layer sizing
- Multi-domain Support: Time, frequency, and combined features
- Cross-validation: Robust performance estimation
- Early Stopping: Prevents overfitting
- Comprehensive Metrics: Multiple performance indicators
- User-specific Analysis: Individual user performance profiling
- Comparative Analysis: Pre vs. post-optimization comparison
- Visual Reports: Automated result visualization
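The inter-/intra-variance analysis listed above can be sketched as follows. The cell-array layout of featuresByUser and the dummy numbers are assumptions; the actual analyze_inter_variance.m and analyze_intra_variance.m scripts may compute these quantities differently.

```matlab
% Illustrative dummy data: 10 users, 30 samples each, 8 features per sample
featuresByUser = arrayfun(@(u) 0.1*u + randn(30, 8), 1:10, 'UniformOutput', false);

% Inter-user variance: how far apart the per-user feature means are
userMeans = cell2mat(cellfun(@(F) mean(F, 1), featuresByUser(:), 'UniformOutput', false));
interVar  = var(userMeans, 0, 1);

% Intra-user variance: average spread of each user's own samples
intraVar = mean(cell2mat(cellfun(@(F) var(F, 0, 1), featuresByUser(:), ...
                                 'UniformOutput', false)), 1);

% Features with high inter- and low intra-variance are the most discriminative
discriminability = interVar ./ max(intraVar, eps);
disp(discriminability);
```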
- Improved Authentication Accuracy: Post-optimized networks typically achieve 85-95% accuracy
- Reduced Error Rates: Lower FAR and FRR compared to pre-optimized versions
- Better Generalization: Enhanced performance on cross-day scenarios
- Optimized Architecture: Automatically tuned network parameters
- Time Domain: Good for temporal patterns
- Frequency Domain: Effective for spectral characteristics
- Combined Domain: Best overall performance
- Same Day: Higher accuracy due to reduced environmental variations
- Cross Day: More challenging but realistic scenario
- Automated data file discovery
- Structured data organization
- Memory-efficient loading
- Dynamic architecture calculation
- Configurable parameters
- Binary classification setup
- Adaptive training algorithms
- Progress monitoring
- Early stopping mechanisms
- Comprehensive metric calculation
- Statistical analysis
- Performance visualization
- Modular Design: Easy to extend and modify
- Error Handling: Robust error management
- Logging: Comprehensive execution logging
- Reproducibility: Consistent results across runs