8harath/GNSS-CLASSIFICATION

GNSS Line-of-Sight Classification System

A comprehensive, production-ready machine learning system that classifies GNSS (Global Navigation Satellite System) signals as Line-of-Sight (LOS) or Non-Line-of-Sight (NLOS) from satellite signal characteristics. It offers both a modern web interface and a command-line pipeline.


Features

Core Capabilities

  • Multi-Model Training: Trains and compares 4 machine learning algorithms:
    • Random Forest
    • Gradient Boosting
    • XGBoost
    • Support Vector Machine (SVM)
  • Automated Feature Engineering: 20+ derived features including:
    • Geometric features (elevation categories, azimuth sectors)
    • Signal quality features (SNR trends, moving averages)
    • Temporal features (hour of day, day of week, weekends)
    • Interaction features (elevation-SNR, azimuth-SNR)
  • Comprehensive Data Preprocessing:
    • Automatic outlier detection (IQR/Z-score methods)
    • Missing value imputation
    • Feature scaling and normalization
    • PCA dimensionality reduction
  • Advanced Model Evaluation:
    • Confusion matrices
    • ROC curves with AUC scores
    • Precision-Recall curves
    • Feature importance plots
    • SHAP values for model interpretability
  • Dual Interface:
    • Modern web UI for non-technical users
    • CLI pipeline for batch processing and automation

Web Application

  • Intuitive 4-step workflow
  • Real-time progress tracking
  • Interactive visualizations
  • Downloadable results and trained models
  • Support for Excel (.xlsx, .xls) file uploads
  • Maximum file size: 16MB

Demo

Web Interface Workflow

  1. Upload Data: Upload your LOS and NLOS Excel files
  2. Process: Automated data preprocessing and feature engineering
  3. Download: Get processed training/testing datasets
  4. Train & Evaluate: Train models and view performance metrics

Quick Start

# Clone the repository
git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION/gnss_classification

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the web application
python app.py

# Access at http://localhost:5000

Installation

Prerequisites

  • Python: 3.8 or higher
  • pip: Latest version recommended
  • Virtual environment: Strongly recommended

Step-by-Step Installation

1. Clone the Repository

git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION/gnss_classification

2. Create Virtual Environment

Linux/Mac:

python3 -m venv venv
source venv/bin/activate

Windows:

python -m venv venv
venv\Scripts\activate

3. Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt

4. Verify Installation

python -c "import flask, pandas, sklearn, xgboost, shap; print('All dependencies installed successfully!')"

Usage

Web Interface

Starting the Server

python app.py

The server starts at http://localhost:5000 by default. If it is bound to 0.0.0.0, it is also reachable from other machines on the network on port 5000.

Workflow

Step 1: Upload Data Files

  • Navigate to http://localhost:5000
  • Select your LOS data Excel file
  • Select your NLOS data Excel file
  • Click "Upload Files"

Step 2: Process Data

  • Click "Process Data" after successful upload
  • System combines datasets and performs 80-20 train-test split
  • Features are automatically engineered and scaled

Step 3: Download Processed Data (Optional)

  • Download X_train.csv, y_train.csv for training set
  • Download X_test.csv, y_test.csv for testing set

Step 4: Train Models

  • Click "Train Models" to start training pipeline
  • View real-time performance metrics
  • Download comprehensive results including:
    • Model comparison CSV
    • Confusion matrices (PNG)
    • ROC curves (PNG)
    • Feature importance plots (PNG)

Command Line Interface

Basic Usage

python run_pipeline.py

This runs the complete pipeline:

  1. Data preprocessing
  2. Feature engineering
  3. Model training
  4. Model evaluation

Custom Data Path

# Modify run_pipeline.py to use custom paths
preprocessor = DataPreprocessor('path/to/your/data.csv')

Output Files

The CLI generates:

  • data/processed/: Cleaned and split datasets
  • data/processed/engineered/: Feature-engineered datasets
  • models/: Trained model files (.joblib)
  • results/: Evaluation metrics and visualizations

Data Format

Input Requirements

Your Excel files must contain the following columns:

| Column | Type | Range | Description |
|---|---|---|---|
| Year | Integer | - | Year of observation |
| Month | Integer | 1-12 | Month of observation |
| Date | Integer | 1-31 | Day of month |
| Hour | Integer | 0-23 | Hour of day |
| Min | Integer | 0-59 | Minute |
| Sec | Float | 0-59.999 | Second |
| PRN | Integer | 1-32 | Satellite PRN number |
| Elevation | Float | 0-90 | Elevation angle (degrees) |
| Azimuth | Float | 0-360 | Azimuth angle (degrees) |
| SNR | Float | >0 | Signal-to-Noise Ratio (dB-Hz) |
| Label | Integer | 0 or 1 | 1 = LOS, 0 = NLOS |
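
The column requirements above can be checked before uploading. The following is a minimal sketch, not the project's own validator: `validate_row`, `RANGES` and `REQUIRED` are hypothetical names, and a row is assumed to be a plain dict keyed by column name.

```python
from typing import Dict, List, Tuple

# Inclusive (min, max) bounds taken from the column table.
RANGES: Dict[str, Tuple[float, float]] = {
    "Month": (1, 12),
    "Date": (1, 31),
    "Hour": (0, 23),
    "Min": (0, 59),
    "PRN": (1, 32),
    "Elevation": (0, 90),
    "Azimuth": (0, 360),
}
REQUIRED = ["Year", "Month", "Date", "Hour", "Min", "Sec",
            "PRN", "Elevation", "Azimuth", "SNR", "Label"]


def validate_row(row: Dict[str, float]) -> List[str]:
    """Return a list of problems; an empty list means the row passed every check."""
    problems = [f"missing column: {name}" for name in REQUIRED if name not in row]
    for name, (low, high) in RANGES.items():
        if name in row and not low <= row[name] <= high:
            problems.append(f"{name}={row[name]} outside [{low}, {high}]")
    if "Sec" in row and not 0 <= row["Sec"] < 60:
        problems.append(f"Sec={row['Sec']} outside [0, 60)")
    if "SNR" in row and not row["SNR"] > 0:
        problems.append(f"SNR={row['SNR']} must be positive")
    if "Label" in row and row["Label"] not in (0, 1):
        problems.append(f"Label={row['Label']} must be 0 (NLOS) or 1 (LOS)")
    return problems
```

Running this over every row of a sheet before upload surfaces missing columns and out-of-range values early, rather than as a KeyError during processing.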

Example Data

See data/examples/sample_data.xlsx for a properly formatted example.

Supported File Formats

  • Excel: .xlsx, .xls
  • CSV: .csv (for CLI pipeline)

Machine Learning Pipeline

1. Data Preprocessing (src/data_preprocessing.py)

  • Loading: Reads Excel/CSV files
  • Validation: Ensures all required columns exist
  • Range Checks: Validates elevation (0-90°), azimuth (0-360°)
  • Missing Values: Median imputation for numerical, mode for categorical
  • Outlier Removal: IQR-based outlier detection
  • Temporal Features: Extracts hour_of_day, day_of_week, month, is_weekend
  • Scaling: StandardScaler normalization
  • Splitting: 80% training, 20% testing (stratified)
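
Two of the steps above, temporal feature extraction and IQR-based outlier removal, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the code in src/data_preprocessing.py: the function names are hypothetical, the 1.5 x IQR fence is the conventional default (the README does not state the multiplier), and the weekend flag assumes Saturday and Sunday.

```python
from datetime import datetime
from statistics import quantiles
from typing import Dict, List


def temporal_features(year: int, month: int, day: int, hour: int) -> Dict[str, int]:
    """Derive hour_of_day, day_of_week, month and is_weekend from the date columns."""
    stamp = datetime(year, month, day, hour)
    weekday = stamp.weekday()  # Monday = 0 ... Sunday = 6
    return {
        "hour_of_day": stamp.hour,
        "day_of_week": weekday,
        "month": stamp.month,
        "is_weekend": int(weekday >= 5),  # assumes a Saturday/Sunday weekend
    }


def iqr_filter(values: List[float], k: float = 1.5) -> List[float]:
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumed default."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return [v for v in values if q1 - k * spread <= v <= q3 + k * spread]
```

For example, `temporal_features(2024, 1, 6, 14)` describes a Saturday afternoon, and `iqr_filter` drops an isolated SNR reading that sits far below the rest of a series.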

2. Feature Engineering (src/feature_engineering.py)

Geometric Features:

  • Elevation categories (low: <30°, medium: 30-60°, high: >60°)
  • Azimuth sectors (N, E, S, W)
  • Elevation-azimuth interaction
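
As a rough illustration of the geometric buckets, the sketch below uses the elevation thresholds listed above. The azimuth mapping is an assumption: the README names the four sectors but not their boundaries, so 90-degree sectors centred on the cardinal directions are used here. Both function names are hypothetical.

```python
def elevation_category(elevation_deg: float) -> str:
    """Bucket elevation using the listed thresholds (<30, 30-60, >60 degrees)."""
    if elevation_deg < 30:
        return "low"
    if elevation_deg <= 60:
        return "medium"
    return "high"


def azimuth_sector(azimuth_deg: float) -> str:
    """Map azimuth to N/E/S/W, assuming sectors centred on the cardinal directions."""
    sectors = ["N", "E", "S", "W"]
    # Shift by 45 degrees so that 315..45 falls into the first (north) sector.
    index = int(((azimuth_deg + 45) % 360) // 90)
    return sectors[index]
```

With these boundaries, an azimuth of 350 degrees and one of 10 degrees both land in the north sector, which a naive split at 0, 90, 180 and 270 would separate.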

Signal Features:

  • SNR quartile categories
  • SNR rate of change
  • 5-point moving average SNR
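
The moving average and rate of change could look like the sketch below. It assumes the SNR values belong to a single satellite and are already in time order, uses a trailing window, and treats rate of change as the difference between consecutive samples. The actual definitions in src/feature_engineering.py may differ, and the function names are hypothetical.

```python
from typing import List


def moving_average(snr: List[float], window: int = 5) -> List[float]:
    """Trailing moving average; early samples average over the values seen so far."""
    averaged = []
    for i in range(len(snr)):
        chunk = snr[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


def rate_of_change(snr: List[float]) -> List[float]:
    """Difference between consecutive samples; the first sample gets 0.0."""
    if not snr:
        return []
    return [0.0] + [b - a for a, b in zip(snr, snr[1:])]
```

A trailing window only looks at past samples, so the feature can be computed on streaming data without peeking at future observations.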

Satellite Features:

  • Satellite count per observation window
  • Satellite diversity (standard deviation)

Dimensionality Reduction:

  • PCA with 95% variance retention

3. Model Training (src/model_training.py)

Algorithms & Hyperparameters:

| Model | Hyperparameters Tuned |
|---|---|
| Random Forest | n_estimators: [100, 200, 300]; max_depth: [10, 20, None]; min_samples_split: [2, 5, 10] |
| Gradient Boosting | n_estimators: [100, 200, 300]; learning_rate: [0.01, 0.1, 0.2]; max_depth: [3, 5, 7] |
| XGBoost | n_estimators: [100, 200, 300]; learning_rate: [0.01, 0.1, 0.2]; max_depth: [3, 5, 7] |
| SVM | C: [0.1, 1, 10]; kernel: ['rbf', 'poly']; gamma: ['scale', 'auto'] |

Optimization:

  • GridSearchCV with 5-fold stratified cross-validation
  • F1-score as primary metric
  • Automatic best model selection
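
To get a feel for the cost of an exhaustive search, the sketch below counts how many model fits the grids listed above imply under 5-fold cross-validation. The grids are copied from the hyperparameter table; the dictionary layout and the `count_fits` helper are illustrative, not taken from config.py.

```python
from itertools import product

# Hyperparameter grids copied from the table above.
GRIDS = {
    "random_forest": {
        "n_estimators": [100, 200, 300],
        "max_depth": [10, 20, None],
        "min_samples_split": [2, 5, 10],
    },
    "gradient_boosting": {
        "n_estimators": [100, 200, 300],
        "learning_rate": [0.01, 0.1, 0.2],
        "max_depth": [3, 5, 7],
    },
    "xgboost": {
        "n_estimators": [100, 200, 300],
        "learning_rate": [0.01, 0.1, 0.2],
        "max_depth": [3, 5, 7],
    },
    "svm": {
        "C": [0.1, 1, 10],
        "kernel": ["rbf", "poly"],
        "gamma": ["scale", "auto"],
    },
}
CV_FOLDS = 5


def count_fits(grid, folds=CV_FOLDS):
    """Number of cross-validation fits for one grid: combinations times folds."""
    combinations = sum(1 for _ in product(*grid.values()))
    return combinations * folds


fits = {name: count_fits(grid) for name, grid in GRIDS.items()}
total_fits = sum(fits.values())
print(fits, total_fits)
```

Each tree-based grid has 27 combinations and the SVM grid has 12, so a full run performs 465 cross-validation fits before any final refit of the best estimators. Trimming one value from each list is usually the quickest way to shorten training.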

4. Model Evaluation (src/evaluation.py)

Metrics Calculated:

  • Accuracy
  • Precision
  • Recall
  • F1-Score
  • ROC AUC
  • Average Precision
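
The first four metrics follow directly from the confusion matrix. The sketch below shows the arithmetic with LOS (label 1) as the positive class; the function name is hypothetical. ROC AUC and average precision need predicted scores rather than counts, so they are left out.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For instance, made-up counts of 180 true positives, 20 false positives, 20 false negatives and 140 true negatives give precision, recall and F1 of 0.9 each; these numbers are purely illustrative and are not results from this project.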

Visualizations Generated:

  • Confusion matrices (heatmaps)
  • ROC curves with AUC scores
  • Precision-Recall curves
  • Feature importance rankings
  • SHAP summary plots for interpretability
  • Model comparison bar charts

Model Performance

Typical performance on GNSS classification tasks:

| Model | Accuracy | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| Random Forest | ~92% | ~90% | ~93% | ~91% | ~0.96 |
| Gradient Boosting | ~91% | ~89% | ~92% | ~90% | ~0.95 |
| XGBoost | ~93% | ~91% | ~94% | ~92% | ~0.97 |
| SVM | ~88% | ~86% | ~89% | ~87% | ~0.93 |

Note: Performance varies based on dataset quality and characteristics


API Documentation

Endpoints

POST /upload

Upload LOS and NLOS Excel files.

Request:

  • los_file: Excel file (multipart/form-data)
  • nlos_file: Excel file (multipart/form-data)

Response:

{
  "status": "success",
  "message": "Files uploaded successfully",
  "los_shape": [1000, 11],
  "nlos_shape": [800, 11]
}

POST /process

Process uploaded data (combine, split, engineer features).

Response:

{
  "status": "success",
  "train_size": 1440,
  "test_size": 360
}
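
The sizes in the sample responses are consistent with an 80-20 split of the uploaded rows. A small sketch of that arithmetic, assuming no rows are dropped during preprocessing (in practice outlier removal may reduce the totals) and using a hypothetical `split_sizes` helper:

```python
from typing import Tuple


def split_sizes(n_samples: int, test_fraction: float = 0.2) -> Tuple[int, int]:
    """Return (train, test) sizes for a given test fraction."""
    test = round(n_samples * test_fraction)
    return n_samples - test, test


los_rows, nlos_rows = 1000, 800  # row counts from the sample /upload response
train_size, test_size = split_sizes(los_rows + nlos_rows)
print(train_size, test_size)
```

Because the split is stratified, each class keeps the same proportions: under the same assumption, 800 LOS and 640 NLOS rows would go to training, and 200 LOS and 160 NLOS rows to testing.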

POST /train

Train all models and generate evaluation metrics.

Response:

{
  "status": "success",
  "models_trained": ["random_forest", "gradient_boosting", "xgboost", "svm"],
  "best_model": "xgboost",
  "metrics": { ... }
}

GET /download/<type>

Download processed data or results.

Parameters:

  • type: train | test | results

Configuration

Environment Variables

Create a .env file for configuration:

# Flask Configuration
FLASK_ENV=development
FLASK_DEBUG=True
SECRET_KEY=your-secret-key-here

# Upload Configuration
MAX_CONTENT_LENGTH=16777216  # 16MB in bytes
UPLOAD_FOLDER=uploads/

# Model Configuration
MODEL_PATH=models/
RESULTS_PATH=results/
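
One way an application could read these variables is sketched below. It assumes the values are already present in the process environment (for example exported by the shell or loaded by a tool such as python-dotenv); the `load_settings` helper and the fallback defaults are hypothetical and may not match what app.py or config.py does.

```python
import os
from typing import Dict, Mapping, Union

# Fallbacks used when a variable is not set; assumed values, not the project's.
DEFAULTS = {
    "FLASK_DEBUG": "False",
    "MAX_CONTENT_LENGTH": str(16 * 1024 * 1024),  # 16 MB in bytes
    "UPLOAD_FOLDER": "uploads/",
    "MODEL_PATH": "models/",
    "RESULTS_PATH": "results/",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Union[str, int, bool]]:
    """Read settings from an environment mapping and convert them to Python types."""
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "debug": raw["FLASK_DEBUG"].strip().lower() in ("1", "true", "yes"),
        "max_content_length": int(raw["MAX_CONTENT_LENGTH"]),
        "upload_folder": raw["UPLOAD_FOLDER"],
        "model_path": raw["MODEL_PATH"],
        "results_path": raw["RESULTS_PATH"],
    }
```

Passing the mapping as an argument keeps the function easy to test: `load_settings({"FLASK_DEBUG": "True"})` exercises the parsing without touching the real environment.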

Application Settings

Modify config.py for advanced configuration:

  • Model hyperparameter grids
  • Feature engineering parameters
  • Data preprocessing thresholds
  • Logging levels

Project Structure

GNSS-CLASSIFICATION/
├── gnss_classification/
│   ├── app.py                      # Flask web application
│   ├── run_pipeline.py             # CLI pipeline orchestrator
│   ├── requirements.txt            # Python dependencies
│   ├── config.py                   # Configuration settings
│   ├── src/
│   │   ├── __init__.py
│   │   ├── data_preprocessing.py   # Data loading & preprocessing
│   │   ├── feature_engineering.py  # Feature creation & PCA
│   │   ├── model_training.py       # Model training & tuning
│   │   └── evaluation.py           # Model evaluation & visualization
│   ├── templates/
│   │   └── index.html              # Web UI template
│   ├── data/
│   │   ├── raw/                    # Raw input data
│   │   ├── processed/              # Processed datasets
│   │   └── examples/               # Example data files
│   ├── models/                     # Saved trained models (.joblib)
│   ├── results/                    # Evaluation results & plots
│   └── uploads/                    # Temporary file uploads
├── docs/                           # Detailed documentation
│   ├── API.md                      # API reference
│   ├── DATA_FORMAT.md              # Data format specifications
│   ├── TROUBLESHOOTING.md          # Common issues & solutions
│   └── DEVELOPMENT.md              # Development guide
├── tests/                          # Unit and integration tests
├── examples/                       # Usage examples & notebooks
│   └── example_usage.ipynb         # Jupyter notebook tutorial
├── .github/
│   └── workflows/
│       └── ci.yml                  # GitHub Actions CI/CD
├── .gitignore
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── CHANGELOG.md
└── setup.py                        # Package installation script

Development

Setting Up Development Environment

# Clone and install in editable mode
git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION
pip install -e .

# Install development dependencies
pip install -r requirements-dev.txt

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=gnss_classification --cov-report=html

Code Quality

# Format code
black gnss_classification/

# Lint code
flake8 gnss_classification/

# Type checking
mypy gnss_classification/

Troubleshooting

Common Issues

1. Import Errors

ModuleNotFoundError: No module named 'xgboost'

Solution: Ensure all dependencies are installed:

pip install -r requirements.txt

2. File Upload Fails

Error: File too large

Solution: Check file size (<16MB) or increase MAX_CONTENT_LENGTH in config.
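
A quick local check before uploading can catch both oversized files and unsupported extensions. The sketch below is a hypothetical helper, not part of the application; it uses the 16MB limit and the Excel extensions stated in this README.

```python
import os

MAX_UPLOAD_BYTES = 16 * 1024 * 1024  # 16 MB, matching MAX_CONTENT_LENGTH=16777216
ALLOWED_EXTENSIONS = {".xlsx", ".xls"}


def upload_problem(path: str) -> str:
    """Return an empty string if the file looks uploadable, otherwise a short reason."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"unsupported extension: {extension or '(none)'}"
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        return f"file is {size} bytes, limit is {MAX_UPLOAD_BYTES}"
    return ""
```

Flask applies MAX_CONTENT_LENGTH to the whole request body, so when the LOS and NLOS files are sent in a single request, their combined size is what counts against the limit.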

3. Model Training Crashes

MemoryError during training

Solution:

  • Reduce hyperparameter grid size
  • Use smaller dataset
  • Increase system RAM

4. Missing Columns Error

KeyError: 'PRN'

Solution: Ensure Excel file has all required columns (see Data Format).

Full Troubleshooting Guide

See docs/TROUBLESHOOTING.md for comprehensive solutions.


Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick Contribution Guide

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Citation

If you use this code in your research, please cite:

@software{gnss_classification_2024,
  title = {GNSS Line-of-Sight Classification System},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/yourusername/GNSS-CLASSIFICATION}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.


Acknowledgments

  • Developed for GNSS signal analysis and urban canyon navigation research
  • Built with scikit-learn, XGBoost, Flask, and other open-source libraries
  • Inspired by research in satellite positioning and machine learning

Made with ❤️ for the GNSS research community
