GNSS Line-of-Sight Classification System

A machine learning system that classifies GNSS (Global Navigation Satellite System) signals as Line-of-Sight (LOS) or Non-Line-of-Sight (NLOS) from satellite signal characteristics such as elevation, azimuth, and SNR. It offers both a web interface and a command-line pipeline.

Features

Core Capabilities

Multi-Model Training: Trains and compares 4 machine learning algorithms:
- Random Forest
- Gradient Boosting
- XGBoost
- Support Vector Machine (SVM)
Automated Feature Engineering: 20+ derived features including:
- Geometric features (elevation categories, azimuth sectors)
- Signal quality features (SNR trends, moving averages)
- Temporal features (hour of day, day of week, weekends)
- Interaction features (elevation-SNR, azimuth-SNR)
Comprehensive Data Preprocessing:
- Automatic outlier detection (IQR/Z-score methods)
- Missing value imputation
- Feature scaling and normalization
- PCA dimensionality reduction
Advanced Model Evaluation:
- Confusion matrices
- ROC curves with AUC scores
- Precision-Recall curves
- Feature importance plots
- SHAP values for model interpretability
Dual Interface:
- Modern web UI for non-technical users
- CLI pipeline for batch processing and automation

Web Application

Intuitive 4-step workflow
Real-time progress tracking
Interactive visualizations
Downloadable results and trained models
Support for Excel (.xlsx, .xls) file uploads
Maximum file size: 16MB

Demo

Web Interface Workflow

Upload Data: Upload your LOS and NLOS Excel files
Process: Automated data preprocessing and feature engineering
Download: Get processed training/testing datasets
Train & Evaluate: Train models and view performance metrics

Quick Start

# Clone the repository
git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION/gnss_classification

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the web application
python app.py

# Access at http://localhost:5000

Installation

Prerequisites

Python: 3.8 or higher
pip: Latest version recommended
Virtual environment: Strongly recommended

Step-by-Step Installation

1. Clone the Repository

git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION/gnss_classification

2. Create Virtual Environment

Linux/Mac:

python3 -m venv venv
source venv/bin/activate

Windows:

python -m venv venv
venv\Scripts\activate

3. Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt

4. Verify Installation

python -c "import flask, pandas, sklearn, xgboost, shap; print('All dependencies installed successfully!')"
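
If the one-line check fails, it only reports the first missing module. A minimal sketch of a fuller check, using only the standard library, is shown here. The helper name `find_missing` is hypothetical; the module list is the one from the command above.

```python
import importlib.util

# Module names taken from the one-line check above.
REQUIRED_MODULES = ["flask", "pandas", "sklearn", "xgboost", "shap"]


def find_missing(modules):
    """Return the module names that cannot be located on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies installed successfully!")
```

Because `find_spec` only locates a module without importing it, this check is fast but will not catch a package that is installed yet broken.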

Usage

Web Interface

Starting the Server

python app.py

The server starts at http://localhost:5000 by default. If it is bound to 0.0.0.0, it is also reachable from other machines on the network on port 5000.

Workflow

Step 1: Upload Data Files

Navigate to http://localhost:5000
Select your LOS data Excel file
Select your NLOS data Excel file
Click "Upload Files"

Step 2: Process Data

Click "Process Data" after successful upload
The system combines both datasets and performs an 80/20 train-test split
Features are automatically engineered and scaled

Step 3: Download Processed Data (Optional)

Download X_train.csv, y_train.csv for training set
Download X_test.csv, y_test.csv for testing set

Step 4: Train Models

Click "Train Models" to start training pipeline
View real-time performance metrics
Download comprehensive results including:
- Model comparison CSV
- Confusion matrices (PNG)
- ROC curves (PNG)
- Feature importance plots (PNG)

Command Line Interface

Basic Usage

python run_pipeline.py

This runs the complete pipeline:

Data preprocessing
Feature engineering
Model training
Model evaluation

Custom Data Path

# Modify run_pipeline.py to use custom paths
preprocessor = DataPreprocessor('path/to/your/data.csv')

Output Files

The CLI generates:

data/processed/: Cleaned and split datasets
data/processed/engineered/: Feature-engineered datasets
models/: Trained model files (.joblib)
results/: Evaluation metrics and visualizations

Data Format

Input Requirements

Your Excel files must contain the following columns:

Column	Type	Range	Description
`Year`	Integer	-	Year of observation
`Month`	Integer	1-12	Month of observation
`Date`	Integer	1-31	Day of month
`Hour`	Integer	0-23	Hour of day
`Min`	Integer	0-59	Minute
`Sec`	Float	0-59.999	Second
`PRN`	Integer	1-32	Satellite PRN number
`Elevation`	Float	0-90	Elevation angle (degrees)
`Azimuth`	Float	0-360	Azimuth angle (degrees)
`SNR`	Float	>0	Signal-to-Noise Ratio (dB-Hz)
`Label`	Integer	0 or 1	1 = LOS, 0 = NLOS
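
The table above can be turned into a pre-upload check. The following is a rough sketch with pandas, not the project's own validation code; the function name `validate_observations` and the error message wording are assumptions made for illustration.

```python
import pandas as pd

# Closed ranges taken from the table above; None means "no range check here".
REQUIRED_COLUMNS = {
    "Year": None,
    "Month": (1, 12),
    "Date": (1, 31),
    "Hour": (0, 23),
    "Min": (0, 59),
    "Sec": (0, 59.999),
    "PRN": (1, 32),
    "Elevation": (0, 90),
    "Azimuth": (0, 360),
    "SNR": None,    # checked separately: must be > 0
    "Label": None,  # checked separately: must be 0 or 1
}


def validate_observations(df):
    """Return a list of human-readable problems; an empty list means valid."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    for column, bounds in REQUIRED_COLUMNS.items():
        if bounds is None:
            continue
        low, high = bounds
        # between() is inclusive on both ends; NaN counts as out of range.
        bad = ~df[column].between(low, high)
        if bad.any():
            problems.append(f"{column}: {int(bad.sum())} value(s) outside [{low}, {high}]")
    if (df["SNR"] <= 0).any():
        problems.append("SNR: non-positive value(s)")
    if not df["Label"].isin([0, 1]).all():
        problems.append("Label: value(s) other than 0 or 1")
    return problems
```

Calling `validate_observations(pd.read_excel("los.xlsx"))` (the file name is a placeholder) before uploading would surface missing columns early instead of as a `KeyError` during processing.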

Example Data

See data/examples/sample_data.xlsx for a properly formatted example.

Supported File Formats

Excel: .xlsx, .xls
CSV: .csv (for CLI pipeline)
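
One way to handle both formats is to pick the pandas reader from the file extension. This is a sketch under that assumption; `load_observations` is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path

import pandas as pd


def load_observations(path):
    """Read a GNSS observation table, choosing the reader from the extension."""
    suffix = Path(path).suffix.lower()
    if suffix in {".xlsx", ".xls"}:
        # read_excel needs an engine such as openpyxl (.xlsx) or xlrd (.xls).
        return pd.read_excel(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
```

Rejecting unknown extensions explicitly gives a clearer error than letting a reader fail on malformed input.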

Machine Learning Pipeline

1. Data Preprocessing (`src/data_preprocessing.py`)

Loading: Reads Excel/CSV files
Validation: Ensures all required columns exist
Range Checks: Validates elevation (0-90°), azimuth (0-360°)
Missing Values: Median imputation for numerical, mode for categorical
Outlier Removal: IQR-based outlier detection
Temporal Features: Extracts hour_of_day, day_of_week, month, is_weekend
Scaling: StandardScaler normalization
Splitting: 80% training, 20% testing (stratified)
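
The steps above can be sketched with pandas and scikit-learn as follows. This is an illustration, not the contents of `src/data_preprocessing.py`: the function name, the 1.5 IQR factor, and the random seed are assumptions. The list above names scaling before splitting; this sketch fits the scaler on the training split only, which is a choice of the example so that test-set statistics do not influence the scaling.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(df, feature_cols, label_col="Label", iqr_factor=1.5, seed=42):
    """Impute, drop IQR outliers, split 80/20 (stratified), then scale."""
    df = df.copy()
    # Median imputation for the numerical feature columns.
    df[feature_cols] = df[feature_cols].fillna(df[feature_cols].median())

    # Keep rows whose features all lie within [Q1 - k*IQR, Q3 + k*IQR].
    q1 = df[feature_cols].quantile(0.25)
    q3 = df[feature_cols].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - iqr_factor * iqr, q3 + iqr_factor * iqr
    inside = df[feature_cols].ge(lower) & df[feature_cols].le(upper)
    df = df[inside.all(axis=1)]

    X_train, X_test, y_train, y_test = train_test_split(
        df[feature_cols],
        df[label_col],
        test_size=0.2,
        stratify=df[label_col],
        random_state=seed,
    )
    # Assumption: the scaler is fitted on the training split only.
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test, scaler
```

Returning the fitted scaler lets the same transformation be applied later to new observations at prediction time.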

2. Feature Engineering (`src/feature_engineering.py`)

Geometric Features:

Elevation categories (low: <30°, medium: 30-60°, high: >60°)
Azimuth sectors (N, E, S, W)
Elevation-azimuth interaction
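
A plain-Python sketch of these three features is given below. The elevation thresholds come from the list above. The sector boundaries and the form of the interaction term are not stated in the source, so both are assumptions: sectors are taken as 90° wide and centred on each cardinal direction, and the interaction is a simple product. All function names are hypothetical.

```python
def elevation_category(elevation_deg):
    """Bucket an elevation angle: low (<30), medium (30-60), high (>60)."""
    if elevation_deg < 30:
        return "low"
    if elevation_deg <= 60:
        return "medium"
    return "high"


def azimuth_sector(azimuth_deg):
    """Map an azimuth in degrees to N, E, S or W.

    Assumption: each sector spans 90 degrees centred on its cardinal
    direction, so N covers 315-360 and 0-45.
    """
    sectors = ["N", "E", "S", "W"]
    index = int(((azimuth_deg % 360) + 45) // 90) % 4
    return sectors[index]


def elevation_azimuth_interaction(elevation_deg, azimuth_deg):
    """Assumption: the interaction term is the product of the two angles."""
    return elevation_deg * azimuth_deg
```

For instance, an observation at 42° elevation and 135° azimuth falls into the "medium" category and, under the sector convention assumed here, the S sector.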

Signal Features:

SNR quartile categories
SNR rate of change
5-point moving average SNR
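
These signal features could be computed with pandas roughly as follows. The sketch assumes rows are already in time order and that trends are computed per satellite (grouped by `PRN`) so that tracks from different satellites do not mix; the source does not state either point. The output column names are hypothetical.

```python
import pandas as pd


def add_snr_features(df, window=5):
    """Add SNR rate of change, trailing moving average and quartile bin."""
    out = df.copy()
    by_prn = out.groupby("PRN")["SNR"]
    # Difference between consecutive samples of the same satellite;
    # the first sample of each track has no predecessor, so use 0.
    out["snr_rate"] = by_prn.diff().fillna(0.0)
    # Trailing mean over up to `window` samples (shorter at the track start).
    out["snr_moving_avg"] = by_prn.transform(
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    # Quartile bin (0 = lowest) computed across the whole table.
    out["snr_quartile"] = pd.qcut(out["SNR"], 4, labels=False, duplicates="drop")
    return out
```

Using `min_periods=1` avoids introducing missing values in the first four samples of each satellite track, at the cost of a noisier average there.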

Satellite Features:

Satellite count per observation window
Satellite diversity (standard deviation)

Dimensionality Reduction:

PCA with 95% variance retention
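
In scikit-learn, passing a float between 0 and 1 as `n_components` asks PCA to keep the smallest number of components whose cumulative explained variance reaches that fraction. A minimal sketch, assuming the features are standardized first and that PCA is fitted on the training split only (the helper name is hypothetical):

```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def reduce_with_pca(X_train, X_test, variance=0.95):
    """Project both splits onto the components that retain `variance`."""
    scaler = StandardScaler().fit(X_train)
    train_scaled = scaler.transform(X_train)
    test_scaled = scaler.transform(X_test)
    # A float n_components in (0, 1) lets scikit-learn pick the count.
    pca = PCA(n_components=variance).fit(train_scaled)
    return pca.transform(train_scaled), pca.transform(test_scaled), pca
```

After fitting, `pca.n_components_` reports how many components were kept and `pca.explained_variance_ratio_` shows the share contributed by each.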

3. Model Training (`src/model_training.py`)

Algorithms & Hyperparameters:

Model	Hyperparameters Tuned
Random Forest	n_estimators: [100, 200, 300]; max_depth: [10, 20, None]; min_samples_split: [2, 5, 10]
Gradient Boosting	n_estimators: [100, 200, 300]; learning_rate: [0.01, 0.1, 0.2]; max_depth: [3, 5, 7]
XGBoost	n_estimators: [100, 200, 300]; learning_rate: [0.01, 0.1, 0.2]; max_depth: [3, 5, 7]
SVM	C: [0.1, 1, 10]; kernel: ['rbf', 'poly']; gamma: ['scale', 'auto']

Optimization:

GridSearchCV with 5-fold stratified cross-validation
F1-score as primary metric
Automatic best model selection
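
The optimization setup described above maps onto scikit-learn's `GridSearchCV` roughly as shown here for the Random Forest grid. This is a sketch rather than the code in `src/model_training.py`; the function name, the shuffling of folds, and the seed are assumptions.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Random Forest grid copied from the table above.
RF_GRID = {
    "n_estimators": [100, 200, 300],
    "max_depth": [10, 20, None],
    "min_samples_split": [2, 5, 10],
}


def tune_random_forest(X, y, param_grid=RF_GRID, seed=42):
    """Grid-search a Random Forest with 5-fold stratified CV, scored by F1."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        scoring="f1",  # F1 of the positive class (Label 1 = LOS)
        cv=cv,
    )
    return search.fit(X, y)
```

The fitted object exposes `best_params_`, `best_score_` and `best_estimator_`; repeating the search for each algorithm and comparing `best_score_` is one way to select the best model automatically.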

4. Model Evaluation (`src/evaluation.py`)

Metrics Calculated:

Accuracy
Precision
Recall
F1-Score
ROC AUC
Average Precision
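
All six metrics are available in `sklearn.metrics`. The sketch below gathers them in one helper (the name `evaluate` is hypothetical). Note that the threshold-based metrics take hard 0/1 predictions, while ROC AUC and Average Precision take a score or probability for the LOS class.

```python
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate(y_true, y_pred, y_score):
    """Compute the six metrics for a binary LOS (1) / NLOS (0) task.

    y_pred holds hard 0/1 predictions; y_score holds the predicted
    probability (or decision score) for the LOS class.
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_score),
        "average_precision": average_precision_score(y_true, y_score),
    }
```

With a fitted classifier, `y_pred` would typically come from `model.predict(X_test)` and `y_score` from `model.predict_proba(X_test)[:, 1]`.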

Visualizations Generated:

Confusion matrices (heatmaps)
ROC curves with AUC scores
Precision-Recall curves
Feature importance rankings
SHAP summary plots for interpretability
Model comparison bar charts

Model Performance

Typical performance on GNSS classification tasks:

Model	Accuracy	Precision	Recall	F1-Score	AUC
Random Forest	~92%	~90%	~93%	~91%	~0.96
Gradient Boosting	~91%	~89%	~92%	~90%	~0.95
XGBoost	~93%	~91%	~94%	~92%	~0.97
SVM	~88%	~86%	~89%	~87%	~0.93

Note: These figures are approximate. Actual performance depends on dataset quality and characteristics.
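
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below does this with the approximate values from the table; the helper name is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Approximate precision/recall pairs from the table above.
TABLE = [
    ("Random Forest", 0.90, 0.93),
    ("Gradient Boosting", 0.89, 0.92),
    ("XGBoost", 0.91, 0.94),
    ("SVM", 0.86, 0.89),
]

for model, precision, recall in TABLE:
    print(f"{model}: F1 = {f1_from(precision, recall):.3f}")
```

Rounded to whole percentages, the recomputed values agree with the F1-Score column.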

API Documentation

Endpoints

`POST /upload`

Upload LOS and NLOS Excel files.

Request:

los_file: Excel file (multipart/form-data)
nlos_file: Excel file (multipart/form-data)

Response:

{
  "status": "success",
  "message": "Files uploaded successfully",
  "los_shape": [1000, 11],
  "nlos_shape": [800, 11]
}

`POST /process`

Process uploaded data (combine, split, engineer features).

Response:

{
  "status": "success",
  "train_size": 1440,
  "test_size": 360
}

`POST /train`

Train all models and generate evaluation metrics.

Response:

{
  "status": "success",
  "models_trained": ["random_forest", "gradient_boosting", "xgboost", "svm"],
  "best_model": "xgboost",
  "metrics": { ... }
}

`GET /download/<type>`

Download processed data or results.

Parameters:

type: train | test | results
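
The endpoints can be called in sequence from a script. The following is a sketch using the third-party `requests` library, which the source does not mention; it assumes the server is running at the default address and that `/process` and `/train` take no request body. A single session is reused in case the server tracks uploads per client, which is also an assumption. The function name `run_workflow` is hypothetical.

```python
import requests

BASE_URL = "http://localhost:5000"  # default address from the Usage section


def run_workflow(los_path, nlos_path, base_url=BASE_URL, session=None):
    """Upload both files, then call /process and /train.

    Returns the three decoded JSON responses in call order.
    """
    http = session if session is not None else requests.Session()
    with open(los_path, "rb") as los, open(nlos_path, "rb") as nlos:
        # Field names match the request description for POST /upload.
        upload = http.post(
            f"{base_url}/upload",
            files={"los_file": los, "nlos_file": nlos},
        )
    upload.raise_for_status()
    process = http.post(f"{base_url}/process")
    process.raise_for_status()
    train = http.post(f"{base_url}/train")
    train.raise_for_status()
    return upload.json(), process.json(), train.json()
```

A call such as `run_workflow("los.xlsx", "nlos.xlsx")` (file names are placeholders) would return the three response bodies, from which fields like `best_model` can be read.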

Configuration

Environment Variables

Create a .env file for configuration:

# Flask Configuration
FLASK_ENV=development
FLASK_DEBUG=True
SECRET_KEY=your-secret-key-here

# Upload Configuration
MAX_CONTENT_LENGTH=16777216  # 16MB in bytes
UPLOAD_FOLDER=uploads/

# Model Configuration
MODEL_PATH=models/
RESULTS_PATH=results/
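
A sketch of how these variables might be read in Python is shown below. It assumes the values are already present in the process environment (for example exported in the shell or loaded by a tool such as python-dotenv); the function name `load_settings` and the fallback defaults are assumptions, not the contents of `config.py`.

```python
import os


def load_settings(env=os.environ):
    """Read the settings above from a mapping of environment variables."""
    return {
        "SECRET_KEY": env.get("SECRET_KEY"),
        # Treat common truthy spellings as True; anything else is False.
        "DEBUG": env.get("FLASK_DEBUG", "False").lower() in {"1", "true", "yes"},
        # 16 MB default, matching the documented upload limit.
        "MAX_CONTENT_LENGTH": int(env.get("MAX_CONTENT_LENGTH", 16 * 1024 * 1024)),
        "UPLOAD_FOLDER": env.get("UPLOAD_FOLDER", "uploads/"),
        "MODEL_PATH": env.get("MODEL_PATH", "models/"),
        "RESULTS_PATH": env.get("RESULTS_PATH", "results/"),
    }
```

Accepting the mapping as a parameter keeps the function easy to test with a plain dictionary instead of the real environment.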

Application Settings

Modify config.py for advanced configuration:

Model hyperparameter grids
Feature engineering parameters
Data preprocessing thresholds
Logging levels

Project Structure

GNSS-CLASSIFICATION/
├── gnss_classification/
│   ├── app.py                      # Flask web application
│   ├── run_pipeline.py             # CLI pipeline orchestrator
│   ├── requirements.txt            # Python dependencies
│   ├── config.py                   # Configuration settings
│   ├── src/
│   │   ├── __init__.py
│   │   ├── data_preprocessing.py   # Data loading & preprocessing
│   │   ├── feature_engineering.py  # Feature creation & PCA
│   │   ├── model_training.py       # Model training & tuning
│   │   └── evaluation.py           # Model evaluation & visualization
│   ├── templates/
│   │   └── index.html              # Web UI template
│   ├── data/
│   │   ├── raw/                    # Raw input data
│   │   ├── processed/              # Processed datasets
│   │   └── examples/               # Example data files
│   ├── models/                     # Saved trained models (.joblib)
│   ├── results/                    # Evaluation results & plots
│   └── uploads/                    # Temporary file uploads
├── docs/                           # Detailed documentation
│   ├── API.md                      # API reference
│   ├── DATA_FORMAT.md              # Data format specifications
│   ├── TROUBLESHOOTING.md          # Common issues & solutions
│   └── DEVELOPMENT.md              # Development guide
├── tests/                          # Unit and integration tests
├── examples/                       # Usage examples & notebooks
│   └── example_usage.ipynb         # Jupyter notebook tutorial
├── .github/
│   └── workflows/
│       └── ci.yml                  # GitHub Actions CI/CD
├── .gitignore
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── CHANGELOG.md
└── setup.py                        # Package installation script

Development

Setting Up Development Environment

# Clone and install in editable mode
git clone https://github.com/yourusername/GNSS-CLASSIFICATION.git
cd GNSS-CLASSIFICATION
pip install -e .

# Install development dependencies
pip install -r requirements-dev.txt

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=gnss_classification --cov-report=html

Code Quality

# Format code
black gnss_classification/

# Lint code
flake8 gnss_classification/

# Type checking
mypy gnss_classification/

Troubleshooting

Common Issues

1. Import Errors

ImportError: No module named 'xgboost'

Solution: Ensure all dependencies are installed:

pip install -r requirements.txt

2. File Upload Fails

Error: File too large

Solution: Keep each file under 16 MB, or increase MAX_CONTENT_LENGTH in the configuration.

3. Model Training Crashes

MemoryError during training

Solution:

Reduce hyperparameter grid size
Use smaller dataset
Increase system RAM
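
To judge how much a smaller grid would help, it is useful to count the model fits an exhaustive search performs: the product of the option counts, times the number of cross-validation folds. The sketch below applies this to the grids and the 5-fold setting from the Model Training section; the helper name is hypothetical.

```python
from math import prod

# Grids copied from the Model Training section.
GRIDS = {
    "random_forest": {
        "n_estimators": [100, 200, 300],
        "max_depth": [10, 20, None],
        "min_samples_split": [2, 5, 10],
    },
    "gradient_boosting": {
        "n_estimators": [100, 200, 300],
        "learning_rate": [0.01, 0.1, 0.2],
        "max_depth": [3, 5, 7],
    },
    "xgboost": {
        "n_estimators": [100, 200, 300],
        "learning_rate": [0.01, 0.1, 0.2],
        "max_depth": [3, 5, 7],
    },
    "svm": {
        "C": [0.1, 1, 10],
        "kernel": ["rbf", "poly"],
        "gamma": ["scale", "auto"],
    },
}
CV_FOLDS = 5


def count_fits(grid, folds=CV_FOLDS):
    """Fits performed by an exhaustive grid search, excluding the final refit."""
    return prod(len(values) for values in grid.values()) * folds


for name, grid in GRIDS.items():
    print(f"{name}: {count_fits(grid)} fits")
```

Each three-by-three-by-three grid costs 135 fits and the SVM grid 60, so dropping one value from a three-value parameter removes a third of that model's fits.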

4. Missing Columns Error

KeyError: 'PRN'

Solution: Ensure Excel file has all required columns (see Data Format).

Full Troubleshooting Guide

See docs/TROUBLESHOOTING.md for comprehensive solutions.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick Contribution Guide

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Citation

If you use this code in your research, please cite:

@software{gnss_classification_2024,
  title = {GNSS Line-of-Sight Classification System},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/yourusername/GNSS-CLASSIFICATION}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Getting Help

Documentation: Check the docs/ directory
Issues: GitHub Issues
Discussions: GitHub Discussions

Contact

For questions or collaboration opportunities:

Email: your.email@example.com
LinkedIn: Your Profile

Acknowledgments

Developed for GNSS signal analysis and urban canyon navigation research
Built with scikit-learn, XGBoost, Flask, and other open-source libraries
Inspired by research in satellite positioning and machine learning

Made with ❤️ for the GNSS research community

8harath/GNSS-CLASSIFICATION
