
deathbyginger64/NetGaurd-AI



NETGUARD AI

Hybrid Machine Learning · Network Intrusion Detection · Real-Time SOC Dashboard

Built as a real-time SOC simulation system for cybersecurity research and education


Python Scikit-learn Streamlit Scapy SQLite





📌 What is NetGuard AI?

NetGuard AI is a real-time, AI-powered Network Intrusion Detection System (IDS) that monitors live network traffic, detects anomalies using a hybrid machine learning engine, and presents threats through a professional Security Operations Center (SOC) dashboard.

Unlike traditional signature-based IDS tools, NetGuard AI uses unsupervised machine learning: it learns what normal traffic looks like on your own network and flags deviations automatically, without requiring a labelled attack dataset.


  📡 Live Traffic  →  🧠 Hybrid ML  →  ⚡ Confidence Score  →  📊 SOC Dashboard


✨ Key Features


| # | Feature | Description |
|---|---------|-------------|
| 01 | 📡 Live Packet Capture | Real-time traffic sniffing via Scapy on a configurable network interface |
| 02 | 🧠 Hybrid ML Detection | Isolation Forest + One-Class SVM working in parallel |
| 03 | ⚡ 3-Tier Confidence Scoring | HIGH / MEDIUM / LOW based on dual-model agreement and anomaly score |
| 04 | 🔍 Rule-Based Classification | Pattern engine labels attack types: Flooding, Scan, Exfiltration |
| 05 | 🧾 SQLite Event Logging | All detections persisted with full schema and automatic migration |
| 06 | 📊 Professional SOC Dashboard | Live charts, alert feed, model metrics, and threat timeline |
| 07 | 🚨 Attack Simulator | Built-in DDoS / burst traffic generator for testing and demonstration |
| 08 | 📝 Audit Log File | All IDS events printed to terminal with timestamps for real-time review |


🏗️ System Architecture


┌─────────────────────────────────────────────────────────┐
│                   LIVE NETWORK TRAFFIC                  │
│              (1-second sliding windows)                 │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                    SCAPY CAPTURE                        │
│         Bound to wlan0 / eth0 interface                 │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                  FEATURE EXTRACTION                     │
│  packet_count · avg_size · max_size · src_ips · dst_ips │
└──────────────┬──────────────────────────┬───────────────┘
               │                          │
               ▼                          ▼
┌──────────────────────┐    ┌─────────────────────────────┐
│   ISOLATION FOREST   │    │       ONE-CLASS SVM         │
│   Anomaly scoring    │    │   Boundary classification   │
│   (raw features)     │    │   (StandardScaler applied)  │
└──────────┬───────────┘    └──────────────┬──────────────┘
           │                               │
           └──────────────┬────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│              HYBRID CONFIDENCE ASSIGNMENT               │
│                 🔴 HIGH · 🟠 MEDIUM · 🟢 LOW             │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│             RULE-BASED ATTACK CLASSIFICATION            │
│    Flooding · Network Scan · Exfiltration · Burst       │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                   SQLITE DATABASE                       │
│         Persistent event log with full schema           │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│              STREAMLIT SOC DASHBOARD                    │
│           Auto-refreshes every 1 second                 │
└─────────────────────────────────────────────────────────┘


📦 Dataset Preparation Pipeline

NetGuard AI builds its own training dataset from real traffic captured on your machine; no external datasets or manual labelling are required.


Packet Capture  →  Feature Extraction  →  Dataset Merging  →  Model Training

🔹 Step 1: Packet Capture

sudo venv/bin/python src/capture_single_device.py

Captures raw packet metadata (source/destination IPs, protocol, size) over a configurable duration and saves it to data/raw/. Run it several times under different conditions (idle, browsing, streaming, downloading) to build a diverse normal-traffic baseline.


🔹 Step 2: Feature Extraction

python src/feature_extractor.py

Aggregates raw packets into 1-second feature windows:

| Feature | Description |
|---------|-------------|
| packet_count | Total packets observed in the window |
| avg_packet_size | Mean packet size in bytes |
| max_packet_size | Largest single packet observed |
| unique_src_ips | Number of distinct source IP addresses |
| unique_dst_ips | Number of distinct destination IP addresses |

Output saved to data/processed/.
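
The aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/feature_extractor.py: the function name `extract_features`, the `(timestamp, src_ip, dst_ip, size)` tuple layout, and the `window_start` column are assumptions, and packets are grouped into fixed 1-second buckets by truncating the timestamp.

```python
from collections import defaultdict
from statistics import mean


def extract_features(packets):
    """Group packets into 1-second buckets and compute the five features.

    `packets` is an iterable of (timestamp, src_ip, dst_ip, size) tuples.
    This layout is an assumption made for the sketch.
    """
    windows = defaultdict(list)
    for ts, src, dst, size in packets:
        # int() truncates the timestamp, so all packets in [n, n+1) share a bucket.
        windows[int(ts)].append((src, dst, size))

    rows = []
    for second in sorted(windows):
        pkts = windows[second]
        sizes = [size for _, _, size in pkts]
        rows.append({
            "window_start": second,  # hypothetical bookkeeping column
            "packet_count": len(pkts),
            "avg_packet_size": mean(sizes),
            "max_packet_size": max(sizes),
            "unique_src_ips": len({src for src, _, _ in pkts}),
            "unique_dst_ips": len({dst for _, dst, _ in pkts}),
        })
    return rows
```

Each returned dictionary corresponds to one row of a processed feature file; seconds with no packets produce no row in this sketch.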


🔹 Step 3: Dataset Merging

python src/merge_datasets.py

Combines all processed feature files, appends source labels for traceability, shuffles for randomness, and saves the final training dataset to:

data/processed/combined_features.csv

Why multiple captures? Training on traffic from different real-world conditions teaches the model what normal looks like across a variety of scenarios, which helps reduce false positives during live detection.
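
As a rough sketch of the merge step, the snippet below reads several feature CSVs, tags each row with the file it came from, shuffles, and writes one combined file. It is not the code in src/merge_datasets.py: the function name `merge_feature_files`, the `source` column name, and the fixed shuffle seed are assumptions, and all input files are assumed to share the same columns.

```python
import csv
import random
from pathlib import Path


def merge_feature_files(paths, out_path, seed=42):
    """Combine feature CSVs, add a source label per row, shuffle, and save."""
    rows = []
    for path in paths:
        path = Path(path)
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                # Hypothetical traceability column: the input file name without extension.
                row["source"] = path.stem
                rows.append(row)
    if not rows:
        raise ValueError("no feature rows found in the given files")

    # A seeded shuffle keeps the combined dataset reproducible between runs.
    random.Random(seed).shuffle(rows)

    with Path(out_path).open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

The source label is only for traceability; it would need to be dropped before the rows are passed to a model.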



⚙️ Tech Stack


| Layer | Technology | Purpose |
|-------|------------|---------|
| Language | Python 3.9+ | Core runtime |
| Anomaly Detection | Isolation Forest | Unsupervised scoring |
| Boundary Validation | One-Class SVM | Decision boundary enforcement |
| Feature Scaling | StandardScaler | Normalise features for SVM |
| Packet Capture | Scapy | Live network sniffing |
| Dashboard | Streamlit + Altair | Visualisation and UI |
| Database | SQLite3 | Event persistence |
| Model Persistence | Joblib | Save and load .pkl models |
| Data Processing | Pandas + NumPy | Feature engineering |


📁 Project Structure

NetGuard-AI/
│
├── src/
│   ├── isolation_forest_model.py   ← Live IDS engine (main detection loop)
│   ├── train_isolation_forest.py   ← Train and save Isolation Forest
│   ├── train_ocsvm.py              ← Train and save One-Class SVM + scaler
│   ├── capture_single_device.py    ← Raw packet capture to CSV
│   ├── feature_extractor.py        ← Per-second feature engineering
│   ├── merge_datasets.py           ← Combine feature CSVs for training
│   ├── db_manager.py               ← SQLite init, migration, fast insert
│   ├── suggestion_engine.py        ← Rule-based attack type classifier
│   ├── device_profiler.py          ← Network profile ID generator
│   ├── dashboard.py                ← Streamlit SOC dashboard
│   └── attack_simulator.py         ← Traffic attack simulator
│
├── data/
│   ├── raw/                        ← Raw packet CSVs from capture
│   └── processed/                  ← Feature CSVs + combined dataset
│
├── models/                         ← Auto-generated after training
│   ├── isolation_forest.pkl
│   ├── ocsvm_model.pkl
│   └── ocsvm_scaler.pkl
│
├── netguard.db                     ← SQLite event log (auto-created)
├── requirements.txt
└── README.md
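
To illustrate what "SQLite init, migration, fast insert" could look like, here is a minimal sketch using only the standard-library sqlite3 module. It is not the contents of src/db_manager.py: the `events` table name, the column set in `EXPECTED_COLUMNS`, and both function names are assumptions chosen for the example.

```python
import sqlite3

# Hypothetical schema for the sketch; the real project schema may differ.
EXPECTED_COLUMNS = {
    "timestamp": "TEXT",
    "packet_count": "INTEGER",
    "anomaly_score": "REAL",
    "confidence": "TEXT",
    "alert_type": "TEXT",
}


def init_db(conn):
    """Create the events table if missing and add any columns it lacks."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY AUTOINCREMENT)"
    )
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(events)")}
    for name, sql_type in EXPECTED_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE events ADD COLUMN {name} {sql_type}")
    conn.commit()


def insert_event(conn, event):
    """Insert one detection; `event` is a dict keyed by EXPECTED_COLUMNS."""
    columns = ", ".join(EXPECTED_COLUMNS)
    placeholders = ", ".join(f":{name}" for name in EXPECTED_COLUMNS)
    conn.execute(f"INSERT INTO events ({columns}) VALUES ({placeholders})", event)
    conn.commit()
```

Because `init_db` only adds columns that are missing, it can be called on every start-up, including against a database file created by an older version of the schema.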


▶️ How to Run

Prerequisites

  • Python 3.9 or higher
  • Root / sudo access (required for packet capture)
  • Linux or macOS recommended (Scapy on Windows additionally requires Npcap)

Setup

1 · Clone the repository

git clone https://github.com/deathbyginger64/NetGuard-AI.git
cd NetGuard-AI

2 · Create and activate a virtual environment

python3 -m venv venv
source venv/bin/activate

3 · Install all dependencies

pip install -r requirements.txt

Build the Training Dataset

4 · Capture normal network traffic (run 2–3 times under different conditions)

sudo venv/bin/python src/capture_single_device.py

5 · Extract features from captured data

python src/feature_extractor.py

6 · Merge all captured datasets

python src/merge_datasets.py

Train the Models

7 · Train Isolation Forest

python src/train_isolation_forest.py

8 · Train One-Class SVM

python src/train_ocsvm.py
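
For orientation, the two training steps might look roughly like the sketch below, which fits an Isolation Forest on raw features and a One-Class SVM on StandardScaler output, matching the architecture described earlier. It is not the code in the training scripts: the function names and the hyperparameters (`n_estimators=100`, `nu=0.05`, `gamma="scale"`) are assumptions, and treating the "IF score" as scikit-learn's `decision_function` output is also an assumption.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def train_models(X, seed=42):
    """Fit both detectors on normal-traffic feature rows (one row per window)."""
    # Isolation Forest is tree-based, so it is trained on unscaled features.
    iso = IsolationForest(n_estimators=100, random_state=seed).fit(X)
    # The SVM is distance-based, so features are standardised first.
    scaler = StandardScaler().fit(X)
    ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(scaler.transform(X))
    return iso, scaler, ocsvm


def score_window(iso, scaler, ocsvm, features):
    """Return (if_score, if_anomaly, svm_anomaly) for one feature window."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    if_score = float(iso.decision_function(x)[0])  # lower means more anomalous
    if_anomaly = bool(iso.predict(x)[0] == -1)     # -1 marks an outlier
    svm_anomaly = bool(ocsvm.predict(scaler.transform(x))[0] == -1)
    return if_score, if_anomaly, svm_anomaly
```

The fitted objects can be saved with `joblib.dump` and reloaded with `joblib.load`; the scaler has to be saved alongside the SVM so that live windows are transformed the same way as the training data.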

Run the System

9 · Start the live IDS engine

Ensure the models/ folder contains trained .pkl files before starting. Run steps 7 and 8 first if you haven't already.

sudo venv/bin/python src/isolation_forest_model.py

10 · Launch the SOC dashboard (in a separate terminal)

streamlit run src/dashboard.py

11 · Run attack simulation (optional, for testing and demonstration)

sudo venv/bin/python src/attack_simulator.py


🧠 Detection Logic

Confidence Assignment Rules

| Condition | Confidence | Meaning |
|-----------|------------|---------|
| Both models flag anomaly and IF score < -0.1 | 🔴 HIGH | Full model agreement with strong anomaly score |
| Both models flag anomaly with borderline IF score | 🟠 MEDIUM | Agreement without strong signal; suspicious |
| Rule engine threshold breached | 🔴 HIGH | Clear behavioural anomaly; rule override |
| Only one model flags anomaly | 🟠 MEDIUM | Partial signal; possible noise or edge case |
| Both models report normal | 🟢 LOW | Traffic within expected parameters |
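
The rules in this table can be written as a small function. This is a sketch of one reading of the table, not the project's implementation: the function and parameter names are hypothetical, and checking the rule override before the model votes is an assumption about precedence.

```python
def assign_confidence(if_anomaly, svm_anomaly, if_score,
                      rule_breached=False, strong_score=-0.1):
    """Map model votes, the Isolation Forest score and the rule flag to a tier."""
    if rule_breached:
        # A breached rule threshold overrides the model votes.
        return "HIGH"
    if if_anomaly and svm_anomaly:
        # Both models agree; the score separates strong from borderline cases.
        return "HIGH" if if_score < strong_score else "MEDIUM"
    if if_anomaly or svm_anomaly:
        # Only one model flagged the window.
        return "MEDIUM"
    return "LOW"
```

For example, `assign_confidence(True, True, -0.25)` returns "HIGH", while the same votes with a score of -0.05 return "MEDIUM".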

Rule Engine Thresholds

| Trigger | Threshold | Alert Label |
|---------|-----------|-------------|
| Packet volume spike | packet_count > 100 per second | Possible Flooding Attack |
| Wide destination spread | unique_dst_ips > 10 and packet_count > 30 | Possible Network Scan |
| Large payload, low volume | avg_packet_size > 700 bytes and packet_count < 30 | Possible Data Exfiltration |
| Default fallback | (none) | Suspicious Burst Traffic |
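
A minimal sketch of these thresholds as code is shown below. The function name and the top-to-bottom evaluation order are assumptions; the thresholds and labels are taken from the table, and this is not the code in src/suggestion_engine.py.

```python
def classify_attack(packet_count, avg_packet_size, unique_dst_ips):
    """Label one 1-second feature window using the rule thresholds."""
    if packet_count > 100:
        return "Possible Flooding Attack"
    if unique_dst_ips > 10 and packet_count > 30:
        return "Possible Network Scan"
    if avg_packet_size > 700 and packet_count < 30:
        return "Possible Data Exfiltration"
    # No specific rule matched.
    return "Suspicious Burst Traffic"
```

With this ordering, a window with 150 packets sent to 40 destinations is labelled as flooding rather than a scan, because the volume rule is checked first.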


📊 Dashboard Capabilities

The SOC dashboard auto-refreshes every second and provides full operational visibility:


| Panel | Description |
|-------|-------------|
| 🎯 Hero Card | System status, active alert count, and current threat rate |
| 🚨 Threat Banner | CRITICAL / ELEVATED / LOW, driven by anomaly rate and average risk score |
| 📡 Live Telemetry | Packets/sec, anomaly score, confidence level, and alert type |
| 📈 Threat Timeline | Full session plotted with colour-coded Normal vs Anomaly events |
| 🔥 Threat Intelligence | Attack type distribution chart and live alert feed cards |
| 🤖 Model Intelligence | Confidence distribution, model agreement %, score scatter plot |
| 📋 Model Evaluation | TP, FP, detection rate, and false positive rate |
| ⚙️ Health Indicators | Real-time IDS engine, database, and ingestion status |
| 📝 Event Log | Last 100 records with anomaly rows highlighted in red |


⚠️ Important Notes

  • This is an unsupervised learning system: no labelled attack data is required
  • The model learns normal from your own captured traffic
  • False negatives (FN) are approximated as 0 because there is no labelled ground truth to count missed attacks against, so the reported detection rate is optimistic
  • Detection metrics (TP, FP, precision, detection rate) are estimates that use model confidence levels as a proxy for ground truth
  • Change the network interface (wlan0 / eth0) in the capture and IDS scripts to match your system's active interface
  • Root / sudo is required for Scapy to access the network interface
  • The system is trained on self-captured network traffic to simulate real-world anomaly detection scenarios
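
To make the proxy-metric idea concrete, the sketch below shows one possible mapping from confidence levels to confusion-matrix counts. The mapping itself (HIGH counted as a true positive, MEDIUM as a false positive, LOW as a true negative) is an assumption for illustration only; the source does not specify how the dashboard derives its numbers, and the function name is hypothetical.

```python
from collections import Counter


def estimate_metrics(confidences):
    """Estimate metrics from a list of 'HIGH' / 'MEDIUM' / 'LOW' labels."""
    counts = Counter(confidences)
    tp = counts["HIGH"]    # assumed: high-confidence alerts are real detections
    fp = counts["MEDIUM"]  # assumed: medium-confidence alerts are false alarms
    tn = counts["LOW"]     # assumed: low-confidence windows are normal traffic
    fn = 0                 # no ground truth, so missed attacks cannot be counted

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "precision": ratio(tp, tp + fp),
        "detection_rate": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }
```

Under this mapping the detection rate is 1.0 whenever at least one HIGH alert exists, which is a direct consequence of fixing FN at 0 rather than a measured result.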


🎯 Use Cases

  • Network anomaly detection in research and home lab environments
  • Cybersecurity final year and capstone projects
  • SOC simulation and live demonstration to technical panels
  • Benchmarking unsupervised ML approaches to network intrusion detection


🔭 Future Improvements

  • Deep learning integration: LSTM and Autoencoder models for temporal anomaly detection
  • Docker containerisation for portable, reproducible deployment
  • Multi-device and multi-interface simultaneous monitoring
  • Integration with benchmark datasets: KDD Cup 99, CICIDS 2017/2018
  • Real-time email and SMS alerting on CRITICAL threat level
  • Exportable PDF incident reports generated from the dashboard
  • REST API endpoint for external SIEM tool integration


👨‍💻 Authors

Aditya Khandelwal, B.Tech CSE (Cyber Security)
Astha Chakraborty, B.Tech CSE (Cyber Security)


Built for educational and research purposes · NetGuard AI
