NetGuard AI
Hybrid Machine Learning · Network Intrusion Detection · Real-Time SOC Dashboard
Built as a real-time SOC simulation system for cybersecurity research and education
NetGuard AI is a real-time, AI-powered Network Intrusion Detection System (IDS) that monitors live network traffic, detects anomalies using a hybrid machine learning engine, and presents threats through a professional Security Operations Center (SOC) dashboard.
Unlike traditional signature-based IDS tools, NetGuard AI uses unsupervised machine learning: it learns what normal traffic looks like on your own network and automatically flags deviations, without requiring a labelled attack dataset.
Live Traffic → Hybrid ML → Confidence Score → SOC Dashboard
| # | Feature | Description |
|---|---|---|
| 01 | Live Packet Capture | Real-time traffic sniffing via Scapy on a configurable network interface |
| 02 | Hybrid ML Detection | Isolation Forest + One-Class SVM working in parallel |
| 03 | 3-Tier Confidence Scoring | HIGH / MEDIUM / LOW based on dual-model agreement and anomaly score |
| 04 | Rule-Based Classification | Pattern engine labels attack types: Flooding, Scan, Exfiltration |
| 05 | SQLite Event Logging | All detections persisted with full schema and automatic migration |
| 06 | Professional SOC Dashboard | Live charts, alert feed, model metrics, and threat timeline |
| 07 | Attack Simulator | Built-in DDoS / burst traffic generator for testing and demonstration |
| 08 | Audit Log File | All IDS events printed to terminal with timestamps for real-time review |
┌─────────────────────────────────────────────────────────┐
│                  LIVE NETWORK TRAFFIC                   │
│               (1-second sliding windows)                │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                      SCAPY CAPTURE                      │
│             Bound to wlan0 / eth0 interface             │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                   FEATURE EXTRACTION                    │
│ packet_count · avg_size · max_size · src_ips · dst_ips  │
└─────────────┬─────────────────────────────┬─────────────┘
              │                             │
              ▼                             ▼
┌───────────────────────────┐ ┌───────────────────────────┐
│     ISOLATION FOREST      │ │       ONE-CLASS SVM       │
│      Anomaly scoring      │ │  Boundary classification  │
│      (raw features)       │ │ (StandardScaler applied)  │
└─────────────┬─────────────┘ └─────────────┬─────────────┘
              │                             │
              └──────────────┬──────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│              HYBRID CONFIDENCE ASSIGNMENT               │
│                   HIGH · MEDIUM · LOW                   │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│            RULE-BASED ATTACK CLASSIFICATION             │
│     Flooding · Network Scan · Exfiltration · Burst      │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     SQLITE DATABASE                     │
│          Persistent event log with full schema          │
└────────────────────────────┬────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                 STREAMLIT SOC DASHBOARD                 │
│              Auto-refreshes every 1 second              │
└─────────────────────────────────────────────────────────┘
NetGuard AI builds its own training dataset from real traffic captured on your machine. No external datasets or manual labelling are required.
Packet Capture → Feature Extraction → Dataset Merging → Model Training
sudo python src/capture_single_device.py
Captures raw packet metadata (source/destination IPs, protocol, size) over a configurable duration and saves it to data/raw/. Run it multiple times under different conditions (idle, browsing, streaming, downloading) to build a diverse normal-traffic baseline.
python src/feature_extractor.py
Aggregates raw packets into 1-second feature windows:
| Feature | Description |
|---|---|
| packet_count | Total packets observed in the window |
| avg_packet_size | Mean packet size in bytes |
| max_packet_size | Largest single packet observed, in bytes |
| unique_src_ips | Number of distinct source IP addresses |
| unique_dst_ips | Number of distinct destination IP addresses |
Output saved to data/processed/.
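The windowing step can be sketched as follows. This is a minimal illustration, not the contents of src/feature_extractor.py: the input record keys (`timestamp`, `src_ip`, `dst_ip`, `size`) and the `window_start` column are hypothetical, and it assumes fixed, non-overlapping one-second buckets.

```python
from collections import defaultdict


def extract_features(packets):
    """Aggregate packet records into per-second feature rows.

    `packets` is an iterable of dicts with hypothetical keys:
    'timestamp' (seconds, float), 'src_ip', 'dst_ip', 'size' (bytes).
    """
    windows = defaultdict(list)
    for pkt in packets:
        # Bucket each packet by the whole second it arrived in.
        windows[int(pkt["timestamp"])].append(pkt)

    rows = []
    for second in sorted(windows):
        group = windows[second]
        sizes = [p["size"] for p in group]
        rows.append({
            "window_start": second,
            "packet_count": len(group),
            "avg_packet_size": sum(sizes) / len(sizes),
            "max_packet_size": max(sizes),
            "unique_src_ips": len({p["src_ip"] for p in group}),
            "unique_dst_ips": len({p["dst_ip"] for p in group}),
        })
    return rows


# Made-up sample packets for illustration only.
sample = [
    {"timestamp": 10.1, "src_ip": "192.168.1.5", "dst_ip": "8.8.8.8", "size": 60},
    {"timestamp": 10.7, "src_ip": "192.168.1.5", "dst_ip": "1.1.1.1", "size": 1500},
    {"timestamp": 11.2, "src_ip": "192.168.1.9", "dst_ip": "8.8.8.8", "size": 400},
]
features = extract_features(sample)
```

Seconds with no packets produce no row in this sketch; whether the real extractor emits zero-count windows is not stated here.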
python src/merge_datasets.py
Combines all processed feature files, appends source labels for traceability, shuffles for randomness, and saves the final training dataset to:
data/processed/combined_features.csv
Why multiple captures? Training on traffic from different real-world conditions teaches the model what normal looks like across a variety of scenarios, which reduces false positives during live detection.
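The merge step (combine, label by source, shuffle) might look like the sketch below. It uses only the standard library rather than the project's Pandas stack, and the function name `merge_feature_files`, the `source` column name, and the fixed seed are assumptions made for illustration.

```python
import csv
import random
from pathlib import Path


def merge_feature_files(paths, out_path, seed=42):
    """Merge per-capture feature CSVs into one shuffled training file.

    Assumes every input file has the same columns.
    Returns the number of rows written.
    """
    rows = []
    for path in map(Path, paths):
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                # Record which capture the row came from (hypothetical column).
                row["source"] = path.stem
                rows.append(row)
    if not rows:
        raise ValueError("no feature rows found")

    # Seeded shuffle so the merged dataset is reproducible.
    random.Random(seed).shuffle(rows)

    with Path(out_path).open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as `merge_feature_files(["data/processed/idle.csv", "data/processed/browsing.csv"], "data/processed/combined_features.csv")` would then produce the combined file; the two input file names are placeholders.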
| Layer | Technology | Purpose |
|---|---|---|
| Language | Python 3.9+ | Core runtime |
| Anomaly Detection | Isolation Forest | Unsupervised scoring |
| Boundary Validation | One-Class SVM | Decision boundary enforcement |
| Feature Scaling | StandardScaler | Normalise features for SVM |
| Packet Capture | Scapy | Live network sniffing |
| Dashboard | Streamlit + Altair | Visualisation and UI |
| Database | SQLite3 | Event persistence |
| Model Persistence | Joblib | Save and load .pkl models |
| Data Processing | Pandas + NumPy | Feature engineering |
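To show how the Isolation Forest, One-Class SVM, and StandardScaler rows fit together, here is a rough sketch using scikit-learn. The hyperparameters (`contamination`, `nu`, `n_estimators`), the synthetic training data, and the helper `score_window` are assumptions for illustration, not the project's actual training scripts.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for captured "normal" feature rows, in the order:
# packet_count, avg_packet_size, max_packet_size, unique_src_ips, unique_dst_ips
X_train = np.column_stack([
    rng.normal(20, 5, 500),
    rng.normal(400, 50, 500),
    rng.normal(1200, 100, 500),
    rng.integers(1, 4, 500),
    rng.integers(1, 6, 500),
])

# Isolation Forest is fitted on raw features.
iso = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
iso.fit(X_train)

# One-Class SVM is fitted on standardised features.
scaler = StandardScaler().fit(X_train)
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
ocsvm.fit(scaler.transform(X_train))


def score_window(features):
    """Return (if_score, if_flag, svm_flag) for one feature window."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    if_score = float(iso.decision_function(x)[0])  # lower = more anomalous
    if_flag = bool(iso.predict(x)[0] == -1)        # -1 marks an outlier
    svm_flag = bool(ocsvm.predict(scaler.transform(x))[0] == -1)
    return if_score, if_flag, svm_flag


# A flood-like window: very high packet count, small packets, many destinations.
result = score_window([900, 60, 90, 1, 40])
```

In the project, the fitted models and scaler would then be saved with Joblib as the .pkl files listed under models/.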
NetGuard-AI/
│
├── src/
│   ├── isolation_forest_model.py   # Live IDS engine (main detection loop)
│   ├── train_isolation_forest.py   # Train and save Isolation Forest
│   ├── train_ocsvm.py              # Train and save One-Class SVM + scaler
│   ├── capture_single_device.py    # Raw packet capture to CSV
│   ├── feature_extractor.py        # Per-second feature engineering
│   ├── merge_datasets.py           # Combine feature CSVs for training
│   ├── db_manager.py               # SQLite init, migration, fast insert
│   ├── suggestion_engine.py        # Rule-based attack type classifier
│   ├── device_profiler.py          # Network profile ID generator
│   ├── dashboard.py                # Streamlit SOC dashboard
│   └── attack_simulator.py         # Traffic attack simulator
│
├── data/
│   ├── raw/                        # Raw packet CSVs from capture
│   └── processed/                  # Feature CSVs + combined dataset
│
├── models/                         # Auto-generated after training
│   ├── isolation_forest.pkl
│   ├── ocsvm_model.pkl
│   └── ocsvm_scaler.pkl
│
├── netguard.db                     # SQLite event log (auto-created)
├── requirements.txt
└── README.md
- Python 3.9 or higher
- Root / sudo access (required for packet capture)
- Linux or macOS recommended (Scapy has limited Windows support)
1 · Clone the repository
git clone https://github.com/deathbyginger64/NetGuard-AI.git
cd NetGuard-AI
2 · Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate
3 · Install all dependencies
pip install -r requirements.txt
4 · Capture normal network traffic (run 2 to 3 times under different conditions)
sudo venv/bin/python src/capture_single_device.py
5 · Extract features from captured data
python src/feature_extractor.py
6 · Merge all captured datasets
python src/merge_datasets.py
7 · Train Isolation Forest
python src/train_isolation_forest.py
8 · Train One-Class SVM
python src/train_ocsvm.py
9 · Start the live IDS engine
Ensure the models/ folder contains trained .pkl files before starting. Run steps 7 and 8 first if you haven't already.
sudo venv/bin/python src/isolation_forest_model.py
10 · Launch the SOC dashboard (in a separate terminal)
streamlit run src/dashboard.py
11 · Run attack simulation (optional, for testing and demonstration)
sudo venv/bin/python src/attack_simulator.py
| Condition | Confidence | Meaning |
|---|---|---|
| Both models flag anomaly and IF score < −0.1 | HIGH | Full model agreement with strong anomaly score |
| Both models flag anomaly with borderline IF score | MEDIUM | Agreement without strong signal; suspicious |
| Rule engine threshold breached | HIGH | Clear behavioural anomaly; rule override |
| Only one model flags anomaly | MEDIUM | Partial signal; possible noise or edge case |
| Both models report normal | LOW | Traffic within expected parameters |
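The table above can be read as a small decision function. The sketch below is one possible encoding: the function name `assign_confidence` and its parameters are hypothetical, and checking the rule override before the model votes is an assumption about precedence that the table does not spell out.

```python
def assign_confidence(if_anomaly, svm_anomaly, if_score,
                      rule_breached=False, strong_threshold=-0.1):
    """Map dual-model output to HIGH / MEDIUM / LOW.

    if_anomaly / svm_anomaly: True when that model flags the window.
    if_score: Isolation Forest score (more negative = more anomalous).
    rule_breached: True when a rule-engine threshold was exceeded.
    """
    # Assumed precedence: a breached rule overrides the model votes.
    if rule_breached:
        return "HIGH"
    if if_anomaly and svm_anomaly:
        # Agreement is HIGH only when the score is strongly negative.
        return "HIGH" if if_score < strong_threshold else "MEDIUM"
    if if_anomaly or svm_anomaly:
        return "MEDIUM"
    return "LOW"
```

For example, `assign_confidence(True, True, -0.05)` falls in the borderline row and yields MEDIUM, because both models agree but the score does not cross −0.1.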
| Trigger | Threshold | Alert Label |
|---|---|---|
| Packet volume spike | packet_count > 100 per second | Possible Flooding Attack |
| Wide destination spread | unique_dst_ips > 10 and packet_count > 30 | Possible Network Scan |
| Large payload, low volume | avg_size > 700 B and packet_count < 30 | Possible Data Exfiltration |
| Default fallback | n/a | Suspicious Burst Traffic |
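These thresholds translate directly into a rule function. The following is a sketch under assumptions: `classify_attack` is a hypothetical name, and evaluating the rules in table order (flooding first) is a guess at how src/suggestion_engine.py resolves windows that match more than one rule.

```python
def classify_attack(packet_count, avg_size, unique_dst_ips):
    """Label an anomalous one-second window using the thresholds above."""
    if packet_count > 100:
        return "Possible Flooding Attack"
    if unique_dst_ips > 10 and packet_count > 30:
        return "Possible Network Scan"
    if avg_size > 700 and packet_count < 30:
        return "Possible Data Exfiltration"
    # No specific pattern matched.
    return "Suspicious Burst Traffic"
```

With this ordering, a window of 150 packets spread over 25 destinations is labelled as flooding rather than a scan, since the volume rule is checked first.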
The SOC dashboard auto-refreshes every second and provides full operational visibility:
| Panel | Description |
|---|---|
| Hero Card | System status, active alert count, and current threat rate |
| Threat Banner | CRITICAL / ELEVATED / LOW, driven by anomaly rate + avg risk score |
| Live Telemetry | Packets/sec, anomaly score, confidence level, and alert type |
| Threat Timeline | Full session plotted with colour-coded Normal vs Anomaly events |
| Threat Intelligence | Attack type distribution chart and live alert feed cards |
| Model Intelligence | Confidence distribution, model agreement %, score scatter plot |
| Model Evaluation | TP, FP, detection rate, and false positive rate |
| Health Indicators | Real-time IDS engine, database, and ingestion status |
| Event Log | Last 100 records with anomaly rows highlighted in red |
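The Event Log panel depends on detections being written to SQLite and read back newest first. A minimal sketch with the standard `sqlite3` module is shown here; the `events` table, its columns, and the helper names are hypothetical and do not reflect the actual schema in src/db_manager.py.

```python
import sqlite3


def init_db(conn):
    """Create a hypothetical events table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT NOT NULL,
               packet_count INTEGER,
               anomaly_score REAL,
               is_anomaly INTEGER,
               confidence TEXT,
               alert_type TEXT
           )"""
    )


def log_event(conn, ts, packet_count, anomaly_score, is_anomaly,
              confidence, alert_type):
    """Insert one detection result using a parameterised query."""
    conn.execute(
        "INSERT INTO events (ts, packet_count, anomaly_score, is_anomaly,"
        " confidence, alert_type) VALUES (?, ?, ?, ?, ?, ?)",
        (ts, packet_count, anomaly_score, int(is_anomaly),
         confidence, alert_type),
    )
    conn.commit()


def latest_events(conn, limit=100):
    """Return the most recent events, newest first."""
    cur = conn.execute(
        "SELECT ts, packet_count, anomaly_score, is_anomaly, confidence,"
        " alert_type FROM events ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# In-memory database with made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
init_db(conn)
log_event(conn, "2024-01-01T00:00:00", 18, 0.12, False, "LOW", None)
log_event(conn, "2024-01-01T00:00:01", 450, -0.23, True, "HIGH",
          "Possible Flooding Attack")
rows = latest_events(conn)
```

A dashboard could call `latest_events(conn)` on each refresh and highlight rows where `is_anomaly` is 1.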
- This is an unsupervised learning system; no labelled attack data is required
- The model learns normal from your own captured traffic
- False Negatives (FN) are approximated as 0 because no labelled ground truth is available, so the reported metrics are estimates
- Detection metrics (TP, FP, precision, detection rate) are estimated using model confidence levels as a proxy
- Change the network interface (wlan0/eth0) in the capture and IDS scripts to match your system's active interface
- Root / sudo is required for Scapy to access the network interface
Note: This system is trained on self-captured network traffic to simulate real-world anomaly detection scenarios.
- Network anomaly detection in research and home lab environments
- Cybersecurity final year and capstone projects
- SOC simulation and live demonstration to technical panels
- Benchmarking unsupervised ML approaches to network intrusion detection
- Deep learning integration: LSTM and Autoencoder models for temporal anomaly detection
- Docker containerisation for portable, reproducible deployment
- Multi-device and multi-interface simultaneous monitoring
- Integration with benchmark datasets: KDD Cup 99, CICIDS 2017/2018
- Real-time email and SMS alerting on CRITICAL threat level
- Exportable PDF incident reports generated from the dashboard
- REST API endpoint for external SIEM tool integration
| Aditya Khandelwal | Astha Chakraborty |
|---|---|
| B.Tech CSE (Cyber Security) | B.Tech CSE (Cyber Security) |
Built for educational and research purposes · NetGuard AI