CGRAG: CyberGuard RAG

CGRAG (CyberGuard RAG) is an AI-powered cybersecurity threat detection and analysis system built on top of the Qdrant vector database, using Retrieval-Augmented Generation (RAG) techniques.

It enables fast similarity search across malware signatures, real-time detection of network anomalies, and intelligent querying of cyber threat intelligence (like CVEs), all backed by powerful language models.

Features

Malware Similarity Detection Identify similarity between incoming files and known malware signatures using vector embeddings.
Network Anomaly Detection Learn typical network behavior and instantly detect abnormal or malicious traffic.
Cyber Threat Intelligence Retrieval Use semantic search to extract relevant threat data from CVE datasets or other intelligence sources.
Real-Time RAG-Based Threat Analysis Combine retrieval and generation for contextual security awareness and automated reasoning.

Quick Start

1. Clone & Install Dependencies

bash git clone https://github.com/annoeyed/CGRAG.git cd CGRAG pip install -r requirements.txt

2. Launch Qdrant

bash docker-compose up -d

3. Prepare Data

First, create the necessary collections in Qdrant. bash python scripts/create_collections.py Next, fetch the initial datasets. bash python scripts/scripts/fetch_cve_from_nvd.py python scripts/fetch_network_logs.py Finally, load the data into Qdrant. bash python scripts/load_data_to_qdrant.py

4. Run the System

bash python main.py

Usage Examples

You can test the system by running the test_query.py script. It allows you to pass a query directly from the command line.

Malware Analysis

bash python tests/test_query.py "Analyze ransomware similar to WannaCry"

CVE Information Retrieval

bash python tests/test_query.py "Find vulnerabilities related to Apache Log4j"

Architecture

Qdrant – High-performance vector DB for storing and querying embeddings
Sentence Transformers – Semantic embedding of malware descriptions, logs, CVE entries
Custom Security Modules – Malware classification, anomaly detection, CVE similarity search

Performance (Sample)

Feature	Result
Malware Detection Acc.	94%+
Query Latency	< 100 ms
Anomaly Detection	Real-time

Contributing

Fork the repo
Create a new branch: bash git checkout -b feature/YourFeature
Commit your changes: bash git commit -m "Add YourFeature"
Push to Github: bash git push origin feature/YourFeature
Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for full details.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
config		config
data		data
notebooks		notebooks
scripts		scripts
src		src
tests		tests
.gitignore		.gitignore
README.md		README.md
docker-compose.yml		docker-compose.yml
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CGRAG: CyberGuard RAG

Features

Quick Start

1. Clone & Install Dependencies

2. Launch Qdrant

3. Prepare Data

4. Run the System

Usage Examples

Malware Analysis

CVE Information Retrieval

Architecture

Performance (Sample)

Contributing

License

About

Uh oh!

Releases

Packages

Languages

annoeyed/CGRAG

Folders and files

Latest commit

History

Repository files navigation

CGRAG: CyberGuard RAG

Features

Quick Start

1. Clone & Install Dependencies

2. Launch Qdrant

3. Prepare Data

4. Run the System

Usage Examples

Malware Analysis

CVE Information Retrieval

Architecture

Performance (Sample)

Contributing

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages