A Cognitive Purple Agent Framework for Autonomous Adversarial Simulation and Real-Time SIEM Validation
"VANGUARD transforms Breach and Attack Simulation (BAS) from static, deterministic playbooks into a dynamic, mathematically validated Cyber Wargaming arena."
Read the full academic preprint: 10.5281/zenodo.18846075
The rapid evolution of Advanced Persistent Threats (APTs) severely outpaces the scaling capabilities of static Security Operations Centers (SOCs). Conventional penetration testing and "dumb" replay engines execute known Indicators of Compromise (IoCs) but fail to emulate the adaptive, lateral reasoning of a human threat actor.
VANGUARD is an open-source, mathematically grounded framework that introduces the Cognitive Purple Agent. Engineered for PhD researchers, defense contractors, and enterprise SecOps teams, VANGUARD addresses two critical problems in offensive AI:
- The "Black Box" Validation Gap: VANGUARD doesn't just attack; it streams its kill-chain telemetry ($t_{attack}$) to a local Elasticsearch/Kibana SIEM to compute its exact Time-to-Detect (TTD). When it discovers a 0.0% SOC alert gap, the agent reverses its ontology and synthesizes real KQL defensive rules to actively patch the SIEM.
- The Agentic Alignment Problem: Unconstrained LLMs cannot be trusted with generic shell access without risking the destruction of the host or CI/CD system. VANGUARD pioneers the FATAL_OS_BLOCKLIST, which gives the agent full operational autonomy (e.g., dynamically resolving apt/brew dependencies for a target) while blocking any command that matches a destructive regex pattern.
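As a rough illustration of the Time-to-Detect idea, the sketch below pairs each attack step with the earliest matching SIEM alert and reports TTD as the alert time minus $t_{attack}$, along with the share of steps that raised no alert. The record layout (`step`, `timestamp`) and the helper name `compute_ttd` are assumptions made for this example, not VANGUARD's actual schema.

```python
from datetime import datetime

def compute_ttd(attacks, alerts):
    """Return (TTD in seconds per step, fraction of steps with no alert)."""
    # Keep only the earliest alert seen for each attack step.
    first_alert = {}
    for alert in alerts:
        ts = datetime.fromisoformat(alert["timestamp"])
        step = alert["step"]
        if step not in first_alert or ts < first_alert[step]:
            first_alert[step] = ts

    ttd = {}
    undetected = 0
    for attack in attacks:
        t_attack = datetime.fromisoformat(attack["timestamp"])
        t_detect = first_alert.get(attack["step"])
        # An alert that predates the attack cannot have detected it.
        if t_detect is None or t_detect < t_attack:
            ttd[attack["step"]] = None
            undetected += 1
        else:
            ttd[attack["step"]] = (t_detect - t_attack).total_seconds()

    undetected_ratio = undetected / len(attacks) if attacks else 0.0
    return ttd, undetected_ratio

# Hypothetical telemetry: two attack steps, only one of which was alerted on.
attacks = [
    {"step": "recon", "timestamp": "2026-01-01T12:00:00"},
    {"step": "exploit", "timestamp": "2026-01-01T12:00:30"},
]
alerts = [{"step": "recon", "timestamp": "2026-01-01T12:00:12"}]

ttd, undetected_ratio = compute_ttd(attacks, alerts)
print(ttd, undetected_ratio)
```

Steps with a TTD of `None` are the natural candidates for new defensive rules.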
VANGUARD operates a tripartite closed-loop interaction mapped over an asynchronous Server-Sent Events (SSE) stream, providing total cryptographic transparency into the AI's "brain."
Figure 2: End-to-end autonomous pipeline - from LLM cognition loop through sandboxed execution, target exploitation, and dynamic SOC rule synthesis.
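The sandboxed execution stage can be pictured as a regex gate in front of the shell. The sketch below is a minimal version of that idea; the patterns are hypothetical stand-ins, since the real contents of FATAL_OS_BLOCKLIST are not listed in this README, and `is_blocked` is a name invented for the example.

```python
import re

# Hypothetical patterns for illustration only; not the project's actual list.
FATAL_OS_BLOCKLIST = [
    r"\brm\s+(-\w+\s+)*/(\s|$)",        # rm -rf / (root itself, not subpaths)
    r"\bmkfs(\.\w+)?\b",                # formatting a filesystem
    r"\bdd\s+.*\bof=/dev/",             # raw writes to a block device
    r":\(\)\s*\{.*\};\s*:",             # classic shell fork bomb
    r"\b(shutdown|reboot|halt)\b",      # taking the host down
]

_COMPILED = [re.compile(pattern) for pattern in FATAL_OS_BLOCKLIST]

def is_blocked(command: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(pattern.search(command) for pattern in _COMPILED)

for cmd in ["apt-get install -y nmap", "rm -rf /tmp/build", "rm -rf /"]:
    print(f"{cmd!r}: {'BLOCKED' if is_blocked(cmd) else 'allowed'}")
```

Routine dependency installs pass through, while the command that would wipe the root filesystem is refused before it reaches the shell. Pattern matching of this kind can be sidestepped with quoting or indirection, so it works best as one layer among several.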
Conventional LLMs operate opaquely. VANGUARD utilizes an asynchronous web-stream (SSE) allowing human operators to cryptographically observe the agent's real-time state transitions (Cognitive Reason → Tool Executed → Observation) through a custom, Palantir-inspired UI.
VanGuard.mp4
Figure 3: Full VANGUARD application walkthrough - SSE live-stream of agent state transitions across the Palantir-inspired UI.
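To make the SSE stream concrete, here is a small sketch of how agent state transitions could be serialized as Server-Sent Events frames. The event names (`reason`, `tool`, `observation`) and the payload fields are assumptions for illustration; the actual wire format used by VANGUARD is not documented here.

```python
import json

def sse_frame(event: str, payload: dict) -> str:
    """Serialize one SSE frame: an event field, a data field, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"

def agent_stream(steps):
    """Yield one SSE frame per agent state transition, numbered in order."""
    for seq, (state, detail) in enumerate(steps, start=1):
        yield sse_frame(state, {"seq": seq, "detail": detail})

# Hypothetical transitions for a single reasoning cycle.
steps = [
    ("reason", "Target exposes /api/files; try an IDOR probe"),
    ("tool", "http_get /api/files/2"),
    ("observation", "200 OK, foreign document returned"),
]

frames = list(agent_stream(steps))
print("".join(frames), end="")
```

In a FastAPI backend, a generator like `agent_stream` could be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`, and the browser would consume it with an `EventSource`.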
The framework does not merely highlight vulnerabilities; it acts as an autonomous DefSecOps engineer. Following a successful simulated breach, the LLM systematically structures its un-logged attack vectors into Elasticsearch KQL Heuristics and autonomously deploys them to the SIEM (vanguard-rules). Defensive parity natively scales with offensive automation.
Figure 4: Autonomous DefSecOps loop - LLM-structured attack vectors synthesized into Elasticsearch KQL heuristics and autonomously deployed to the vanguard-rules SIEM index.
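The rule-synthesis step could look roughly like the sketch below, which turns the observable fields of an undetected attack step into a KQL query string and wraps it in a rule document. The document fields (`name`, `language`, `query`, `severity`) and the helper `build_rule` are assumptions for this example; only the vanguard-rules index name comes from the text above.

```python
import json

def kql_escape(value: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted KQL value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')

def build_rule(name: str, conditions: dict, severity: str = "high") -> dict:
    """Join field/value pairs into one KQL query and wrap it in a rule document."""
    query = " and ".join(
        f'{field}: "{kql_escape(str(value))}"' for field, value in conditions.items()
    )
    return {"name": name, "language": "kuery", "query": query, "severity": severity}

# Hypothetical undetected step: a request that read another user's file.
rule = build_rule(
    "Unlogged IDOR probe on file API",
    {"http.request.method": "GET", "url.path": "/api/files/2"},
)
print(json.dumps(rule, indent=2))
```

A document of this shape could then be indexed into vanguard-rules with an Elasticsearch client and reviewed in the SOC Rules tab.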
VANGUARD ships with a standalone suite of vulnerable enterprise targets to validate Zero-Shot exploitation logic:
- targets/cloud_storage.py: Advanced IDOR, JSON Web Token (JWT) signature stripping via alg: none, and PDF conversion command injection.
- targets/vulnerable_app.py: Generic corporate monolithic APIs leaking LFI and RCE vectors via Base64 serialization.
- targets/legacy_erp.py: Emulation of unpatched, critical internal architecture.
Figure 5: Real-time Attack Chain Visualization - the LLM's exploitation path rendered as a live directed graph across multi-vertical targets.
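For readers unfamiliar with the alg: none weakness exercised by targets/cloud_storage.py, the sketch below shows the defensive side: a check that flags a JWT whose header declares no signing algorithm or whose signature segment is empty. The helper names are invented for this example and are not taken from the VANGUARD codebase.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def is_unsigned_jwt(token: str) -> bool:
    """True if the header declares alg 'none' or the signature segment is empty."""
    parts = token.split(".")
    if len(parts) != 3:
        return False  # not a compact-serialized JWT
    header = json.loads(b64url_decode(parts[0]))
    return str(header.get("alg", "")).lower() == "none" or parts[2] == ""

def make_token(header: dict, payload: dict, signature: str) -> str:
    """Assemble a compact token from already-chosen parts (test helper only)."""
    encode = lambda obj: b64url_encode(json.dumps(obj).encode("utf-8"))
    return f"{encode(header)}.{encode(payload)}.{signature}"

unsigned = make_token({"alg": "none", "typ": "JWT"}, {"sub": "alice"}, "")
signed_like = make_token({"alg": "HS256", "typ": "JWT"}, {"sub": "alice"}, "c2ln")

print(is_unsigned_jwt(unsigned), is_unsigned_jwt(signed_like))
```

A server that accepts the first token without this kind of check trusts whatever claims the client wrote into the payload.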
For Defense Contractors and Academic Research Groups (PhD), VANGUARD serves as the foundational architecture for the next decade of Cyber Warfare capabilities:
- Multi-Agent Wargaming (Swarm Logic): Evolving from a single Purple node to a distributed swarm. "Red" LLM agents coordinating lateral movement across diverse VPC segments, while an entirely separate "Blue" LLM dynamically rewrites YARA/Zeek rules in real-time to intercept them.
- Reinforcement Learning from Human Feedback (RLHF): Training proprietary defense-sector weights by having human elite Red Teamers grade the efficacy and stealth of VANGUARD's generated payloads.
- Air-Gapped Operationalization: VANGUARD is purposely engineered to run entirely off-grid. By leveraging quantized edge models (Qwen 3 8B) and removing any reliance on OpenAI/Anthropic APIs, the framework is suited to deployment within partitioned, high-security enclaves.
- macOS/Linux (Tested on Ubuntu 22.04 & macOS Sonoma)
- Python 3.10+
- Ollama installed locally (required models: qwen3:8b or llama3)
- Docker (for Elasticsearch/Kibana integration)
The repository includes a bootstrap script that stands up the frontend UI, the FastAPI backend, and the initial SQLite databases in one step.
git clone https://github.com/usualdork/VANGUARD.git
cd VANGUARD
chmod +x run_demo.sh
./run_demo.sh
To enable the mathematically verifiable Gap Analysis and TTD visualization, spin up the local SIEM data pipeline:
# Start Elastic Stack in an isolated network
docker-compose up -d
# Push Vanguard Index Data Views directly to Kibana
python setup_kibana_dashboard.py
- Access the VANGUARD Dashboard at http://localhost:8080
- Start an enterprise target in an adjacent terminal: python targets/cloud_storage.py (binds to port 9997)
- Supply the Target URL into the UI Simulation pane and execute INITIALIZE RUN.
- Observe the real-time AI Kill Chain generation and navigate to the SOC Rules tab to review the autonomously patched heuristics.
If you utilize VANGUARD or the FATAL_OS_BLOCKLIST methodology in your defense systems or academic research, please cite our preprint:
@article{tripathy2026vanguard,
title={VANGUARD: A Cognitive Purple Agent Framework for Autonomous Adversarial Simulation and Real-Time SIEM Validation},
author={Tripathy, Manish},
year={2026},
publisher={Zenodo},
doi={10.5281/zenodo.18846075},
url={https://doi.org/10.5281/zenodo.18846075}
}
Distributed under the Apache 2.0 License. See LICENSE for more information.
WARNING: This framework utilizes live exploitation methodologies. Do not point VANGUARD at domains or IP addresses you do not explicitly own or have authorization to audit. The authors assume no liability for misuse.
