
Project Settings Reference

Every project in RedAmon has 245+ configurable parameters that control the behavior of each reconnaissance module, the AI agent, and CypherFix automated remediation. These settings are managed through the project form UI (16 tabs across four groups: Scope, Recon Pipeline, AI Agent, Remediation), stored in PostgreSQL, and fetched by the recon container and agent at runtime.

Project Form Tabs

Defaults: Sensible defaults are loaded automatically from the server when creating a new project. You only need to fill in the required fields (project name and target domain — or target IPs in IP mode) and adjust what you want.

Recon Presets: Instead of configuring the 215+ parameters below individually, you can apply a Recon Preset that sets all recon parameters at once. See Recon Presets for the full list of 21 built-in presets and how to create your own.


Table of Contents


Target Configuration

Parameter Default Description
Start from IP (IP Mode) false Toggle between domain mode and IP/CIDR targeting mode. Locked after project creation. When enabled, hides domain fields and shows IP/CIDR input
Target Domain The root domain to assess (required in domain mode, hidden in IP mode)
Target IPs / CIDRs [] IP addresses and CIDR ranges to scan (IP mode only). Accepts IPv4, IPv6, and CIDR notation up to /24 (256 hosts)
Subdomain List [] Specific subdomain prefixes to scan (empty = discover all). Domain mode only
Verify Domain Ownership false Require DNS TXT record proof before scanning. Domain mode only
Ownership Token (auto) Unique token for TXT record verification
Ownership TXT Prefix _redamon DNS record name prefix
Stealth Mode false Forces passive-only techniques — disables active scanning, brute force, and GVM
Use Tor false Route all recon traffic through the Tor network
Use Bruteforce true Enable Knockpy active subdomain bruteforcing. Domain mode only

Scan Module Toggles

Modules can be individually enabled/disabled with automatic dependency resolution — disabling a parent module automatically disables all children:

domain_discovery (root)
  └── port_scan
       └── http_probe
            ├── resource_enum
            └── vuln_scan
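
For illustration, a minimal Python sketch of the cascading disable shown in the tree above (names are hypothetical; the real module has its own structures):

# Hypothetical parent -> children map mirroring the dependency tree
MODULE_CHILDREN = {
    "domain_discovery": ["port_scan"],
    "port_scan": ["http_probe"],
    "http_probe": ["resource_enum", "vuln_scan"],
}

def disable_module(enabled: set, module: str) -> set:
    """Remove a module and every descendant from the enabled set."""
    stack = [module]
    while stack:
        current = stack.pop()
        enabled.discard(current)
        stack.extend(MODULE_CHILDREN.get(current, []))
    return enabled

# Disabling port_scan also drops http_probe, resource_enum and vuln_scan
enabled = {"domain_discovery", "port_scan", "http_probe", "resource_enum", "vuln_scan"}
print(disable_module(enabled, "port_scan"))  # {'domain_discovery'}
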
Parameter Default Description
Scan Modules all enabled Array of phases to execute
Update Graph DB true Auto-import results into Neo4j
WHOIS Max Retries 3 Retry attempts for WHOIS lookups
DNS Max Retries 3 Retry attempts for DNS resolution

Port Scanner (Masscan)

High-speed SYN port scanner optimized for large networks and IP/CIDR ranges. Runs in parallel with Naabu — results are merged and deduplicated automatically. Incompatible with Tor (raw SYN packets bypass TCP stack). Both scanners are enabled by default.

Graph nodes — consumes: IP, Domain | produces: Port, Service

Parameter Default Description
Enabled true Toggle Masscan on/off
Top Ports 1000 Port selection: 100, 1000, or "full" for all 65535
Custom Ports Manual port range (e.g., 80,443,8080-8090). Overrides Top Ports
Rate 1000 Packets per second. Masscan handles very high rates (10k+)
Banners false Capture service banners (SSH, HTTP, etc.). Increases scan time
Wait 10 Seconds to wait for late responses after scan completes
Retries 1 Retry attempts for unresponsive ports
Exclude Targets Comma-separated IPs/CIDRs to exclude from scanning

Warning: If both Masscan and Naabu are disabled, port scanning is skipped entirely and downstream modules (HTTP probe, vulnerability scanning) will produce no results.

How it works

Masscan is a stateless asynchronous SYN scanner — instead of completing TCP handshakes, it crafts and sends raw SYN packets directly via a custom user-space TCP/IP stack and listens promiscuously for SYN-ACK replies. The custom stack is what makes it fast (the OS kernel TCP stack is the throughput ceiling for normal scanners) and what makes it Tor-incompatible (raw packets ignore the SOCKS proxy entirely).

The module first calls resolve_targets_to_ips(recon_data) to extract every IPv4 from the in-progress recon graph (subdomain A records, raw IPs, anything with a numeric address), filtering out non-routable space. Hostnames are not passed to masscan — it only scans IPs — but an ip_to_hostnames map is preserved so the parser can stamp the original hostname back onto each discovered Port for the graph merge.

build_masscan_command then assembles the CLI: target IPs are written to a file (-iL), the port spec comes from either Top Ports (translated to a literal range) or Custom Ports verbatim, the --rate <pps> flag controls packets-per-second pacing, --wait <seconds> controls how long masscan listens for late SYN-ACKs after the scan ends (default 10 — too low loses slow responders, too high wastes time), and an optional --excludefile is generated from the Exclude Targets list and IP-filter blocklist.

The subprocess is launched via subprocess.Popen with stdout/stderr captured. Masscan writes its own NDJSON to disk (one record per port-open event), which parse_masscan_output then ingests line-by-line. Each record gets matched back to its hostnames via ip_to_hostnames and emitted as a Port + Service node tuple ready for graph MERGE.
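
A minimal sketch of that parsing step (illustrative only; the field handling is an assumption based on masscan's JSON output format):

import json

def parse_masscan_output(path, ip_to_hostnames):
    results = []
    with open(path) as fh:
        for line in fh:
            line = line.strip().rstrip(",")
            if not line or line in ("[", "]"):   # skip JSON-array wrapper lines if present
                continue
            try:
                record = json.loads(line)
            except ValueError:
                continue                          # skip trailer lines that aren't valid JSON
            ip = record.get("ip")
            for port_info in record.get("ports", []):
                results.append({
                    "ip": ip,
                    "port": port_info.get("port"),
                    "protocol": port_info.get("proto", "tcp"),
                    # stamp the original hostname(s) back onto the finding
                    "hostnames": ip_to_hostnames.get(ip, []),
                })
    return results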

Permission requirement: masscan needs CAP_NET_RAW to craft raw packets. The recon image already grants this capability on container start, so no additional setup is needed inside the pipeline. If you ever see Permission denied — masscan requires root or CAP_NET_RAW, the container's capabilities have been stripped — check the docker-compose security_opt section.

When to use Masscan over Naabu: masscan is the right choice for IP/CIDR sweeps (e.g. 192.0.2.0/24, 198.51.100.0/16) where the target set runs to tens of thousands or millions of IPs. Naabu's scan loop is per-host and starts to slow down past tens of thousands of targets. For typical web-app pentest scope (single apex, dozens of subdomains, tens of IPs), Naabu and Masscan finish in seconds either way — running both is essentially free and gives you cross-validation of open-port findings.


Port Scanner (Naabu)

Controls how ports are discovered on target hosts.

Graph nodes — consumes: IP, Domain | produces: Port, Service

Parameter Default Description
Top Ports 1000 Port selection: 100, 1000, or custom
Custom Ports Manual port range (e.g., 80,443,8080-8090)
Scan Type SYN SYN (fast, requires root) or CONNECT (slower, no root needed)
Rate Limit 1000 Packets per second
Threads 25 Parallel scanning threads
Timeout 10000 Per-port timeout in milliseconds
Retries 3 Retry attempts for unresponsive ports
Exclude CDN true Skip CDN-hosted IPs (Cloudflare, Akamai, etc.)
Display CDN true Show CDN info but don't scan deeper
Skip Host Discovery false Skip ping-based host check
Verify Ports false Double-check ports with TCP handshake
Passive Mode false Use Shodan InternetDB instead of active scanning (zero packets)

How it works

Naabu is a Go-based stateless port scanner from ProjectDiscovery, run inside a Docker container (projectdiscovery/naabu:latest). Unlike masscan, it accepts hostnames directly and does its own DNS resolution before scanning, which means it can probe both example.com:443 and 198.51.100.42:443 from the same target file.

The module starts by checking that the Docker daemon is reachable (is_docker_installed, is_docker_running) and pulling the configured image if missing. extract_targets_from_recon then walks the in-progress graph and pulls every IP, Subdomain, and apex Domain into a target file — three sets, written one entry per line.

SYN with auto-fallback to CONNECT: when Scan Type is s (SYN), naabu needs raw-socket privileges. The module runs the scan once with -scan-type s; if the container exits with a permission error or returns no results, it automatically retries with -scan-type c (CONNECT scan via the kernel TCP stack). This fallback handles environments where CAP_NET_RAW was stripped without forcing the user to manually flip the toggle. The actual scan type used is logged so you know which one produced the results.
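
A minimal sketch of the fallback logic (hypothetical wrapper; base_cmd stands in for the assembled docker/naabu invocation):

import subprocess

def run_naabu(base_cmd, scan_type="s"):
    """Run naabu with the requested scan type; fall back to CONNECT if SYN fails."""
    result = subprocess.run(base_cmd + ["-scan-type", scan_type],
                            capture_output=True, text=True)
    output = result.stdout.strip()
    if scan_type == "s" and (result.returncode != 0 or not output):
        # SYN needs raw sockets; retry via the kernel TCP stack (CONNECT scan)
        print("naabu SYN scan failed or returned nothing, retrying with -scan-type c")
        return run_naabu(base_cmd, scan_type="c")
    print(f"naabu results produced with scan type '{scan_type}'")
    return output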

build_naabu_command assembles the full Docker invocation: image name, mount points for input/output files, naabu CLI flags. Notable flags:

  • -tp <N> for Top Ports (100/1000/full) or -p <range> for Custom Ports
  • -rate <N> for packets/sec ceiling
  • -c <N> for concurrent scanning threads
  • -timeout <ms> per-port timeout
  • -retries <N> retry budget per port
  • -Pn when Skip Host Discovery is on (skips the ping/ARP check, recommended for web-host targets that block ICMP)
  • -verify when Verify Ports is on (does a follow-up TCP handshake on every claimed-open port — adds 10-20% scan time but weeds out false positives from stray SYN-ACK reflections)
  • -exclude-cdn and -display-cdn toggle CDN-IP behaviour: by default naabu skips deep scanning of CDN-fronted IPs (since you'd just be scanning Cloudflare's datacenters) but still records that the IP belongs to a CDN

Passive Mode: when on, naabu is bypassed entirely and the module instead queries https://internetdb.shodan.io/{ip} for each IP — Shodan's free, key-less InternetDB endpoint that returns the ports it has cataloged for that IP. Zero packets are sent to the target. Useful for stealth scans, pre-engagement reconnaissance, and any case where you want a port snapshot without burning rate-limit budget on the target's network.
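
A minimal sketch of the passive lookup (the endpoint is real and key-less; error handling here is simplified):

import json
import urllib.request

def internetdb_ports(ip, timeout=10):
    """Return the ports Shodan's InternetDB has on record for an IP. Zero packets to the target."""
    url = f"https://internetdb.shodan.io/{ip}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = json.loads(resp.read())
    except Exception:
        return []                       # unknown IP or network error: no data
    return data.get("ports", [])        # e.g. [22, 80, 443]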

Tor / proxychains: when Tor is enabled, naabu is run with the -proxy <socks5> flag pointing at the bundled Tor SOCKS5 proxy. Note that Tor's bandwidth ceiling makes Naabu over Tor extremely slow — 1000 ports × 50 hosts can take an hour. SYN scans are not supported over Tor (they bypass SOCKS); the module forces CONNECT mode under Tor.

parse_naabu_output ingests the JSON-line output and emits (host, ip, port, service) tuples. The service guess comes from naabu's IANA service-name table baked into the binary — these are advisory and get refined by httpx's tech detection in the next phase.

File ownership fix: naabu runs as root inside the container, so the output files it writes are owned by root on the host. fix_file_ownership chowns them back to the calling user before the parser reads them — without this you'd hit permission errors on subsequent reads.


Nmap Service Detection

Deep service version detection (-sV) and NSE vulnerability scripts (--script vuln) on discovered open ports. Runs after the port-scan merge, only probing ports already confirmed open by Masscan/Naabu. Detected service versions feed into the CVE lookup pipeline for NVD/Vulners enrichment.

Graph nodes -- consumes: IP, Port | produces: Port (enriched), Service (enriched), Technology, Vulnerability, CVE

Parameter Default Description
Enabled true Toggle Nmap service detection on/off
Version Detection (-sV) true Probe open ports for service/version info
NSE Vulnerability Scripts true Run --script vuln for vulnerability detection
Timing Template T3 Nmap timing template: T1 (Sneaky), T2 (Polite), T3 (Normal), T4 (Aggressive), T5 (Insane)
Total Timeout 600 Maximum total scan duration in seconds
Per-Host Timeout 300 Maximum time per target host in seconds
Parallelism 2 Number of IPs to scan concurrently. Higher values speed up scanning but increase network load (1-10)

Stealth mode overrides: timing forced to T2 (Polite), NSE scripts disabled.

How it works

Unlike Naabu/Masscan which discover open ports, nmap is run only against the open-port set already discovered — the module never re-scans the full port range. build_nmap_targets walks the merged port-scan output, builds a per-IP map of {ip: [open_ports]}, and emits one nmap invocation per IP with a comma-separated -p <ports> argument. This is the single biggest reason Nmap is fast in this pipeline despite being slower per-port than Naabu/Masscan: it's only ever asked to probe ~5-50 ports per host, never 65k.
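
A minimal sketch of the grouping step (the record shape is an assumption about the merged port-scan output):

from collections import defaultdict

def build_nmap_targets(port_scan_results):
    """Group confirmed-open ports by IP so each nmap run only probes those ports."""
    targets = defaultdict(set)
    for record in port_scan_results:           # e.g. {"ip": "198.51.100.42", "port": 443}
        targets[record["ip"]].add(int(record["port"]))
    return {ip: sorted(ports) for ip, ports in targets.items()}

# Each IP then gets its own invocation with a comma-separated port list, e.g.:
#   nmap -sV --script vuln -p 22,80,443 -oX nmap_198.51.100.42.xml 198.51.100.42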

build_nmap_command assembles each per-IP command:

  • -sV for version detection — sends carefully-chosen probes to each open port (a bank of fingerprints in nmap's nmap-service-probes file, ~12,000 probe/match patterns) and matches responses against known service signatures. Returns the exact software + version string (Apache 2.4.49, OpenSSH 8.2p1, MySQL 5.7.34) which is the input the CVE Lookup pipeline needs for accurate matching against NVD/Vulners.
  • --script vuln for NSE vulnerability scripts — runs the entire vuln script category (~150 scripts) against open services. These scripts are written in Lua and target specific CVEs (Heartbleed, Shellshock, MS17-010, smb-vuln-*, http-shellshock, ssl-poodle, etc.). Each match emits a structured Vulnerability record with CVE ID, severity, and the script's confidence verdict.
  • -T<N> timing template — controls inter-packet delay, parallelism, retries, and timeout values. T3 (Normal) is the default. T1 (Sneaky) slows to roughly one probe every 15 seconds for stealth. T5 (Insane) saturates the link and is rarely useful (services start dropping packets).
  • --host-timeout <sec> — Per-Host Timeout. Caps the time spent on any single IP to prevent a slow/blackholed host from dragging the whole scan.
  • -oX <path> writes XML output, which is what the parser consumes.

Concurrent scanning: instead of one big nmap invocation against all IPs (which would be a single slow process), the module runs Parallelism separate nmap subprocesses concurrently — one per IP. Each subprocess has its own per-host timeout. This is significantly faster for multi-host scans because nmap's own --max-parallelism is per-process and doesn't leverage multiple cores well.

XML parsing: parse_nmap_xml reads the per-IP XML output via xml.etree.ElementTree. Three things get extracted:

  1. Service version data for each port (<service> elements with name, product, version, extrainfo) — feeds Port and Service node enrichment plus a Technology node when the service is a recognized stack (apache, nginx, openssh, etc.).
  2. NSE vuln output (<script id="..." output="..."> elements where the id starts with one of the known vuln-script prefixes) — parsed into per-finding Vulnerability records with CVE IDs extracted via regex from the script output (CVE-\d{4}-\d{4,7} pattern).
  3. Nmap version string for telemetry (which version of the scanner produced these results — useful when reproducing findings months later).
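
A minimal parsing sketch for the three extractions above (simplified; the real parser covers more fields and host-level scripts):

import re
import xml.etree.ElementTree as ET

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,7}")

def parse_nmap_xml(path):
    root = ET.parse(path).getroot()             # <nmaprun version="...">
    services, vulns = [], []
    for host in root.findall("host"):
        for port in host.findall(".//port"):
            svc = port.find("service")
            if svc is not None:
                services.append({
                    "port": port.get("portid"),
                    "name": svc.get("name"),
                    "product": svc.get("product"),
                    "version": svc.get("version"),
                })
            for script in port.findall("script"):            # NSE vuln output
                output = script.get("output", "")
                for cve in CVE_RE.findall(output):
                    vulns.append({"script": script.get("id"), "cve": cve})
    return {"nmap_version": root.get("version"), "services": services, "vulns": vulns}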

Stealth mode behavior: when stealth is enabled at the project level, the timing template is forced to T2 (Polite — adds 0.4s delay between probes, drops parallelism to 1) and NSE scripts are disabled (the vuln script category sends loud probes that defeat stealth). Service version detection (-sV) stays on because version probes are necessary for CVE matching and modern WAFs typically don't alert on standard -sV traffic.

Why the pipeline runs Nmap after Naabu/Masscan rather than instead: Naabu/Masscan are 10-100x faster at the port-discovery step but their service detection is shallow (just IANA port-number guesses). Nmap is the gold standard for version detection but slow at port discovery. Splitting the work — fast scanners do the discovery, nmap does the deep probing only on confirmed-open ports — gets the best of both.


HTTP Prober (httpx)

Controls what metadata is extracted from live HTTP services.

Graph nodes — consumes: Domain, IP, Port, Service | produces: BaseURL, Certificate, Technology, Header, Service, Port

How it works

httpx is the bridge between port discovery and web-layer scanning. It takes the open-port set from Naabu/Masscan plus the subdomain set from Domain Discovery and decides which of those endpoints actually speak HTTP — then it harvests every piece of metadata it can in a single request per target.

Target construction: build_targets_from_naabu walks the merged port-scan output and emits one URL per (host, port) pair. The scheme is auto-selected — port 443/8443 and any service whose IANA name contains https/ssl/tls get https://, everything else gets http://. build_targets_from_dns adds a fallback layer: any Subdomain that doesn't have an open port from the scanners gets probed on the standard ports (80, 443) anyway, since some hosts only respond to web traffic and were silent during port scanning.
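
A minimal sketch of the scheme selection (helper names here are illustrative, not the module's actual functions):

HTTPS_PORTS = {443, 8443}

def build_target_url(host, port, service_name=""):
    """Pick http/https per (host, port) pair based on port number and IANA service name."""
    name = (service_name or "").lower()
    if port in HTTPS_PORTS or any(k in name for k in ("https", "ssl", "tls")):
        scheme = "https"
    else:
        scheme = "http"
    return f"{scheme}://{host}:{port}"

def fallback_targets(subdomains_without_ports):
    """Subdomains with no scanner-confirmed port still get probed on the standard web ports."""
    return [f"{scheme}://{sub}" for sub in subdomains_without_ports
            for scheme in ("http", "https")]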

Docker invocation: httpx runs as projectdiscovery/httpx:latest inside Docker. The Docker command is built dynamically from the project's probe toggles — every individual flag (-status-code, -content-length, -title, -server, -tech-detect, -tls-grab, -jarm, -favicon, -asn, -cdn, etc.) maps 1:1 to a probe toggle in the form. Probes you don't need are simply not requested, which makes the per-target round-trip faster.

Probe internals:

  • -tls-grab performs a full TLS handshake and serializes the certificate chain (subject CN, SAN list, issuer, validity range, signature algorithm). The full cert is stored on the BaseURL as a Certificate node — used downstream by Subdomain Takeover detection (cert fingerprint lookups) and Security Checks (expiry-soon detection).
  • -jarm computes the JARM TLS fingerprint — a hash that identifies the server's TLS stack quirks. Useful for clustering hosts (CDN-fronted vs origin-direct, same product family, malware C2 signatures).
  • -favicon computes mmh3 hash of the favicon — fingerprint for finding identical hosts in Shodan/Censys/uncover via favicon hash search (a classic way to find dev/staging copies of a target).
  • -tech-detect runs httpx's built-in lightweight tech detection (header, body, and HTML-element pattern matching).
  • -asn and -cdn annotate the response with autonomous system + CDN provider (extracted from IP allocation databases) — feeds the Exclude CDN logic in port scanning.

Banner grabbing for non-HTTP ports: separately, run_banner_grab takes every port that's not in the HTTP-port allowlist (e.g. SSH, FTP, MySQL, Redis) and opens a raw socket with a configurable timeout, reads up to Max Length bytes, and runs identify_service on the banner — pattern-matching against known service signatures (e.g. the banner SSH-2.0-OpenSSH_8.2p1 is identified as openssh 8.2p1). This fills the gap where naabu/masscan only know IANA port numbers and httpx only handles HTTP.
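
A minimal sketch of that raw-socket grab and signature match (the signature table here is a tiny illustrative subset):

import re
import socket

SIGNATURES = [
    (re.compile(r"SSH-2\.0-OpenSSH[_-]([\w.]+)"), "openssh"),
    (re.compile(r"^220 .*FTP", re.IGNORECASE), "ftp"),
    (re.compile(r"mysql_native_password"), "mysql"),
]

def grab_banner(ip, port, timeout=5, max_length=1024):
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            return sock.recv(max_length).decode("utf-8", errors="replace").strip()
    except OSError:
        return None

def identify_service(banner):
    for pattern, name in SIGNATURES:
        match = pattern.search(banner or "")
        if match:
            version = match.group(1) if match.groups() else None
            return name, version
    return None, None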

Wappalyzer second pass: when Wappalyzer is enabled, every HTML response captured by httpx is run through a second-pass technology fingerprinter (~6,000 fingerprints from the open-source Wappalyzer dataset). This catches stack components httpx's built-in detection misses (specific JS libraries, CMS plugins, analytics frameworks). Results merge into the same Technology nodes — duplicates are MERGE-deduplicated.

Why httpx runs once per host: httpx is built around the principle "extract everything possible in one request." Other tools like wappalyzer-cli or sslyze would each make their own connection — running them against thousands of hosts doubles or triples the round-trip count. httpx's flag-driven probe model means one connection, all metadata.

Connection Settings:

Parameter Default Description
Threads 50 Concurrent HTTP probes
Timeout 15 Request timeout (seconds)
Retries 0 Retry attempts for failed requests
Rate Limit 150 Requests per second
Follow Redirects true Follow HTTP redirects
Max Redirects 10 Maximum redirect chain depth

Probe Toggles (each individually enabled/disabled):

Probe Default Description
Status Code true HTTP response status code
Content Length true Response body size
Content Type true MIME type of response
Title true HTML page title
Server true Server header value
Response Time true Time to first byte
Word Count true Number of words in response
Line Count true Number of lines in response
Tech Detect true Built-in technology fingerprinting
IP true Resolved IP address
CNAME true CNAME DNS records
TLS Info true TLS certificate details
TLS Grab true Full TLS handshake data
Favicon false Favicon hash (for fingerprinting)
JARM false JARM TLS fingerprint
ASN true Autonomous System Number
CDN true CDN provider detection
Response Hash Hash algorithm for response body
Include Response false Include full response body
Include Response Headers false Include all response headers

Filtering:

Parameter Default Description
Paths [] Additional paths to probe on each host
Custom Headers [] Extra headers to send with requests
Match Codes [] Only keep responses with these status codes
Filter Codes [] Exclude responses with these status codes

Technology Detection (Wappalyzer)

Second-pass technology fingerprinting engine with 6,000+ fingerprints.

Parameter Default Description
Enabled true Master toggle for Wappalyzer
Min Confidence 50 Minimum detection confidence (0-100%)
Require HTML false Only fingerprint responses with HTML content
Auto Update true Update fingerprint database from npm
NPM Version 6.10.56 Wappalyzer npm package version
Cache TTL (hours) 24 How long to cache fingerprint data

Banner Grabbing

Raw socket banner extraction for non-HTTP services.

Parameter Default Description
Enabled true Master toggle for banner grabbing
Timeout 5 Connection timeout (seconds)
Threads 10 Concurrent banner grab connections
Max Length 1024 Maximum banner size (bytes)

Web Crawler (Katana)

Active web crawling for endpoint and parameter discovery.

Graph nodes — consumes: BaseURL | produces: Endpoint, Parameter, BaseURL

How it works

Katana is the primary discovery engine in the resource-enumeration phase. It runs as projectdiscovery/katana:latest inside a Docker container, takes every BaseURL produced by httpx as input, and crawls each one to extract URLs, query parameters, form action targets, and JavaScript-embedded endpoints.

Two crawl modes:

  • Standard (HTML parser) — parses HTML statically with goquery, follows <a href>, <form action>, <script src>, <link href>, and a few less-common element references. Fast, lightweight, no browser overhead. Misses anything that requires JavaScript execution to render.
  • Headless mode (-jc) — spins up a headless Chromium inside the container, loads each page, waits for JS execution, and crawls the rendered DOM. Resolves SPA routes (React Router, Vue Router, Next.js dynamic routes), client-side fetch()/axios calls, and dynamically-injected forms. Significantly slower (5-10x) and uses more memory, but catches modern apps where the standard parser would just see a <div id="root"></div> shell.

Scope control: the --scope flag is set to the comma-separated list of allowed apex domains (the project apex plus any explicitly added scope domains). This stops katana from wandering off into third-party CDN URLs and Google Analytics endpoints. Anything outside scope still gets recorded — but as ExternalDomain nodes for situational awareness, never crawled deeper.

Form parsing: when katana hits a <form> element, the parse_forms_from_html helper extracts the action URL, method, and every named input field. Each input becomes a Parameter node attached to the form's endpoint. This is one of the highest-signal sources of parameter data in the whole pipeline because form fields are guaranteed to be backend-recognized (vs Arjun's brute-forced names which might or might not be read).
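
A minimal sketch of the idea using the standard-library HTML parser (the real helper's internals may differ):

from html.parser import HTMLParser

class FormParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.forms, self._current = [], None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action", ""),
                             "method": attrs.get("method", "GET").upper(),
                             "params": []}
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:
                self._current["params"].append(name)   # each named field -> Parameter node

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None

parser = FormParser()
parser.feed('<form action="/login" method="post"><input name="user"><input name="pass"></form>')
print(parser.forms)  # [{'action': '/login', 'method': 'POST', 'params': ['user', 'pass']}]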

Custom headers: when set, headers are passed to katana via -H flags — used for authenticated crawls (Cookie/Bearer/Basic), specific User-Agent overrides, or X-Forwarded-For-style header injection for testing trusted-header bugs.

Output streaming: katana writes JSON-line output to stdout. The module reads it line-by-line so progress is visible in the recon log even on long-running crawls. Each line is parsed into a (url, method, source, fields) tuple and merged into the graph.

Why headless mode is off by default: a single React/Vue app load can take 2-5 seconds (DOM parse + JS execution + idle wait). Multiply that by a few hundred BaseURLs and you've added 30+ minutes to the scan. For most targets the standard parser catches >80% of the URLs anyway because most modern frameworks ship a static manifest of routes that the HTML parser can find. Turn headless on when you know the target is a JS-heavy SPA where the static parser is finding suspiciously few endpoints.

Parameter Default Description
Enable Katana true Master toggle for active web crawling
Crawl Depth 2 How many links deep to follow (1-10). Each level adds ~50% time
Max URLs 300 Maximum URLs to collect per domain. 300: ~1-2 min/domain, 1000+: scales linearly
Rate Limit 50 Requests per second
Timeout 3600 Overall crawl timeout in seconds (default: 60 minutes)
JavaScript Crawling false Parse JS files with headless browser (+50-100% time)
Parameters Only false Only keep URLs with query parameters for DAST fuzzing
Exclude Patterns [100+ patterns] URL patterns to skip — static assets, images, CDN URLs
Custom Headers [] Browser-like headers to avoid detection
Parallelism 5 Number of target URLs to crawl simultaneously via -p flag (1-50)
Concurrency 10 Concurrent HTTP fetchers per target URL via -c flag (1-50)

Passive URL Discovery (GAU)

Passive URL discovery from web archives and threat intelligence sources.

Graph nodes — consumes: Domain, Subdomain | produces: Endpoint, Parameter, BaseURL

How it works

GAU (GetAllUrls) queries four archive sources in parallel for every URL ever recorded against the apex + each Subdomain. The four sources have different blind spots, so combining them produces broader coverage than any one alone:

  • wayback (Internet Archive Wayback Machine) — strength: historical depth, URLs that existed in 2010 are still here; blind spot: recent URLs (<24h) often not yet crawled
  • commoncrawl (Common Crawl public dataset) — strength: web-scale breadth; blind spot: targeted URLs may be missed if not crawled
  • otx (AlienVault OTX URL list) — strength: threat-intelligence-flagged URLs; blind spot: limited to URLs in security feeds
  • urlscan (URLScan.io scan history) — strength: URLs that were submitted for scanning (often by researchers/operators); note: auto-disabled when the URLScan module already ran (avoids double-counting)

run_gau_for_domain runs gau as a Docker container per (provider × domain) combination — all combinations are launched concurrently via a worker pool sized by Workers. Results are streamed back, deduplicated against the in-memory URL set, and capped at Max URLs per domain. The Year Range filter (Wayback only) is applied as a --from / --to flag pair on the gau command — useful when you want to exclude very old URLs whose paths don't exist anymore.

Blacklist extensions filter out static-asset URLs (.png, .jpg, .css, .pdf, .zip, .woff, etc.) that pollute results without being attack surface. The default list covers ~30 extensions; you can extend or shrink it per-project.

URL Verification (optional second pass): when on, every archived URL is re-checked with a HEAD request against the live target. This:

  1. Strips dead URLs — a URL recorded by Wayback in 2018 might 404 today; without verification it would pollute the graph and waste downstream scanner time.
  2. Captures live status code — useful for differentiating live endpoints from those that consistently 401/403 (still attack surface) vs 404 (gone).
  3. Optionally detects HTTP methods — when Detect Methods is on, an additional OPTIONS probe per URL extracts the Allow: header listing supported methods (POST/PUT/DELETE/PATCH). This produces full per-method endpoint coverage without needing to actually send those methods.

The verification pass is rate-limited and threaded separately from the discovery pass (since it's actually touching the target — turning a passive scan into a low-volume active scan).
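
A minimal single-URL sketch of the verification pass (the accept list and defaults are placeholders; the real pass is threaded and rate-limited):

import urllib.error
import urllib.request

ACCEPT_CODES = {200, 201, 301, 302, 401, 403}

def verify_url(url, timeout=5, detect_methods=False):
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code                      # 4xx/5xx still proves the URL resolves
    except Exception:
        return None                            # dead: DNS failure, timeout, refused
    if status not in ACCEPT_CODES:
        return None
    methods = []
    if detect_methods:                         # optional OPTIONS probe for the Allow header
        opt = urllib.request.Request(url, method="OPTIONS")
        try:
            with urllib.request.urlopen(opt, timeout=timeout) as resp:
                methods = [m.strip() for m in resp.headers.get("Allow", "").split(",") if m.strip()]
        except Exception:
            pass
    return {"url": url, "status": status, "methods": methods}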

merge_gau_into_by_base_url maps each discovered URL back to its Subdomain/host, adds it to the in-progress by_base_url map (the canonical structure that downstream modules read), and emits Endpoint + Parameter + BaseURL nodes for the graph. Duplicates against existing Katana/Hakrawler/JsLuice findings are folded silently.

Why GAU complements Katana rather than replacing it: Katana finds URLs the target currently links to; GAU finds URLs the target ever exposed. Many production apps have abandoned endpoints (/api/v1/legacy/..., /admin/old-panel/, debug routes from previous deployments) that aren't in any sitemap or HTML response today but still respond when hit directly. These are some of the highest-yield findings in a typical engagement, and only GAU surfaces them.

Parameter Default Description
Enable GAU false Master toggle for passive URL discovery
Providers wayback, commoncrawl, otx, urlscan Data sources for archived URLs
Max URLs 1000 Maximum URLs per domain (0 = unlimited)
Timeout 60 Request timeout per provider (seconds)
Threads 5 Parallel fetch threads (1-20)
Year Range [] Filter Wayback by year (e.g., "2020, 2024"). Empty = all
Verbose Output false Detailed logging
Blacklist Extensions [png, jpg, css, pdf, zip, ...] File extensions to exclude
Workers 10 Parallel domain query workers (replaces hardcoded limit of 5) (1-20)

URL Verification (when enabled, GAU confirms URLs are still live):

Parameter Default Description
Verify URLs false HTTP check on archived URLs
Verify Timeout 5 Seconds per URL check
Verify Rate Limit 100 Verification requests per second
Verify Threads 50 Concurrent verification threads (1-100)
Accept Status Codes [200, 201, 301, ...] Status codes indicating a live URL
Filter Dead Endpoints true Exclude 404/500/timeout URLs

HTTP Method Detection (when verification is enabled):

Parameter Default Description
Detect Methods false Send OPTIONS to discover allowed methods
Method Detect Timeout 5 Seconds per OPTIONS request
Method Detect Rate Limit 50 Requests per second
Method Detect Threads 25 Concurrent threads

ParamSpider Passive Parameter Discovery

ParamSpider discovers URL parameters from the Wayback Machine archives. It queries web.archive.org for historical URLs containing query parameters, providing passive parameter discovery without sending any requests to the target. Disabled by default.

Graph nodes - consumes: Domain, Subdomain | produces: Endpoint, Parameter

How it works

ParamSpider queries the Wayback Machine's CDX index API (http://web.archive.org/cdx/search/cdx?url=*.<domain>/*&output=text&fl=original&collapse=urlkey) for every snapshot URL ever recorded for the apex domain plus each discovered Subdomain. The CDX API returns one URL per line — fast and structured.

The crucial filter is what comes next: every returned URL is parsed and only URLs containing a query string (?key=value) are kept. Everything else (static pages, asset URLs, parameterless API routes) is dropped. Where GAU returns all historical URLs, ParamSpider returns only the parameterized ones — the surface area you actually need for SQLi/XSS/SSRF/IDOR fuzzing.

For each surviving URL, ParamSpider replaces every parameter value with the string FUZZ (or whatever placeholder is configured). So https://example.com/api/users?id=42&debug=true becomes https://example.com/api/users?id=FUZZ&debug=FUZZ. This makes the output drop-in compatible with downstream fuzzers — Nuclei DAST, FFuf, sqlmap, ffuf wordlists, etc. all consume FUZZ-templated URLs natively.
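
A minimal single-domain sketch of the CDX query and FUZZ templating (the real module fans this out across Workers):

import urllib.parse
import urllib.request

def wayback_param_urls(domain, placeholder="FUZZ", timeout=120):
    cdx = ("http://web.archive.org/cdx/search/cdx"
           f"?url=*.{domain}/*&output=text&fl=original&collapse=urlkey")
    with urllib.request.urlopen(cdx, timeout=timeout) as resp:
        urls = resp.read().decode("utf-8", errors="replace").splitlines()

    templated = set()
    for url in urls:
        parsed = urllib.parse.urlparse(url.strip())
        if not parsed.query:
            continue                           # keep only parameterized URLs
        params = urllib.parse.parse_qsl(parsed.query, keep_blank_values=True)
        fuzzed = "&".join(f"{name}={placeholder}" for name, _ in params)
        templated.add(parsed._replace(query=fuzzed).geturl())
    return sorted(templated)

# https://example.com/api/users?id=42&debug=true -> .../api/users?id=FUZZ&debug=FUZZ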

The Workers setting controls how many domains are queried concurrently. Each worker is a separate request to web.archive.org's CDX API — increasing this past ~10 hits the archive's per-IP rate limits, which return slow/empty responses rather than HTTP 429.

Why this is so effective in practice: developers tend to rotate parameters less than they rotate URLs. A debug parameter (?debug=1, ?test=true, ?admin=1) added once during development frequently survives across years of refactoring even when the URL it was on gets renamed. ParamSpider catches these "ghost params" by mining the historical record.

Domain mode only: ParamSpider is skipped in IP mode because the Wayback Machine indexes by hostname, not by IP. There's no useful CDX query for 198.51.100.42.

Parameter Default Description
Enable ParamSpider false Master toggle for passive parameter discovery
Placeholder FUZZ Placeholder string injected into parameter values for downstream fuzzing
Timeout 120 Overall timeout in seconds
Workers 5 Parallel domain workers for Wayback Machine queries (1-10)

API Discovery (Kiterunner)

API endpoint brute-forcing using real-world Swagger/OpenAPI wordlists.

Graph nodes — consumes: BaseURL | produces: Endpoint, BaseURL

How it works

Kiterunner is Assetnote's purpose-built API route brute-forcer, run as a native Go binary (auto-downloaded on first use from the Assetnote GitHub releases). The crucial difference from generic content discovery (FFuf) is the wordlist source: kiterunner ships with .kite-format wordlists derived from tens of thousands of real-world Swagger/OpenAPI specifications scraped from public APIs. So instead of testing /admin, /login, /test from common.txt, it tests routes humans actually deploy: /api/v1/users/{id}/profile, /api/v2/orders/search, /v1/auth/refresh-token, /api/admin/users/{userId}/permissions.

Binary + wordlist provisioning (ensure_kiterunner_binary): on first scan, the helper detects the host architecture (linux-amd64 / linux-arm64 / macos-amd64 / macos-arm64), downloads the corresponding kiterunner release tarball from the Assetnote GitHub release, and unpacks the binary into tools/kiterunner/. The chosen wordlist (routes-large.kite ~140k routes, or routes-small.kite ~20k routes) is downloaded from https://wordlists-cdn.assetnote.io/data/kiterunner/... and cached on disk. Subsequent scans reuse the cached binary and wordlist.

Method detection layer: where most fuzzers only test GET, kiterunner enumerates all HTTP methods on each found route. Two modes:

  • bruteforce (default) — for each route returned with a 2xx/3xx/4xx, sends additional POST, PUT, DELETE, PATCH requests and records which methods are accepted. Slower but reliable.
  • options — sends a single OPTIONS request per route and parses the Allow: response header. Faster but unreliable on misconfigured servers that don't honor OPTIONS.

This matters because write-side methods (POST/PUT/DELETE) are the highest-yield attack surface for IDOR, mass-assignment, and unauthenticated mutation bugs — and most discovery tools never test them.
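
For illustration, a standalone sketch of the bruteforce-mode idea (not kiterunner's internals; the status codes treated as "no handler" are an assumption):

import urllib.error
import urllib.request

def detect_methods(url, methods=("POST", "PUT", "DELETE", "PATCH"), timeout=5):
    """Re-request a found route with write-side methods and record which are accepted."""
    accepted = []
    for method in methods:
        req = urllib.request.Request(url, method=method)
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code
        except Exception:
            continue
        if status not in (404, 405, 501):      # anything else means a handler exists
            accepted.append((method, status))
    return accepted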

Status code filtering (Match Status Codes / Ignore Status Codes): the default match list keeps [200, 201, 202, 204, 301, 302, 303, 307, 308, 401, 403]. Note that 401 and 403 are kept by default — they indicate the route exists but requires auth. From an attacker's perspective those are still attack surface (potential privesc, auth bypass, or auth-header injection targets), so the default policy is "find them all, filter at review time."

Authenticated scans: when Custom Headers are set (Authorization: Bearer <token>, Cookie: session=..., etc.), they're passed to kiterunner via -H flags and applied to every request. This is essential when scanning APIs behind auth — the routes a logged-in user sees are different from the public ones.

Concurrency: Connections controls per-target concurrent connections, Threads controls how many targets are scanned in parallel. The two are multiplicative — 100 connections × 50 threads means up to 5000 simultaneous requests at peak, throttled overall by the Rate Limit (req/sec) setting.

Why this is run after Katana/Hakrawler rather than instead: crawlers find routes the app links to. Kiterunner finds routes the app implements but doesn't expose. The two are complementary — each catches a different blind spot.

Parameter Default Description
Enable Kiterunner true Master toggle for API brute-forcing
Wordlist routes-large routes-large (~100k, 10-30 min) or routes-small (~20k, 5-10 min)
Rate Limit 100 Requests per second
Connections 100 Concurrent connections per target
Timeout 10 Per-request timeout (seconds)
Scan Timeout 1000 Overall scan timeout (seconds)
Threads 50 Parallel scanning threads
Min Content Length 0 Ignore responses smaller than this (bytes)
Parallelism 2 Number of wordlists to process in parallel (1-5)

Status Code Filters:

Parameter Default Description
Ignore Status Codes [] Blacklist: filter out noise (e.g., 404, 500)
Match Status Codes [200, 201, ...] Whitelist: only keep these codes. Includes 401/403
Custom Headers [] For authenticated API scanning

Method Detection:

Parameter Default Description
Detect Methods true Find POST/PUT/DELETE methods beyond GET
Detection Mode bruteforce bruteforce (slower, more accurate) or options (faster)
Bruteforce Methods POST, PUT, DELETE, PATCH Methods to try in bruteforce mode
Method Detect Timeout 5 Seconds per request
Method Detect Rate Limit 50 Requests per second
Method Detect Threads 25 Concurrent threads

Web Crawler (Hakrawler)

Hakrawler is a DOM-aware web crawler that runs as a Docker container alongside Katana. It provides an additional crawling perspective with scope-aware link following.

Graph nodes — consumes: BaseURL | produces: Endpoint, Parameter, BaseURL

How it works

Hakrawler (by hakluke) is a lightweight Go-based crawler run as jauderho/hakrawler:latest inside Docker. Its key differentiator from Katana is implementation simplicity: it's ~500 lines of Go using colly for HTTP and goquery for HTML parsing, no headless browser, no JS execution, just fast HTML link extraction. This means it's faster per page than Katana but less thorough on JS-heavy targets.

Stdin-based invocation: hakrawler reads its target list from stdin instead of a file, so the module pipes BaseURLs directly into the container via echo "..." | docker run -i .... This avoids the file-mount round-trip and makes per-URL invocation cleaner.

Per-URL parallel containers: instead of one big hakrawler invocation against all BaseURLs, the module spins up Parallelism separate Docker containers — each crawls one BaseURL — and runs them concurrently. This is faster than a single in-process hakrawler crawl because it lets Docker's kernel-level parallelism scale across CPU cores, and each container has its own goroutine pool. The trade-off is container-startup overhead per URL (small, ~200ms each).
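
A minimal sketch of the fan-out (flag set abbreviated to the depth, TLS, and subdomain options; the full option list is in the table below):

import subprocess
from concurrent.futures import ThreadPoolExecutor

def crawl_one(base_url, depth=2, timeout=30):
    """Pipe one BaseURL into a hakrawler container via stdin and collect discovered URLs."""
    cmd = ["docker", "run", "--rm", "-i", "jauderho/hakrawler:latest",
           "-d", str(depth), "-insecure", "-subs"]
    result = subprocess.run(cmd, input=base_url + "\n", capture_output=True,
                            text=True, timeout=timeout + 30)
    return [line for line in result.stdout.splitlines() if line.strip()]

def crawl_all(base_urls, parallelism=4):
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(crawl_one, base_urls))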

Crawl Depth controls link-follow recursion: depth=2 (the default) means hakrawler crawls the start URL, then every URL discovered there, then stops. Higher depths exponentially increase crawl time and rarely find more attack surface than depth 2-3 on typical web apps.

Include Subdomains toggle: when on, hakrawler will follow links to subdomains of the apex (e.g. crawling app.example.com discovers a link to api.example.com and follows it). When off, it stays on the exact host of the start URL. Even when on, the discovered URLs are still scope-filtered before merge — anything outside the project apex set is recorded as ExternalDomain rather than crawled deeper.

Skip TLS Verify: the default is on because targets in scope frequently have self-signed or mis-issued certs (staging hosts, internal CAs). Without this flag, hakrawler would refuse to crawl those targets and you'd silently lose coverage.

Custom Headers: passed via -h flag — used identically to Katana for authenticated crawls.

Auto-disabled in stealth mode: the project-level stealth toggle forces hakrawler off entirely, leaving Katana as the sole active crawler. This halves the request count to the target during stealth scans without significantly cutting coverage.

Why run two crawlers in parallel: Katana and hakrawler use different parsers, different scope-following heuristics, and different priority orderings. On most real-world targets they each find routes the other misses. The merge cost is essentially zero (deduplicated against each other and against Katana's output before graph insert), so unless you're optimizing for stealth, running both is a clear net win.

Parameter Default Description
Enable Hakrawler true Master toggle for Hakrawler crawling
Docker Image jauderho/hakrawler:latest Docker image to use
Crawl Depth 2 How many links deep to follow (1-10)
Threads 5 Concurrent crawling threads
Per-URL Timeout 30 Timeout per URL in seconds
Max URLs 500 Maximum URLs to discover
Include Subdomains true Allow crawler to follow links to subdomains. Results are still scope-filtered
Skip TLS Verify true Skip TLS certificate verification
Custom Headers [] Custom HTTP headers for requests
Parallelism 4 Number of URLs to crawl in parallel Docker containers (1-10)

Stealth mode: Hakrawler is automatically disabled in stealth mode to reduce the active crawling footprint.


JavaScript Analysis (jsluice)

jsluice is a JavaScript analysis tool compiled into the recon container. It downloads JS files discovered by Katana/Hakrawler from the target and analyzes them to extract hidden URLs, API endpoints, and embedded secrets.

Graph nodes — consumes: BaseURL, Endpoint | produces: Endpoint, Parameter, BaseURL, Secret

How it works

jsluice is Bishop Fox's Go-based static-analysis tool, compiled directly into the recon container image (no Docker round-trip per invocation). It uses tree-sitter-javascript to parse JS into an Abstract Syntax Tree, then walks the AST looking for two pattern classes that regex-grep approaches miss:

  1. URL builders — fetch('/api/v1/' + userId), axios.get(`${BASE}/users/${id}`), template-literal patterns, string concatenation patterns. jsluice reconstructs the resulting URL by tracing variable assignments back through the AST. This catches API routes that are dynamically assembled at runtime, which a static regex like https?://[^"']+ would never see.
  2. Secret patterns — AWS access keys, GCP service-account JSON snippets, JWT secrets, generic high-entropy strings, hardcoded API tokens. Each match is annotated with a confidence score based on context (a string starting with AKIA next to aws_secret_access_key is high-confidence, a random 40-char base64 string in isolation is low).

Pipeline: the module first walks the in-progress graph to find every JS-file URL referenced by BaseURLs and Endpoints (anything ending in .js, .mjs, .tsx, or with application/javascript content type). Each URL is downloaded once via urllib.request (no Selenium/headless browser — just the raw script source). The downloads are concurrent up to Concurrency workers and capped at Max Files total.
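
A minimal sketch of that capped, concurrent download step (names are illustrative):

import urllib.request
from concurrent.futures import ThreadPoolExecutor

def download_js(url, timeout=30):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return url, resp.read().decode("utf-8", errors="replace")
    except Exception:
        return url, None

def download_all(js_urls, max_files=50, concurrency=5):
    targets = list(js_urls)[:max_files]              # cap at Max Files
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return {url: body for url, body in pool.map(download_js, targets) if body}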

For each downloaded file, jsluice is run as a subprocess with the file as input. Its stdout is JSON-line: each line is one finding (URL, parameter, or secret). The parser ingests these and merges them into the graph:

  • URLs become Endpoint + BaseURL nodes (if the URL is in-scope) or ExternalDomain nodes (if it's third-party).
  • Parameters become Parameter nodes attached to the discovered Endpoint.
  • Secrets become Secret nodes — these are flagged for the AI agent's review and surface in the Insights dashboard's secrets panel.

No additional crawling: the only network traffic from jsluice is the JS file downloads themselves. All parsing is local to the container. So a jsluice run against 50 JS files is 50 GET requests total — significantly quieter than Katana's depth-3 crawl which can be hundreds of requests per BaseURL.

Why pair jsluice with JS Recon (the deeper module): jsluice is the "fast pass" — it surfaces low-hanging URLs and obvious secrets in seconds. JS Recon (GROUP 5b in the pipeline) does the deeper work: live API validation against 21 services, source-map discovery, dependency-confusion checks, DOM-XSS sink detection, framework version fingerprinting. Jsluice runs in the resource-enumeration phase to catch URLs early so they feed into Nuclei DAST in the same scan; JS Recon runs after and goes deeper.

Parameter Default Description
Enable jsluice true Master toggle for JavaScript analysis
Max Files 50 Maximum number of JS files to analyze
Timeout 120 Overall analysis timeout in seconds
Concurrency 5 Files to process concurrently
Extract URLs true Extract URLs and API endpoints from JS
Extract Secrets true Detect API keys, tokens, and credentials
Parallelism 3 Number of base URLs to analyze in parallel (1-10)

Note: jsluice downloads JS files from the target (HTTP requests) and analyzes them locally. No additional crawling beyond fetching the JS files themselves.


JS Reconnaissance

JS Recon is a deep JavaScript analysis engine that runs after resource enumeration. It downloads discovered JS files and runs parallel analysis modules to extract secrets, endpoints, dependency confusion risks, source map exposures, DOM XSS sinks, and framework fingerprints. Disabled by default.

How it works

JS Recon is the deeper companion to jsluice. Where jsluice does fast pattern matching during resource enumeration, JS Recon waits until all JavaScript files in the graph have been collected (Katana, Hakrawler, jsluice, GAU/Wayback) and then runs seven independent analysis passes in parallel on the unified file set. Each pass is a separate Python module under recon/helpers/js_recon/.

File source aggregation: at run time, JS Recon walks the in-progress graph and collects every JS-flavored URL from BaseURLs and Endpoints. The Include toggles control which classes are pulled:

  • Include Webpack Chunks — .chunk.js, .bundle.js, and [hash].js files that Katana usually deprioritizes (they're hashed and treated as cache-busting noise) but actually contain the bulk of an SPA's logic
  • Include Framework JS — Next.js (/_next/static/chunks/), Nuxt.js (/_nuxt/), Astro (_astro/) framework chunks
  • Include Archived JS — historical JS URLs from GAU/Wayback that may contain secrets that have since been rotated but are still valid in practice

Files are downloaded via urllib.request with Concurrency workers, capped at Max Files total, and time-boxed by the global timeout.

The seven analysis modules (each one its own helper module):

# Module What it does
1 Secret detection (patterns.py, validators.py) Scans every JS file with 90+ regex patterns covering AWS keys, GCP service accounts, GitHub PATs, Slack tokens, Stripe keys, JWT secrets, GitHub Apps, Twilio creds, Mailgun, SendGrid, npm tokens, Heroku, Datadog, etc. Then for any matched secret, runs live API validation against the real provider (https://sts.amazonaws.com/, https://api.github.com/user, etc.) — verified secrets get high confidence, regex-only matches get low. Custom user patterns can be uploaded per-project (additive to defaults)
2 Source-map discovery (sourcemap.py) Three independent strategies: parse //# sourceMappingURL=... comments at the file end; check the response SourceMap: and X-SourceMap: HTTP headers; probe common path patterns ({file}.map, {file.replace('.js', '.js.map')}, /{base}/{filename}.map, etc.). Custom probe paths uploadable per-project. Exposed source maps leak the entire pre-minification source tree
3 Dependency-confusion check (dependency.py) Extracts every import/require reference to scoped packages (@company/internal-utils) and queries https://registry.npmjs.org/<scope>/<name> to check if the public registry has an entry. If the scope is registered to a different owner, an attacker could publish a malicious version with a higher semver and supply-chain the build pipeline
4 Endpoint extraction (endpoints.py) Pattern-matches against extended endpoint signatures: REST routes (/api/v\d+/...), GraphQL endpoints (anything POST-only with graphql in the path), WebSocket URLs (ws://, wss://), router declarations ({ path: '/admin/...' } from React Router/Vue Router/Angular), admin/debug routes (/admin/..., /debug/..., /internal/...). User-uploaded custom keywords extend the default set
5 DOM-XSS sink detection Pattern-matches against 15 dangerous JS sinks: innerHTML, outerHTML, document.write, document.writeln, eval, Function(), setTimeout with string arg, setInterval with string arg, setAttribute('src'/'href', ...), __proto__, Object.assign(window, ...), dynamic import(), etc. The output flags the file + line number for manual review — these sinks aren't always bugs but they're always candidates
6 Framework detection (framework.py) 12 frameworks with version extraction — React, Vue, Angular, Svelte, Next.js, Nuxt.js, Gatsby, Ember, Backbone, Knockout, jQuery, MooTools. Version regexes are tuned per-framework (e.g. React's React.version = '17.0.2' vs Vue's Vue.version = '3.2.45'). Custom framework signatures uploadable as JSON
7 Dev-comment extraction Walks every comment in the JS files matching TODO, FIXME, or HACK markers and keeps those containing sensitive keywords for manual review
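
As an illustration of module 3's registry check, a simplified sketch (here a 404 from the public registry is treated as an unclaimed, squattable name; the owner comparison described above is omitted):

import json
import re
import urllib.error
import urllib.parse
import urllib.request

SCOPED_REF_RE = re.compile(r"['\"](@[\w.-]+/[\w.-]+)['\"]")

def scoped_packages(js_source):
    """Collect @scope/name references from import/require strings in a JS file."""
    return sorted(set(SCOPED_REF_RE.findall(js_source)))

def public_registry_entry(package, timeout=10):
    """Return public npm metadata for a scoped package, or None if no public entry exists."""
    url = "https://registry.npmjs.org/" + urllib.parse.quote(package, safe="@")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None          # nothing published publicly: confusion-claimable name
        raise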

Concurrency model: each of the seven modules runs in its own worker pool concurrently per file, so a 500-file scan with 10 concurrent files runs all seven analyses on each file in parallel. The total wall-clock time is bounded by the slowest module per file (usually the secret-validation pass when it has to make live API calls).

Output: every finding becomes a JsReconFinding node in the graph plus, when applicable, a typed Secret or Endpoint node. The Insights dashboard surfaces them in the JS-Recon panel; the AI agent reads them as context during exploit planning.

Graph nodes -- consumes: BaseURL, Endpoint | produces: JsReconFinding, Secret, Endpoint

Core Settings:

Parameter Default Description
Enable JS Recon false Master toggle for JS Reconnaissance (GROUP 5b)
Max Files 500 Maximum number of JS files to download and analyze
Concurrency 10 Concurrent file download threads
Timeout 900 Overall JS Recon timeout in seconds
Include Framework JS true Include framework-specific chunks (/_next/static/, /_nuxt/)
Include Chunks true Include .chunk.js and .bundle.js files
Include Archived JS true Include JS URLs from GAU/Wayback archive sources

Module Toggles:

Parameter Default Description
Secret Pattern Scanning true Scan JS files against 90+ regex patterns for credentials, tokens, and secrets
Source Map Discovery true Discover exposed .map files via comment parsing, HTTP headers, and path probing
Dependency Confusion Check true Check scoped npm packages against public registry for confusion risks
Endpoint Extraction true Extract REST, GraphQL, WebSocket, router, and admin/debug endpoints
DOM Sink Detection true Detect 15 DOM XSS sink patterns (innerHTML, eval, proto, etc.)
Framework Detection true Identify 12 frameworks with version extraction
Dev Comment Extraction true Extract TODO/FIXME/HACK comments with sensitive keywords

Secret Validation:

Parameter Default Description
Validate Discovered Keys true Test discovered secrets against their service APIs (21 services supported)
Validation Timeout 5 Per-validation request timeout in seconds
Minimum Confidence low Minimum confidence level to keep findings: low, medium, or high

JS File Sources:

Parameter Default Description
Include Webpack Chunks true Analyze .chunk.js and .bundle.js files excluded by Katana
Include Framework JS true Fetch Next.js (/_next/static/chunks/) and Nuxt.js (/_nuxt/) bundles
Include Archived JS true Analyze historical JS files from Wayback Machine/GAU

Custom Extensions (file uploads -- edit mode only):

Parameter Default Format Description
Custom Secret Patterns -- JSON array or TXT (name|regex|severity|confidence per line) Additional regex patterns. JSON schema: [{name, regex, severity?, confidence?}]
Custom Source Map Paths -- TXT (one URL template per line using {url}, {base}, {filename}) Extra paths to probe for .map files
Custom Internal Packages -- TXT (one @scope/name per line) Known internal npm package names to check against public registry
Custom Endpoint Keywords -- TXT (one keyword per line, min 2 chars) Extra keywords to search for in JS content
Custom Framework Signatures -- JSON array [{name, patterns[], version_regex}] Detection signatures for custom frameworks

All custom files have client-side validation before upload. Files are additive and do not replace built-in defaults.

Manual File Upload (edit mode only):

Parameter Default Description
Uploaded JS Files [] JS files for analysis without crawling -- from Burp Suite, mobile APKs, DevTools, or authenticated areas (.js, .mjs, .map, .json, max 10 MB each, multiple files supported)

Note: JS Recon is passive -- it downloads JS files already discovered by crawlers and analyzes them locally. Secret validation sends one minimally-scoped API request per secret with per-service rate limiting (1 req/sec). See JS Reconnaissance for full upload schemas, validation rules, and format examples.


Directory Fuzzer (FFuf)

How it works

FFuf (Fuzz Faster U Fool) is a Go-based directory + content brute-forcer baked into the recon container as a binary. The module runs it once per BaseURL with a generated wordlist + matcher/filter set, then ingests the JSON output for graph merge.

The discovery problem FFuf solves: crawlers (Katana, Hakrawler) can only find URLs the application links to. FFuf finds URLs the application responds to but doesn't link to — unlinked admin panels, .git / .env / .bak backups, dotfile leaks (.DS_Store, .svn/entries), debug consoles, undocumented API routes, common CMS paths (wp-admin/, phpmyadmin/), and version-control artifacts.

Wordlist source: the module ships a default wordlist (curated paths with the highest hit-rate per request) and supports per-project user-uploaded wordlists. Uploaded wordlists are stored under the project ID and mounted into the recon container at scan time, so you can load engagement-specific wordlists (e.g. tech-stack-specific lists for known PHP/Java/Node targets).

Smart Fuzzing toggle: when on, FFuf only fuzzes paths under directories already discovered by Katana/Hakrawler — instead of fuzzing the root /, it fuzzes /api/, /admin/, /static/, etc. that crawlers found. This focuses the request budget where it pays off: a typical web app has 5-10 discovered directories, so a 5,000-word list yields 25k-50k requests spread across real application paths rather than 5k requests against a single root that returns 404 for every word.

Filter and matcher engines: ffuf supports filtering on response status (-fc), size (-fs), word count (-fw), line count (-fl), regex (-fr), and time (-ft). The flip side is matchers (-mc, -ms, -mw, -ml, -mr). The default project policy uses status-code matchers — only keep responses with codes in the configured allowlist (typically 200,201,301,302,401,403). Tweak the size filter when the target returns a custom error page of a fixed size (helps cut false positives).
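
A minimal sketch of how those matcher/filter flags map onto an ffuf invocation (argument assembly only; paths, recursion, and output handling omitted):

def build_ffuf_args(base_url, wordlist, match_codes, filter_codes=None,
                    filter_size=None, rate=0, threads=40):
    args = ["ffuf", "-u", f"{base_url.rstrip('/')}/FUZZ", "-w", wordlist,
            "-mc", ",".join(str(c) for c in match_codes),     # status-code allowlist
            "-t", str(threads), "-of", "json", "-o", "ffuf_output.json"]
    if filter_codes:
        args += ["-fc", ",".join(str(c) for c in filter_codes)]
    if filter_size:
        args += ["-fs", str(filter_size)]      # drop fixed-size custom error pages
    if rate:
        args += ["-rate", str(rate)]           # 0 = unlimited, so only pass when set
    return args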

Recursion: FFuf supports recursive fuzzing of discovered directories. The default depth is conservative (1-2 levels) because depth=3+ exponentially balloons the request count. Increase only when you have specific reason to believe deep nested paths exist.

Custom HTTP method, headers, and POST body: configurable for testing API endpoints that require POST/PUT/DELETE or auth. The headers field is the same place you set Authorization: / Cookie: for authenticated fuzzing.

Per-target rate limiting: the rate-limit setting is enforced via ffuf's built-in -rate flag (req/sec). Lower values keep the scan stealthy and avoid tripping WAFs that throttle at high rates.

FFuf (Fuzz Faster U Fool) brute-forces directory and endpoint paths using wordlists to discover hidden content that crawlers cannot find — admin panels, backup files, configuration pages, and undocumented APIs. Runs after jsluice and before Kiterunner in the pipeline. Disabled by default.

Graph nodes — consumes: BaseURL, Endpoint | produces: Endpoint, BaseURL

Parameter Default Description
Enable FFuf false Master toggle for directory fuzzing
Wordlist common.txt SecLists wordlist: common.txt, raft-medium-directories.txt, or directory-list-2.3-small.txt. Custom uploaded wordlists also appear here
Threads 40 Concurrent fuzzing threads
Rate 0 Requests per second (0 = unlimited). Capped by RoE if active
Timeout 10 Per-request timeout in seconds
Max Time 600 Overall fuzzing timeout in seconds (per target)
Match Codes 200, 201, 204, 301, 302, 307, 308, 401, 403, 405 HTTP status codes to keep
Filter Codes [] HTTP status codes to exclude
Filter Size Response sizes to filter (comma-separated, e.g., 0,4242)
Extensions [] File extensions to append (e.g., .php, .bak, .env)
Recursion false Enable recursive fuzzing into discovered directories
Recursion Depth 2 Maximum recursion depth (1-5)
Auto-Calibrate true Automatically filter false positives
Follow Redirects false Follow HTTP redirects
Custom Headers [] Custom HTTP headers (one per line, Name: Value format)
Smart Fuzz true Fuzz under base paths discovered by crawlers (e.g., /api/v1/FUZZ)
Parallelism 3 Number of targets to fuzz in parallel. Per-target threads are automatically reduced to avoid resource contention (1-10)

Custom Wordlists:

Upload your own .txt wordlists per-project via the FFuf settings UI. Uploaded wordlists appear in the dropdown under "Your custom lists" alongside the built-in SecLists. Maximum file size: 50 MB.

Stealth mode: FFuf is automatically disabled in stealth mode (it is an active brute-force tool).

RoE: When Rules of Engagement are active and FFUF_RATE is 0 (unlimited), it is automatically capped to the RoE max requests per second.


Parameter Discovery (Arjun)

How it works

Arjun (by s0md3v) is a Python-based hidden-parameter discovery tool. The core technique is response-differential analysis: for each endpoint, Arjun sends a baseline request and records the response signature (status code, content length, body hash, response time). Then it injects parameter names from a built-in wordlist (~25,000 common names like debug, admin, access_token, redirect, callback, test, id, uid, userId, email) and watches for any response whose signature differs from the baseline. A difference means the backend read the injected parameter and behaved differently — proof the parameter is recognized even though no link, form, or JS file ever named it.
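
A minimal sketch of the differential idea (one candidate name per request for clarity; Arjun batches names in chunks and also tracks response time):

import hashlib
import urllib.error
import urllib.parse
import urllib.request

def signature(url, timeout=15):
    """Response signature: (status, body length, body hash)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
            return (resp.status, len(body), hashlib.md5(body).hexdigest())
    except urllib.error.HTTPError as err:
        body = err.read()
        return (err.code, len(body), hashlib.md5(body).hexdigest())

def probe_params(endpoint, candidate_names, canary="redamon123"):
    baseline = signature(endpoint)
    hits = []
    for name in candidate_names:
        test_url = endpoint + ("&" if "?" in endpoint else "?") + urllib.parse.urlencode({name: canary})
        if signature(test_url) != baseline:    # the backend read the injected parameter
            hits.append(name)
    return hits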

Why this catches "ghost params" no other tool finds: developers leave parameters wired up long after the UI that used them is gone. A debug toggle (?debug=1) added during a 2-week sprint typically survives across years of deployments because nothing forces it to be removed. These are goldmines for:

  • IDOR — a UI says "you can only see your own profile" but the backend still honors ?userId=42 if you remember to send it
  • Mass assignment — a User model accepts a role field that the frontend never sends, but Arjun finds the backend silently honoring it
  • Debug-mode toggles — ?debug=true returns stack traces, ?test=1 skips auth checks, ?admin=1 reveals hidden UI
  • SSRF — an internal ?callback=http://... param the API uses for webhooks but never advertises
  • Open redirects — ?next=, ?return_to=, ?redirect= honored without validation

Multiple HTTP methods in parallel: Arjun tests GET (query string), POST (form-urlencoded), POST (JSON body), and POST (XML body) in parallel. Each method gets its own injection round because backends often accept the same parameter in multiple places (e.g. read from query string in GET, read from JSON body in POST). The output records which method-location combinations work for each found parameter.

Threads + delay: configurable to balance speed vs WAF triggering. Modern WAFs alert on rapid parameter-name fuzzing more than on path fuzzing, so Arjun's defaults are deliberately more conservative than FFuf's.

Output merging: each found parameter becomes a Parameter node MERGE-deduplicated against those already discovered by Katana (form fields), GAU (archived URLs), Hakrawler (link extraction), and JS Recon (JS endpoint extraction). The combined Parameter set is what Nuclei DAST then fuzzes — having Arjun in the pipeline can increase the parameter count fed to DAST by 2-5x, dramatically improving coverage.

Required input: Arjun runs against existing Endpoints (URLs from Katana/GAU/etc.) plus BaseURLs. If those are empty (no resource enum has run yet), Arjun has no targets and exits without findings.

Arjun discovers hidden HTTP query and body parameters on discovered endpoints by testing ~25,000 common parameter names. It finds debug parameters, admin functionality, and hidden API inputs that aren't visible in HTML forms or JavaScript. Runs after FFuf in the pipeline, testing endpoints already discovered by crawlers and fuzzers. Disabled by default.

Graph nodes — consumes: BaseURL, Endpoint | produces: Parameter

Setting Default Description
Enable Arjun false Master toggle for parameter discovery
HTTP Methods GET Methods to test: GET (query params), POST (form body), JSON (JSON body), XML (XML body). Multiple methods run in parallel.
Max Endpoints 50 Maximum number of discovered endpoints to test. API and dynamic endpoints are prioritized over static ones.
Threads 2 Concurrent parameter testing threads per Arjun process
Request Timeout 15s Per-request timeout
Scan Timeout 600s Overall scan timeout per method
Chunk Size 500 Number of parameters tested per request batch. Lower values increase accuracy but make more requests.
Rate Limit 0 Max requests per second (0 = unlimited)
Stable Mode false Add random delays between requests to avoid WAF detection. Forces threads to 1 internally.
Passive Mode false Use CommonCrawl, OTX, and WaybackMachine only — no active requests to target
Disable Redirects false Do not follow HTTP redirects during parameter testing
Custom Headers [] Custom HTTP headers (e.g., auth tokens) added to every request

Stealth mode: Arjun is automatically switched to passive mode in stealth mode (queries archives only, sends no requests to the target).

RoE: When Rules of Engagement are active and ARJUN_RATE_LIMIT is 0 (unlimited), it is automatically capped to the RoE max requests per second.


GraphQL Security Testing

Dedicated GraphQL security testing module that discovers GraphQL endpoints, tests for exposed introspection, extracts the schema, flags sensitive fields, and (optionally) runs the external graphql-cop Docker container for 12 additional misconfiguration checks (alias overloading, batch query DoS, GraphiQL detection, trace mode, CSRF variants, etc.). Runs in parallel with Nuclei — both scanners read BaseURL/Endpoint/Technology and write Vulnerability nodes, but have zero data dependency on each other. Disabled by default.

How it works

GraphQL is fundamentally different from REST: a single endpoint accepts a query language that lets the client describe the exact shape of data it wants. This makes GraphQL much more expressive but introduces a unique attack surface: introspection (the schema-discovery feature) leaks the entire API surface; alias overloading lets an attacker bundle many queries in one request for DoS amplification; batching lets a single HTTP request exfiltrate hundreds of records; and the per-field resolver model means that authorization bugs are usually field-level and trivially bypassable.

Five-source endpoint discovery: the scanner doesn't blindly probe /graphql — it builds an evidence-weighted candidate list:

Source Signal
User-specified Custom Endpoints field in the project form — explicit operator input, highest priority
HTTP probe Endpoints with application/graphql Content-Type or GraphQL indicators in httpx response bodies
Resource enum Endpoints from Katana/Hakrawler/FFuf/GAU whose path contains graphql/gql/query (POST only — GET on these paths usually means a query string with the GraphQL document) — plus endpoints with query, mutation, variables, or operationName parameters
JS Recon Findings of type graphql or graphql_introspection extracted from JS analysis
Pattern probing Common GraphQL paths appended to every BaseURL (/graphql, /api/graphql, /v1/graphql, /v2/graphql). Secondary paths (/query, /api/query, /gql, /api/gql, /graphiql, /api/graphiql, /playground, /api/playground) only on BaseURLs that already showed GraphQL evidence

The evidence-gating on secondary paths matters because blind-probing every BaseURL with 12 path variants generates a lot of 404s; gating to "only probe further on hosts that look graphql-y" cuts noise sharply.

Three-stage probe per endpoint: for each candidate, the scanner sends a sequence:

  1. Sanity probe — { __typename }. Cheapest possible valid GraphQL query. If this fails entirely (timeout / connection refused / 500), the endpoint is marked unreachable and the deeper tests skip it. If it succeeds, the endpoint is confirmed as a GraphQL server.
  2. Simple introspection — { __schema { types { name } } }. Quick check whether introspection is enabled at all. If the response is errors: [{message: "GraphQL introspection is not allowed"}], the scanner stops here and records the endpoint as introspection-disabled.
  3. Deep introspection — full IntrospectionQuery recursing through every type, field, argument, and directive (with the Introspection Depth Limit controlling how many levels of TypeRef fragment recursion it requests). This returns the full schema, which the scanner then walks to count operations, hash for change-detection, and pattern-match field names against the sensitive-field regex (password|token|secret|key|ssn|credit|cvv|pin|apikey|api_key|...). A minimal sketch of the first two probe stages follows this list.
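
A minimal sketch of the first two probe stages using requests. The function name and return shape are illustrative assumptions; the two query strings are the standard probes named above.

```python
import requests

SANITY = {"query": "{ __typename }"}
SIMPLE_INTROSPECTION = {"query": "{ __schema { types { name } } }"}

def probe_graphql(endpoint, headers=None, timeout=30, verify=True):
    """Stage 1 (sanity) and stage 2 (simple introspection) of the probe sequence."""
    try:
        r = requests.post(endpoint, json=SANITY, headers=headers,
                          timeout=timeout, verify=verify)
        body = r.json()
    except (requests.RequestException, ValueError):
        return {"reachable": False}
    if "data" not in body and "errors" not in body:
        return {"reachable": True, "graphql": False}   # responded, but not a GraphQL server
    try:
        r2 = requests.post(endpoint, json=SIMPLE_INTROSPECTION, headers=headers,
                           timeout=timeout, verify=verify)
        schema = (r2.json().get("data") or {}).get("__schema")
    except (requests.RequestException, ValueError):
        schema = None
    return {"reachable": True, "graphql": True, "introspection_enabled": bool(schema)}
```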

Auth injection: the auth subsystem supports bearer, cookie, basic, header, and apikey modes. Each emits a different header pattern (Authorization: Bearer <value>, Authorization: Basic <base64>, custom name, etc.). Auth values are masked in logs ([masked]). Used by both the native scanner and the graphql-cop sidecar (forwarded via -H JSON header arg).

graphql-cop sidecar (when enabled): runs dolevf/graphql-cop:1.14 in Docker-in-Docker against each discovered endpoint. graphql-cop is the canonical GraphQL misconfig scanner with 12 tests covering alias overloading (DoS), batch-query DoS, directive overloading, circular query introspection, GraphiQL/Playground IDE exposure, GET-method support (CSRF surface), trace mode (Apollo timing leakage), GET-based mutations (full CSRF), POST url-encoded CSRF, field suggestions, unhandled-error stack traces, and a redundant introspection probe (off by default to dedupe with the native test).

The reason for pinning to 1.14: v1.15+ added an -e flag for excluding tests, but the image carrying that flag has not yet been published to DockerHub. Per-test exclusions are therefore applied post-execution in Python by filtering the JSON output — heavier-traffic DoS tests (alias/batch/directive/circular) still hit the target if the master toggle is on, but their findings are filtered out before merge. For genuine stealth, use the master Enable graphql-cop toggle.

Endpoint capability flags: beyond emitting Vulnerability nodes for positive findings, graphql-cop sets boolean flags directly on the Endpoint node — graphql_graphiql_exposed, graphql_tracing_enabled, graphql_get_allowed, graphql_field_suggestions_enabled, graphql_batching_enabled, graphql_cop_ran. These flags are recorded even when the test result is negative (e.g. graphql_graphiql_exposed: false), so the AI agent can reason about confirmed-safe vs unprobed.

RoE filtering: discovered endpoints are filtered against ROE_EXCLUDED_HOSTS (with *.example.com wildcards supported) before testing. Out-of-scope endpoints are skipped and counted in the endpoints_skipped summary field.

Stealth mode overrides: rate limit forced to 2 req/s, concurrency forced to 1 (sequential), per-request timeout extended to 60s, and the four DoS-class graphql-cop tests forced off. The native introspection probe still runs because it's passive (a single read query).

Graph nodes — consumes: BaseURL, Endpoint, Domain, Technology | produces: Endpoint (with GraphQL capability flags), Vulnerability, CVE

Core Settings:

Parameter Default Description
Enable GraphQL Security false Master toggle for the GraphQL security scanner (GROUP 6 Phase A)
Introspection Test true Probe each candidate endpoint for exposed introspection (__schema, __type). When enabled, extracts the full schema, counts queries/mutations/subscriptions, computes a schema hash, and flags sensitive fields (password, token, secret, key, ssn, credit, cvv, etc.)
Request Timeout 30 Per-request timeout in seconds (clamped 1-600). Applies to the initial { __typename } probe, the simple introspection query, and the deep introspection query
Rate Limit 10 Maximum requests per second across all endpoints (clamped 0-100, 0 = unlimited). Enforced globally — delay = 1/rate_limit between submissions
Concurrency 5 Parallel endpoint-testing threads (clamped 1-20, auto-reduced when fewer endpoints than threads). Endpoints are tested via ThreadPoolExecutor; 1 forces sequential mode
Introspection Depth Limit 10 Recursion depth for the TypeRef fragment in the full introspection query (clamped 1-20). Higher values extract more info on deeply-wrapped types (NON_NULL → LIST → NON_NULL → NAMED). Lower values avoid server-side query rejection on limit-aware GraphQL engines
Retry Count 3 HTTP retry attempts on transient failures (clamped 0-10). Targets 429, 500, 502, 503, 504 and connection-level errors
Retry Backoff 2.0 Base backoff factor in seconds between retries (clamped 0-10). Uses exponential backoff via urllib3 Retry(backoff_factor=)
Verify SSL true Verify TLS certificates on all GraphQL probes. Disable to test endpoints with self-signed or untrusted certificates
Custom Endpoints Comma-separated GraphQL endpoint URLs to test explicitly, in addition to auto-discovered ones (e.g. https://api.example.com/graphql,https://app.example.com/v1/query)

Endpoint Discovery:

The module auto-discovers GraphQL endpoints from five sources, deduplicated and sorted:

Source How It Discovers
User-specified Values in the Custom Endpoints setting
HTTP probe Endpoints with application/graphql Content-Type or GraphQL indicators in response
Resource enum Katana/Hakrawler/FFuf/GAU endpoints whose path contains graphql, gql, or query (POST only) — plus endpoints with query, mutation, variables, or operationName parameters
JS Recon Findings of type graphql or graphql_introspection extracted from JavaScript analysis
Pattern probing Appends common GraphQL paths to every discovered base URL: /graphql, /api/graphql, /v1/graphql, /v2/graphql. Secondary patterns (/query, /api/query, /gql, /api/gql, /graphiql, /api/graphiql, /playground, /api/playground) are tested only on base URLs that already show GraphQL evidence elsewhere

Authentication:

When Auth Type is set, the scanner attaches auth headers to every introspection probe and to graphql-cop (via -H JSON headers). Authentication values are masked in logs.

Parameter Default Description
Auth Type One of: bearer, cookie, header, basic, apikey (case-insensitive). Empty = no auth
Auth Value The token, cookie string, raw header value, or username:password pair (basic)
Auth Header Name Custom header name used when Auth Type is header (defaults to X-Auth-Token) or apikey (defaults to X-API-Key)

Auth type behavior:

Type Emitted Header
bearer Authorization: Bearer <value>
cookie Cookie: <value>
basic Authorization: Basic <base64(username:password)>
header <Auth Header Name>: <value>
apikey <Auth Header Name or X-API-Key>: <value>

graphql-cop External Scanner (opt-in Docker-in-Docker):

An optional Phase 2 scanner that wraps dolevf/graphql-cop:1.14 and runs 12 additional misconfiguration checks per endpoint. Automatically skipped when disabled or when all 12 per-test toggles are off. Uses Docker-in-Docker — requires the Docker socket to be mounted to the recon container.

Parameter Default Description
Enable graphql-cop false Master toggle for graphql-cop (opt-in — requires Docker socket access)
Docker Image dolevf/graphql-cop:1.14 Docker image to execute. Pinned to 1.14 because the -e exclusion flag (v1.15+) is not yet on DockerHub — per-test exclusions are applied Python-side
Timeout 120 Seconds per endpoint before the container is killed (subprocess.TimeoutExpired)
Force Scan false Pass the -f flag to scan the endpoint even when graphql-cop does not detect it as GraphQL. Useful when the endpoint returns non-standard errors or custom wrappers
Debug Mode false Pass the -d flag to add X-GraphQL-Cop-Test header to every request for correlation with target logs

Network mode: graphql-cop uses the default Docker bridge network. When Use Tor is enabled at the project level, the container is started with --network host and passed the -T flag to route probes through Tor. When a global HTTP_PROXY is set, it is forwarded via -x.

Heavy-traffic tests: alias_overloading, batch_query, directive_overloading, and circular_query_introspection send DoS-class probes. In stealth mode the four DoS toggles are automatically forced to false. Because graphql-cop 1.14 doesn't honor -e, those probes still hit the target if the master toggle is on — use the master Enable graphql-cop toggle for true stealth.

Per-Test Toggles (12 tests — all run by default except introspection):

Each toggle below enables/disables one graphql-cop test. Exclusions are applied post-execution because the v1.14 image ignores the -e flag.

Parameter Default Severity Description
Field Suggestions true info Detects "Did you mean..." schema leakage that bypasses introspection-disabled defences
Introspection (cop) false high Secondary introspection probe — disabled by default to deduplicate with the native introspection test above
GraphiQL IDE Exposed true medium Detects exposed GraphiQL / GraphQL Playground / Apollo Studio IDE pages
GET Method Support true medium Endpoint accepts queries via HTTP GET (enables cache poisoning + CSRF)
Alias Overloading true low Tests server tolerance of aliased-field DoS. DoS — disabled in stealth mode
Array-based Query Batching true low Tests array-batched query DoS amplification. DoS — disabled in stealth mode
Trace Mode true info Apollo tracing extension exposes query timings (schema and resolver info leak)
Directive Overloading true low Tests server tolerance of repeated directives on a single field. DoS — disabled in stealth mode
Circular Introspection true low Recursive introspection query causing exponential parse cost. DoS — disabled in stealth mode
GET-based Mutation true high Mutation allowed over GET (full CSRF surface)
POST url-encoded CSRF true medium Mutation accepts application/x-www-form-urlencoded (cross-origin CSRF possible)
Unhandled Error Detection true info Endpoint leaks stack traces / internal error paths on malformed queries

Endpoint Capability Flags:

Beyond creating Vulnerability nodes on positive findings, graphql-cop also sets these boolean flags directly on the GraphQL Endpoint node — even when a test returned negative (e.g. "GraphiQL exposed: false" is recorded explicitly):

Flag Set By Meaning
graphql_graphiql_exposed detect_graphiql IDE page served at the endpoint
graphql_tracing_enabled trace_mode Apollo tracing extension returns timing data
graphql_get_allowed get_method_support Endpoint accepts GET queries
graphql_field_suggestions_enabled field_suggestions "Did you mean..." responses enabled
graphql_batching_enabled batch_query Server responds to array-batched requests
graphql_cop_ran Set to true after graphql-cop completes

Output:

Per tested endpoint, results are stored under combined_result.graphql_scan.endpoints[endpoint]:

  • introspection_enabled, schema_extracted — booleans
  • queries_count, mutations_count, subscriptions_count — operation counts
  • schema_hash — 16-char SHA256 prefix for change detection
  • operations.{queries,mutations,subscriptions} — lists of operation names
  • error — last error message if tests failed
  • All graphql-cop endpoint-flag booleans from the table above

Scan-wide summary at combined_result.graphql_scan.summary: endpoints_discovered, endpoints_tested, endpoints_skipped (RoE excluded), introspection_enabled, vulnerabilities_found, by_severity.{critical,high,medium,low,info}.

Rules of Engagement: Discovered endpoints are filtered by ROE_EXCLUDED_HOSTS (supports *.example.com wildcards) before testing. Out-of-scope endpoints are skipped and counted in endpoints_skipped.

Stealth mode overrides: GRAPHQL_RATE_LIMIT=2, GRAPHQL_CONCURRENCY=1 (sequential only), GRAPHQL_TIMEOUT=60, and the four DoS-class graphql-cop tests (alias/batch/directive/circular) forced to false. The native introspection test still runs because it is passive.

Partial recon: GraphQL scanning is available as a Partial Recon tool. The modal accepts custom URLs (validated against project scope) that are injected via GRAPHQL_ENDPOINTS and expanded by the same discovery pipeline. GRAPHQL_SECURITY_ENABLED is force-set to true for partial runs regardless of the project toggle. See Recon Pipeline Workflow — Partial Recon.


Subdomain Takeover Detection

Layered takeover scanner that stacks three independent engines against dangling DNS records and orphaned SaaS targets: Subjack (Apache-2.0 Go binary, DNS-first fingerprints), Nuclei takeover templates (-t http/takeovers/ -t dns/, HTTP-fingerprint coverage), and the BadDNS sidecar (AGPL-3.0, opt-in, Docker-in-Docker isolated image with 10 addressable modules covering CNAME/NS/MX/TXT/SPF/DMARC/wildcard/NSEC/zone-transfer/references). Findings are deduplicated across tools on (hostname, provider, method), scored 0-100, mapped to a confirmed / likely / manual_review verdict, and emitted as Vulnerability nodes with source="takeover_scan". Runs in parallel with Nuclei and GraphQL. Disabled by default. See the dedicated Subdomain Takeover Detection page for the full design and scoring rules.

How it works

A subdomain takeover happens when a DNS record (CNAME, NS, MX, A, etc.) points to an external service that the target no longer owns — e.g. a CNAME to oldproject.github.io whose GitHub Pages site has been deleted, leaving the namespace claimable by anyone. The result: an attacker registers the namespace and serves arbitrary content from a hostname under the target's apex.

This module layers three independent detection engines because no single engine catches everything: Subjack is fast and DNS-first but misses pure-HTTP fingerprints; Nuclei templates catch HTTP-fingerprint cases but require alive URLs; BadDNS catches advanced cases (NSEC walking, zone transfers, reference loops) but is heavy and AGPL-isolated.

Scoring algorithm (additive, clamped 0-100):

Signal Weight
Confirmed by 2+ tools +30
Subjack reports confirmed +25
Provider is in the auto-exploitable list +20
Nuclei takeover template matches +15
Detection method is cname (most reliable) +10
Detection method is stale_a or mx (probabilistic, needs human verification) -15
Provider is unknown / not in the lookup table -10

Verdicts: score >= threshold + 10 becomes confirmed; score >= threshold becomes likely; everything else is manual_review. Default threshold is 60, so confirmed >= 70, likely 60-69, manual review <60.
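
In code, the scoring and verdict logic reduces to a few additive checks. The sketch below follows the documented weights and thresholds; the finding-dict field names are illustrative assumptions.

```python
AUTO_EXPLOITABLE = {"github-pages", "heroku", "aws-s3", "shopify", "fastly", "ghost",
                    "unbounce", "readthedocs", "surge", "webflow", "tumblr", "statuspage"}

def score_finding(f, threshold=60):
    """Additive confidence score and verdict, following the documented weights."""
    score = 0
    if f.get("tool_count", 0) >= 2:
        score += 30                        # confirmed by 2+ tools
    if f.get("subjack_confirmed"):
        score += 25
    if f.get("provider") in AUTO_EXPLOITABLE:
        score += 20
    if f.get("nuclei_match"):
        score += 15
    if f.get("method") == "cname":
        score += 10
    if f.get("method") in ("stale_a", "mx"):
        score -= 15                        # probabilistic, needs human verification
    if not f.get("provider"):
        score -= 10                        # provider unknown / not in lookup table
    score = max(0, min(100, score))        # clamp 0-100
    if score >= threshold + 10:
        verdict = "confirmed"
    elif score >= threshold:
        verdict = "likely"
    else:
        verdict = "manual_review"
    return score, verdict
```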

Auto-exploitable providers (single-step claim, +20 confidence bonus): GitHub Pages, Heroku, AWS S3, Shopify, Fastly, Ghost, Unbounce, ReadTheDocs, Surge, Webflow, Tumblr, Statuspage. The full fingerprint table covers ~40 service signals plus ~30 CNAME patterns and is in recon/helpers/takeover_helpers.py::PROVIDER_FROM_SIGNAL.

Subjack engine: native Go binary baked into the recon image. DNS-first — for every Subdomain in the graph, queries CNAME/NS/MX records and matches the targets against the takeover-prone provider list. Fast (~100s of subdomains per minute) and produces low false positives. Optional flags:

  • -ssl / Force HTTPS — probe over HTTPS (default off — most takeovers are detectable on HTTP)
  • -a / Test Every URL — probe every subdomain, not just CNAME-bearing ones (slower, more thorough)
  • -ns / Check NS Takeovers — detects expired NS delegations and dangling cloud-DNS zones (e.g. abandoned Route53 hosted zones)
  • -ar / Check Stale A Records — flags A records pointing to dead cloud IPs (probabilistic — high false-positive rate, requires human verification, scoring penalty applied automatically)
  • -mail / Check SPF/MX Takeovers — audits SPF includes and MX records for dead infrastructure references

Nuclei takeover templates engine: invokes the same nuclei binary as the main vuln scan but with two specific template directories: http/takeovers/ (HTTP-response-body fingerprints for ~50 SaaS providers — "There isn't a GitHub Pages site here" page, Heroku's "No such app" page, etc.) and dns/ (DNS-response-pattern templates). Targets are the alive URLs from httpx, so this layer only fires on hosts that respond to HTTP. Severity filter defaults to critical, high, medium. Has its own rate limit (default 50 req/s) independent of the main vuln scan. Interactsh is always off here since takeover templates don't need OOB.

BadDNS sidecar (opt-in): the 10-module BadDNS toolkit runs in an isolated Docker container (built once via docker compose --profile tools build baddns-scanner). Each module checks a different DNS record class:

Module Catches
cname Standard CNAME-to-dead-provider takeovers
ns Dangling NS delegations (whole-zone takeover potential)
mx MX records pointing to dead mail providers
txt Dangling references in TXT records (verification tokens for services no longer used)
spf SPF includes pointing to dead infrastructure (email spoofing surface)
dmarc DMARC misconfig + reporting-address takeover
wildcard Wildcard-record interaction with takeovers
nsec NSEC zone walking (only opt-in — slow on large zones)
zonetransfer AXFR allowed (full zone disclosure)
references Cross-zone reference loops

Default module set is cname, ns, mx, txt, spf — the others are opt-in because they can be slow or noisy. Custom DNS resolvers can be configured to bypass the system resolvers if needed (useful in environments where the local resolver is rate-limited).

Deduplication and rescan idempotence: findings from all three engines get a deterministic ID hash from (hostname, provider, method) so re-running the scan converges on the same Vulnerability node instead of creating duplicates. When a finding is found by 2+ tools, the +30 confirmation bonus pushes it from likely to confirmed automatically.
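
A minimal sketch of the deterministic ID, assuming a SHA-256 over the hostname|provider|method triple; the exact hash function and prefix length used by the real helper are assumptions.

```python
import hashlib

def takeover_finding_id(hostname, provider, method):
    """Deterministic Vulnerability id so rescans converge on the same node
    (illustrative -- the real helper may hash or format differently)."""
    raw = f"{hostname.lower()}|{(provider or 'unknown').lower()}|{method.lower()}"
    return "takeover_" + hashlib.sha256(raw.encode()).hexdigest()[:16]
```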

Graph nodes — consumes: Domain, Subdomain, DNSRecord, BaseURL (alive URLs) | produces: Vulnerability (source="takeover_scan", type="subdomain_takeover")

Master toggle:

Parameter Default Description
Enable Subdomain Takeover false Master toggle. When off the whole module is skipped and output contains { "skipped_reason": "disabled" }

Subjack (Apache-2.0, native Go binary baked into the recon image):

Parameter Default Description
Enable Subjack true Enable the DNS-first Subjack layer. Requires the master toggle on
Threads 10 Concurrent Subjack workers (-t, clamped 1-100)
Request Timeout 30 Per-request connection timeout in seconds (-timeout)
Force HTTPS true Probe over HTTPS (-ssl). Improves accuracy against HTTPS-only SaaS providers
Test Every URL false Probe every subdomain, not just CNAME-bearing ones (-a). Slower but more thorough
Check NS Takeovers false Detect expired nameserver delegations and dangling cloud DNS zones (-ns)
Check Stale A Records false Flag A records pointing to dead cloud IPs (-ar). Probabilistic, requires human verification
Check SPF/MX Takeovers false Audit SPF includes and MX records for dead infrastructure references (-mail)
Subjack Run Timeout 900 Overall hard cap on the Subjack subprocess in seconds (minimum 60)

Nuclei Takeover Templates (HTTP fingerprint layer):

Parameter Default Description
Enable Nuclei Takeovers true Enable the Nuclei takeover layer. Targets are the alive URLs from httpx
Nuclei Takeover Run Timeout 1800 Overall hard cap on the Nuclei takeover subprocess in seconds
Severity Filter critical, high, medium Severity filter passed to Nuclei. Defaults to the three action-worthy levels
Rate Limit 50 Nuclei req/s rate limit for this layer only (does not affect the main vuln scan)

Shared from the main Nuclei block: NUCLEI_BULK_SIZE, NUCLEI_CONCURRENCY, NUCLEI_TIMEOUT, NUCLEI_RETRIES, NUCLEI_SYSTEM_RESOLVERS, NUCLEI_FOLLOW_REDIRECTS, NUCLEI_MAX_REDIRECTS, NUCLEI_DOCKER_IMAGE. Global NUCLEI_EXCLUDE_TAGS is not inherited here (would drop the takeover tag and neuter the whole layer). Interactsh is always off for this layer since takeover templates do not need OOB interactions.

Scoring & Verdicts:

Parameter Default Description
Confidence Threshold 60 Minimum score for likely; threshold + 10 for confirmed (clamped 0-100)
Auto-publish Manual Review false Promote manual_review findings from severity=info to severity=medium so they appear in the main findings table instead of the review queue

Scoring is additive: +30 (confirmed by 2+ tools), +25 (Subjack confirmed), +20 (provider in auto-exploitable list), +15 (Nuclei template match), +10 (method=cname), -15 (method=stale_a/mx), -10 (provider unknown). Verdicts: >= threshold+10 -> confirmed; >= threshold -> likely; otherwise manual_review.

BadDNS (AGPL-3.0 isolated Docker-in-Docker sidecar, opt-in):

Parameter Default Description
Enable BadDNS false Opt-in AGPL sidecar. Requires docker compose --profile tools build baddns-scanner once before the first run
Docker Image redamon-baddns:latest Sidecar image tag. Override only when testing a non-default build
Modules cname, ns, mx, txt, spf Active module set. Full addressable list: cname, ns, mx, txt, spf, dmarc, wildcard, nsec, references, zonetransfer. nsec and zonetransfer are opt-in because they can be slow on large targets
Nameservers [] Optional custom DNS resolvers. Empty = system resolvers
BadDNS Run Timeout 1800 Overall hard cap on the baddns subprocess in seconds. Orphan containers are reaped via docker kill <container_name> on timeout

Auto-exploitable providers (single-step claim, +20 confidence bonus): github-pages, heroku, aws-s3, shopify, fastly, ghost, unbounce, readthedocs, surge, webflow, tumblr, statuspage. Full fingerprint table lives in recon/helpers/takeover_helpers.py::PROVIDER_FROM_SIGNAL and covers ~40 signals plus ~30 CNAME patterns.

Stealth mode overrides: NUCLEI_TAKEOVERS_ENABLED=false, BADDNS_ENABLED=false, SUBJACK_ALL=false, SUBJACK_CHECK_NS=true, SUBJACK_CHECK_MAIL=true (both DNS-only and safe at low concurrency), SUBJACK_THREADS=3, TAKEOVER_RATE_LIMIT=10. Subjack stays on in DNS-only mode because CNAME/NS/MX resolution does not generate HTTP traffic to the target.

Partial recon: Subdomain Takeover is a Partial Recon tool. The modal accepts custom subdomains (validated against project scope -- entry must equal the apex or end with .<apex>). User-provided dangling subdomains with no A/AAAA are still scanned because they are the prime takeover candidates. SUBDOMAIN_TAKEOVER_ENABLED is force-set to true for partial runs. Rescans converge on the same Vulnerability.id (deterministic hash of hostname|provider|method) instead of duplicating. See Recon Pipeline Workflow -- Partial Recon.


VHost & SNI Enumeration

Discovers hidden virtual hosts on every target IP by sending two crafted curl probes per candidate hostname. The L7 probe overrides the HTTP Host: header to catch classic Apache/Nginx vhosts that route on the application layer. The L4 probe uses curl --resolve to force the TLS SNI value to the candidate hostname, catching modern reverse proxies (k8s ingress, Traefik, NGINX-ingress, Cloudflare, AWS ALB) that route at the TLS handshake before any HTTP is parsed. Each response is compared to a baseline (raw IP request, no Host override) and anomalies are emitted as Vulnerability nodes with source="vhost_sni_enum". When L7 and L4 disagree on the same hostname, the finding is escalated to host_header_bypass (high severity) — a routing inconsistency that can bypass edge controls. Runs in parallel with Nuclei, GraphQL, and Subdomain Takeover. Disabled by default. See the dedicated VHost & SNI Enumeration page for the full design, candidate-source priority, and severity rules.

How it works

The fundamental insight: many web servers serve different content for https://example.com than they do for https://198.51.100.42 even though those resolve to the same IP. The selection mechanism is one of two things — the HTTP Host: header (L7 routing — Apache, nginx, classic vhosts) or the TLS Server Name Indication value sent during the handshake (L4 routing — most modern reverse proxies, k8s ingress, Cloudflare, AWS ALB). This module probes both layers independently to enumerate the full set of hostnames a target IP serves.

Per-candidate, per-IP probe sequence:

  1. Baseline — curl https://<IP>/ with no Host override and no SNI hint. Records status code, body size, response hash. This is what an unhinted attacker sees.
  2. L7 probe — curl -H "Host: <candidate>" https://<IP>/. Sends the configured candidate hostname as the HTTP Host header but keeps the TLS SNI as the IP. If the response differs meaningfully from the baseline, the IP is serving a vhost for that candidate at the application layer.
  3. L4 probe — curl --resolve <candidate>:<port>:<IP> https://<candidate>/. Forces curl to send the candidate as both the SNI value and the Host header, but pins the resolution to the target IP. If the response differs from the baseline, the IP is serving a vhost selected at the TLS layer (the candidate's certificate is presented even though the connection went to the IP). A minimal sketch of this probe trio follows the list.
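
A minimal sketch of the probe trio using curl via subprocess.run (shell=False, matching the safety note later in this section). The -w write-out format captures status code and body size; the real module also hashes the body and handles plain-HTTP ports. Function names here are illustrative.

```python
import subprocess

WRITE_OUT = "%{http_code} %{size_download}"

def curl_probe(args, connect_timeout=3):
    """Run one curl probe (shell=False) and return (status_code, body_size)."""
    cmd = ["curl", "-sk", "-o", "/dev/null", "-w", WRITE_OUT,
           "--connect-timeout", str(connect_timeout),
           "--max-time", str(connect_timeout * 3)] + args
    out = subprocess.run(cmd, capture_output=True, text=True, shell=False)
    parts = out.stdout.split()
    return (int(parts[0]), int(float(parts[1]))) if len(parts) == 2 else (0, 0)

def probe_candidate(ip, port, candidate):
    baseline = curl_probe([f"https://{ip}:{port}/"])                        # no Host, no SNI hint
    l7 = curl_probe(["-H", f"Host: {candidate}", f"https://{ip}:{port}/"])  # Host header only
    l4 = curl_probe(["--resolve", f"{candidate}:{port}:{ip}",
                     f"https://{candidate}:{port}/"])                        # SNI + Host, pinned to IP
    return baseline, l7, l4
```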

The host_header_bypass finding (escalated to high severity): when L7 and L4 disagree on the same candidate (i.e. L7 returns one response, L4 returns a different one), it means the target's edge proxy and origin disagree on routing. This is exploitable: a request crafted with mismatched Host: and SNI values can route past edge controls (WAF, auth, geo-fencing) to a different origin. This pattern is common in misconfigured CDN-fronted apps and k8s clusters with ingress mismatches.

Candidate source priority (deduplicated across all sources, capped at Max Candidates Per IP):

Priority Source Why
1 Existing Subdomain nodes Already known to be valid hostnames in scope
2 ExternalDomain nodes Known third-party associations — useful for shared-infrastructure detection
3 TLS SAN list from existing Certificates The cert says the host is valid for these names — high signal
4 CNAME targets resolving to this IP DNS evidence of association
5 Reverse-DNS PTR records The IP claims this hostname
6 Bundled vhost-common.txt wordlist ~2,380 common admin/dev/staging/internal/modern-stack prefixes expanded as {prefix}.{apex}
7 Custom user wordlist Per-project additions in the form

When the candidate count exceeds the per-IP cap, excess entries are dropped deterministically (alphabetic sort) so reruns hit the exact same set — enables idempotent finding deduplication.

Severity model:

  • high — L7 and L4 disagree on the same hostname (proxy bypass primitive)
  • medium — discovered hostname matches an internal-keyword pattern (admin, jenkins, k8s, vault, argocd, phpmyadmin, grafana, kibana, gitlab, internal, …) — internal services exposed via vhost
  • low — status code differs from baseline (some response routing happened, but no obvious internal-keyword signal)
  • info — only body-size differs by more than tolerance (subtle routing, mostly noise)

The internal-keyword matcher uses longest-match-wins with a lexicographic tiebreak, so reruns produce the same severity tag for the same hostname.

Performance & concurrency: per-IP probes run in a ThreadPoolExecutor sized by Concurrency (default 20). Each probe has a configurable connect timeout (default 3s); the total budget per probe is 3× that. Baseline Size Tolerance (default 50 bytes) controls how much body-size delta is considered noise vs signal — useful to suppress Set-Cookie / CSRF-token / timestamp jitter that varies between requests.

Hostname injection safety: candidates pass through an RFC-1123 validator that's anchored with \Z (not $, which would let evil\n.example.com slip past). Colons, newlines, spaces, quotes, backticks, dollar signs, NUL bytes, underscores, and labels longer than 63 chars are all rejected before reaching curl --resolve. All subprocess calls use subprocess.run([...], shell=False) — defense in depth even though shell metacharacters can never reach a shell.
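
A sketch of such a validator. The exact regex used by the module is not shown on this page, so the pattern below is an approximation that reproduces the documented behavior: anchored with \A…\Z, labels capped at 63 characters, underscores and control characters rejected.

```python
import re

# Anchored with \A ... \Z (not $) so embedded newlines cannot slip past.
# Underscores are intentionally rejected, matching the behavior described above.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
_HOSTNAME_RE = re.compile(rf"\A(?:{_LABEL}\.)+{_LABEL}\Z", re.IGNORECASE)

def is_safe_candidate(hostname: str) -> bool:
    """Illustrative RFC-1123-style gate applied before a name reaches curl --resolve."""
    return len(hostname) <= 253 and bool(_HOSTNAME_RE.match(hostname))

assert is_safe_candidate("admin.example.com")
assert not is_safe_candidate("evil\n.example.com")
assert not is_safe_candidate("bad_host.example.com")
```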

Inject Discovered URLs: when on, every confirmed hidden vhost is automatically added as a BaseURL node and pushed into http_probe.by_url so downstream modules (sister Subdomain Takeover scanner, follow-up partial-recon Nuclei/Katana) pick it up. This converts a discovery-phase finding directly into expanded attack surface for vuln scanning in the same run.

Tools used: only curl (already in the recon container) and httpx (already pulled by the HTTP probing phase, used in partial-recon mode only). No new Docker image, pip dependency, or API key — the module is self-contained and doesn't add to the container build.

Graph nodes — consumes: Subdomain, IP, Port, BaseURL, Certificate, DNSRecord, ExternalDomain | produces: Vulnerability (source="vhost_sni_enum"), BaseURL (for hidden vhosts), defensive Subdomain | enriches: Subdomain (vhost_hidden, vhost_routing_layer, sni_routed, ...), IP (is_reverse_proxy, vhost_baseline_*, hidden_vhost_count, ...)

Master toggle:

Parameter Default Description
Enable VHost & SNI false Master toggle. When off the module is skipped and output contains { "skipped_reason": "disabled" }

Test layers:

Parameter Default Description
L7 (HTTP Host header) true Sends curl -H "Host: candidate" https://IP. Catches classic Apache/Nginx vhost routing
L4 (TLS SNI) true Sends curl --resolve candidate:port:IP https://candidate. Only fires on HTTPS ports. Catches reverse-proxy / k8s ingress / Cloudflare routing

If both layers are off the module exits with { "skipped_reason": "all_layers_disabled" }.

Candidate sources:

Parameter Default Description
Use Graph Candidates true Pull hostnames from existing Subdomain, ExternalDomain, TLS SAN list, CNAME targets, and reverse-DNS PTR records resolving to each target IP. Highest-signal source
Use Default Wordlist true Use the bundled recon/wordlists/vhost-common.txt (~2,380 admin/dev/staging/internal/modern-stack prefixes), expanded as {prefix}.{target_apex} per IP
Custom Wordlist "" Optional newline-separated prefixes/hostnames pasted in the project form. Bare prefixes are expanded against the apex; full hostnames (containing a dot) are used as-is. Stored in a Text column (no length cap). Excluded from project-preset export (per-project content)
Max Candidates Per IP 2000 Hard cap on candidates per IP. Excess entries are dropped deterministically (sorted alphabetically) so reruns hit the same set

Performance:

Parameter Default Clamp Description
Per-Request Timeout 3 >= 1 curl --connect-timeout per probe in seconds. Total budget per probe is 3× this value
Concurrency 20 >= 1 Parallel probes per (IP, port) via ThreadPoolExecutor. Higher = faster, louder
Baseline Size Tolerance 50 >= 0 Bytes of size delta to ignore when status code matches baseline. Suppresses Set-Cookie / CSRF token / timestamp jitter
Inject Discovered URLs true - When a hidden vhost is confirmed, create a BaseURL node and add the URL to http_probe.by_url so downstream tools (sister Subdomain Takeover scanner, follow-up partial-recon Nuclei/Katana) pick it up

Severity model: high when L7 and L4 disagree on the same hostname (proxy bypass primitive); medium when the discovered hostname matches an internal-keyword pattern (admin, jenkins, k8s, vault, argocd, phpmyadmin, etc.); low when status code differs from baseline; info when only body size differs beyond tolerance. The internal-keyword matcher uses the longest match (lex tiebreak) so reruns produce the same severity tag.

Tools used: curl (already in the recon container) and httpx (already pulled by GROUP 4 — used in partial-recon mode only). No new Docker image, pip dependency, or API key.

Hostname injection safety: the candidate pipeline ends with an RFC-1123 validator anchored with \Z (not $, which would let evil\n.example.com slip past). Colons, newlines, spaces, quotes, backticks, dollar signs, NUL bytes, underscores, and labels longer than 63 chars are all rejected before reaching curl --resolve. Subprocess calls use subprocess.run([...], shell=False) — defense in depth even though shell metacharacters can never reach a shell.

Stealth tuning: there is no automatic stealth-mode override at the runtime layer. Stealth is handled at the preset layer instead — red-team-operator sets VHOST_SNI_TEST_L4=false, VHOST_SNI_USE_DEFAULT_WORDLIST=false, VHOST_SNI_CONCURRENCY=5. stealth-recon disables the module entirely (2,380 probes through Tor would be catastrophic).

Partial recon: VHost & SNI is a Partial Recon tool. The modal accepts custom subdomains (added as candidate hostnames, must be in scope) and custom IPs (added as extra targets, validated as IPv4 or CIDR /24-/32). VHOST_SNI_ENABLED is force-set to true for partial runs. Rescans converge on the same Vulnerability.id (deterministic vhost_sni_<host>_<ip>_<port>_<layer>) instead of duplicating. When both graph and custom inputs are empty, the run exits with a "no IP targets" message. See Recon Pipeline Workflow -- Partial Recon.


Vulnerability Scanner (Nuclei)

Template-based vulnerability scanning with 9,000+ community templates.

Graph nodes — consumes: BaseURL, Endpoint, Technology, Domain | produces: Vulnerability, Endpoint, Parameter, CVE, MitreData, Capec

How it works

Nuclei is the heaviest single module in the pipeline by output volume — most web-layer findings come from here. It runs as projectdiscovery/nuclei:latest inside Docker with templates auto-updated from the public ProjectDiscovery template repository on each scan (when Auto Update Templates is on). Templates are stored in a persistent Docker volume so updates are incremental, not from-scratch.

Target construction (UNION-based): nuclei targets are built as the deduplicated UNION of every web-layer source already in the graph:

  1. Endpoints with parameters from Resource Enumeration (Katana/Hakrawler/GAU/Arjun)
  2. BaseURLs verified by httpx (the live web-host set)
  3. http(s)://<sub> fallbacks for any Subdomain whose host isn't already covered by sources 1 or 2

This third bucket exists because it's possible for Domain Discovery to surface a Subdomain that hasn't yet been probed by httpx (e.g. a recently-discovered host or one where httpx errored). Without the fallback, those subdomains would be silently skipped; with it, Nuclei probes them on the default HTTP/HTTPS ports.

IPs are excluded by default to avoid scanning shared infrastructure. The Scan All IPs toggle includes them when needed (e.g. raw-IP exposed services).

DAST mode is a filter, not an addition: when DAST Mode is on, nuclei is invoked with the -dast flag which filters the loaded template set down to only templates with a fuzz: directive (~300 of the ~9,000 total) — these are the active fuzzing templates for SQLi, XSS, SSRF, OS injection, etc. Detection-class templates (CVE detection, exposure detection, panel detection) are skipped entirely. So if you turn DAST on but use detection-class tags (graphql, apollo, hasura, exposure), the resulting template set is empty and nuclei errors out. Use DAST-native tags only when DAST is on (sqli, xss, ssrf, lfi, rfi, xxe, ssti, openredirect, cmdi).

Most production scans should leave DAST off — the detection templates catch real CVEs and misconfigurations on a much larger set of templates, while DAST is best run as a focused targeted scan after detection finds something interesting.

Interactsh integration: when on, nuclei is wired up to the public Interactsh server (or a self-hosted one). For blind injection templates (blind SQLi, blind SSRF, OOB XXE), nuclei generates a unique callback URL on the Interactsh server, embeds it in the payload, and listens for the OOB hit that confirms the target reached the callback. This catches a class of vulnerabilities that have no in-band response signal at all.

Severity filter and tag includes/excludes: -severity filters which severity levels to keep (excluding info is ~70% faster because info-level templates are the bulk count). -include-tags and -exclude-tags further narrow the template set — dos and fuzz are excluded by default for production scans because they generate volumetric load.

Stream parsing: _execute_nuclei_pass runs nuclei via subprocess.Popen with stdout streamed line-by-line. Each JSON-line is parsed via parse_nuclei_finding and merged into the graph immediately, so progress is visible in the recon log even on multi-hour scans. Per-template-id deduplication keeps the same finding from being recorded multiple times when nuclei retries.
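
A simplified sketch of the streaming loop: run nuclei with a JSON-lines output flag, parse each line as it arrives, and deduplicate before merging. The field names mirror nuclei's JSONL output (template-id, matched-at); the dedup key here is an assumption, since the real _execute_nuclei_pass may key on template-id alone.

```python
import json
import subprocess

def stream_nuclei(cmd):
    """Run nuclei and yield findings as soon as each JSON line arrives.
    `cmd` is the full argv, including nuclei's JSON-lines output flag."""
    seen = set()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        line = line.strip()
        if not line.startswith("{"):
            continue                              # progress/log noise, not a finding
        try:
            finding = json.loads(line)
        except json.JSONDecodeError:
            continue
        key = (finding.get("template-id"), finding.get("matched-at"))
        if key in seen:                           # skip findings duplicated by retries
            continue
        seen.add(key)
        yield finding                             # merge into the graph immediately
    proc.wait()
```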

Output enrichment: each finding emits a Vulnerability node + a CVE node (when the template has a CVE ID in its info block) + MitreData / Capec nodes (when the template references CWE/CAPEC). Vulnerability + CVE + MITRE in one pass means nuclei findings are immediately ready for the Insights dashboard's CVE/MITRE views without a separate enrichment phase.

Performance Settings:

Parameter Default Description
Severity Levels critical, high, medium, low, info Severity filter. Excluding "info" is ~70% faster
Rate Limit 100 Requests per second
Bulk Size 25 Hosts processed in parallel
Concurrency 25 Templates executed in parallel
Timeout 10 Request timeout per check (seconds)
Retries 1 Retry attempts for failed requests (0-10)
Max Redirects 10 Maximum redirect chain (0-50)

Template Configuration:

Parameter Default Description
Template Folders [] Directories to include (cves, vulnerabilities, misconfiguration, exposures, etc.). Empty = all
Exclude Template Paths [] Exclude specific directories or files
Custom Template Paths [] Your own templates in addition to the official repo
Include Tags [] Filter by tags: cve, xss, sqli, rce, lfi, ssrf, xxe, ssti. Empty = all
Exclude Tags [] Exclude tags — recommended: dos, fuzz for production

Template Options:

Parameter Default Description
Auto Update Templates true Download latest before scan (+10-30 seconds)
New Templates Only false Only run templates added since last update
DAST Mode true Active fuzzing for XSS, SQLi, RCE (+50-100% time)

Advanced Options:

Parameter Default Description
Headless Mode false Use headless browser for JS pages (+100-200% time)
System DNS Resolvers false Use OS DNS instead of Nuclei defaults
Interactsh true Blind vulnerability detection via out-of-band callbacks
Follow Redirects true Follow HTTP redirects during scanning
Scan All IPs false Scan all resolved IPs, not just hostnames

CVE Enrichment

Enrich findings with CVSS scores, descriptions, and references.

Graph nodes — consumes: Technology | produces: CVE, MitreData, Capec

Parameter Default Description
Enable CVE Lookup true Master toggle
CVE Source nvd Data source: nvd or vulners
Max CVEs per Finding 20 Maximum entries per technology (1-100)
Min CVSS Score 0 Only include CVEs at or above this score (0-10)

Note: NVD and Vulners API keys are configured in Global Settings → API Keys (user-scoped), not in project settings.

How it works

CVE enrichment turns a Technology node like Apache 2.4.49 into a list of CVE nodes attached to the same host. The challenge is that Technology strings come from many sources with inconsistent formatting (httpx, Wappalyzer, Nmap NSE, OSINT tools each format service identifiers differently), so the lookup pipeline goes through three normalization steps before it queries the upstream database:

  1. Server-header splitting (split_server_header): a single Server: Apache/2.4.49 (Ubuntu) PHP/7.4.3 header contains multiple products. The splitter parses this into separate (name, version) tuples (apache 2.4.49, php 7.4.3) — each gets its own CVE lookup.
  2. Technology-string parsing (parse_technology_string): handles formats from Wappalyzer (React 17.0.2), Nmap NSE (OpenSSH 8.2p1), and httpx tech-detect (nginx-1.18.0). Returns a normalized (name, version) pair.
  3. Product-name normalization (normalize_product_name): canonicalizes vendor naming inconsistencies — microsoft-iis, iis, and Microsoft IIS all map to a single key. Also strips marketing suffixes (Enterprise/Pro/Lite) that NVD doesn't track separately. Semver is extracted via _extract_semver so 2.4.49-deb10u1 becomes 2.4.49 for the CPE lookup. A rough sketch of these steps follows the list.
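
A rough sketch of the splitting and semver-extraction steps. The real split_server_header and _extract_semver helpers are not shown on this page, so the regexes below are illustrative approximations of the described behavior.

```python
import re

_PRODUCT_RE = re.compile(r"([A-Za-z][\w.+-]*)/(\d+(?:\.\d+)*[\w.-]*)")
_SEMVER_RE = re.compile(r"\d+\.\d+(?:\.\d+)?")

def split_server_header(value):
    """'Apache/2.4.49 (Ubuntu) PHP/7.4.3' -> [('apache', '2.4.49'), ('php', '7.4.3')]"""
    return [(name.lower(), version) for name, version in _PRODUCT_RE.findall(value)]

def extract_semver(version):
    """'2.4.49-deb10u1' -> '2.4.49' for the CPE lookup."""
    m = _SEMVER_RE.search(version)
    return m.group(0) if m else version

print(split_server_header("Apache/2.4.49 (Ubuntu) PHP/7.4.3"))
print(extract_semver("2.4.49-deb10u1"))
```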

NVD backend (lookup_cves_nvd): queries https://services.nvd.nist.gov/rest/json/cves/2.0 with the normalized product name + version. The response contains the full CVE record with both CVSS v3.1 and v2.0 metrics — the parser prefers v3.1 (newer, richer attack-vector data) and falls back to v2.0 only when v3 isn't available. CVSS score and severity are extracted from metrics.cvssMetricV31[0].cvssData.baseScore (v3) or metrics.cvssMetricV2[0].cvssData.baseScore (v2). The classified severity (classify_cvss_score) is recomputed from the score using the standard NVD bands (Critical >= 9.0, High >= 7.0, Medium >= 4.0, Low >= 0.1, None = 0).
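
The severity classification is a straightforward banding function. A sketch assuming the bands quoted above (the real classify_cvss_score may use different return labels):

```python
def classify_cvss_score(score):
    """Standard NVD severity bands, applied after the CVSS base score is extracted."""
    if score is None:
        return "unknown"
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score >= 0.1:
        return "low"
    return "none"
```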

Vulners backend: queries https://vulners.com/api/v3/burp/software/ — Vulners' Burp-API endpoint that takes a product+version pair and returns the matching CVE list, EPSS exploit probability scores, and exploit-DB references. Vulners is generally faster and includes more recent CVEs than NVD's REST API but requires an API key for any meaningful query volume.

Min CVSS Score filter: applied client-side after the API response. Setting this above 0 dramatically cuts noise (CVE records with no CVSS score get dropped, which is most third-party CMS plugin CVEs).

Max CVEs per Finding cap: applied per-technology before merge. A single Apache version can have 100+ historical CVEs — the cap keeps the graph from being dominated by ancient CVEs that don't matter for current exploitation. Sort order is severity desc, score desc, date desc, so the top-N are always the most exploitable.

API key handling: keys live in Global Settings (user-scoped), so the same key works across projects without re-entry. Without a key, NVD enforces a 5 req/30s rate limit (5 lookups per 30 seconds) — fine for small targets but a bottleneck on large multi-tech graphs.


MITRE Mapping

CWE/CAPEC enrichment of CVE findings.

Parameter Default Description
Auto Update DB true Auto-update CWE/CAPEC database
Include CWE true Map CVEs to CWE weaknesses
Include CAPEC true Map CWEs to CAPEC attack patterns
Enrich Recon CVEs true Enrich CVEs from reconnaissance
Enrich GVM CVEs true Enrich CVEs from GVM scans
Cache TTL (hours) 24 Database cache duration

How it works

The MITRE module enriches every CVE in the graph with two layers of attacker-knowledge metadata: the CWE (Common Weakness Enumeration) — the underlying weakness class — and the CAPEC (Common Attack Pattern Enumeration and Classification) — the attacker techniques used to exploit that weakness class. Together they let you go from "this is a CVE-2021-44228 finding" to "this is a CWE-502 deserialization weakness exploitable via CAPEC-586 (Object Injection) and CAPEC-129 (Pointer Manipulation)".

Database provisioning: on first scan, the module downloads two official MITRE datasets:

  • CWE database — https://cwe.mitre.org/data/xml/cwec_latest.xml.zip — the canonical CWE hierarchy (~900 weakness entries) with descriptions, mitigation guidance, and direct links to CAPEC patterns
  • CAPEC database — https://capec.mitre.org/data/xml/capec_latest.xml — the canonical CAPEC catalog (~600 attack patterns) with prerequisites, attack steps, and example instances

These XML files are parsed (xml.etree.ElementTree) and converted into JSON databases (cwe_db.json, capec_db.json, cwe_metadata.json) stored in the recon container's data directory. The conversion happens once and the JSONs are reused; the XMLs are big and slow to parse on every scan.

Cache TTL: the database is considered fresh for Cache TTL (hours) after download (default 24h). Once expired, the next scan re-downloads if Auto Update DB is on. This balances "always have the latest CWE/CAPEC entries" against "don't burn 30s downloading XMLs for every scan."
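
The freshness check amounts to comparing the local JSON database's age against the TTL. A minimal sketch (file names from this section; the helper itself is hypothetical):

```python
import time
from pathlib import Path

def is_fresh(db_path: str, ttl_hours: int = 24) -> bool:
    """Treat the local cwe_db.json / capec_db.json as fresh within the TTL window."""
    p = Path(db_path)
    if not p.exists():
        return False
    age_hours = (time.time() - p.stat().st_mtime) / 3600
    return age_hours < ttl_hours

# if not is_fresh("cwe_db.json", ttl_hours=24): re-download and re-convert the XML
```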

The mapping itself: for every CVE in the graph, the module:

  1. Looks up the CVE's primary CWE from the NVD response (NVD records ship a weaknesses array with the official CWE classification per CVE — usually 1-2 entries)
  2. Loads the CWE node from the local cwe_db.json (description, parent CWE relationships, related CWEs)
  3. Walks the CWE's Related_Attack_Patterns list to find directly-mapped CAPECs
  4. Loads each CAPEC's attack-pattern details (prerequisites, attack-steps narrative, MITRE ATT&CK technique mappings where applicable)
  5. Emits MitreData (CWE) and Capec nodes attached to the CVE in the graph

Why direct mappings only: CWE has a notion of "indirect" relationships (CWE-79 → CWE-78 → CWE-88, transitively related). The module deliberately uses only direct CWE→CAPEC mappings (Include CAPEC toggle controls this) because indirect chains generate noise — a single CVE can transitively map to dozens of unrelated CAPECs through deep CWE inheritance.

Dual-source enrichment: the module enriches CVEs from both reconnaissance (Nuclei findings, Nmap NSE, JS Recon) and GVM (the network-vuln-scan output) when both are enabled. Each toggle controls whether that source's CVEs go through MITRE enrichment — useful when you want fast scans (skip MITRE on recon CVEs which are usually well-known) and only enrich the GVM scan's deeper findings.

Output usage: MitreData and Capec nodes feed two consumers — the Insights dashboard (CWE-breakdown chart, attack-patterns chart, top-CWE-by-frequency) and the AI agent, which reads the attack-pattern descriptions during exploit planning to align its tool selection with documented attack steps for that weakness class.


Security Checks

25+ individual toggle-controlled checks grouped into six categories. Each check creates a Vulnerability node in the graph if the condition is detected.

Graph nodes — consumes: BaseURL, IP, Subdomain, Domain | produces: Vulnerability

How it works

Security Checks runs after every other recon module so it has the full graph (Subdomains, IPs, BaseURLs, Certificates, DNS records) to query. Each check is a small focused Python function that hits a specific configuration question — no Docker, no external tools, just requests + socket + ssl + dns.resolver calls done in parallel via a ThreadPoolExecutor sized by Max Workers.

Each category targets a specific class of misconfiguration:

Network Exposure: checks whether origin IPs leak past a CDN / WAF (the classic WAF-bypass primitive). check_direct_ip_http and check_direct_ip_https open raw IP-based connections (http://198.51.100.42/), but emit a finding only when the IP exposure is a real risk. Without filtering, every cloud-hosted site's load balancer would generate findings; three layers suppress that noise.

  1. Pre-filter on known edges. IPs in Cloudflare's published prefix list (cloudflare.com/ips-v4, ips-v6, fetched per scan with a hardcoded fallback), IPs in known CDN ASNs (Cloudflare 13335 / 209242), and IPs flagged is_cdn=true by Naabu or httpx with a reliable edge CDN name are removed before any probe. The reliable list is cloudflare, cloudfront, akamai, fastly, imperva, incapsula, sucuri, stackpath, azurefrontdoor, gcore. Generic cloud-provider labels (aws, amazon, azure, gcp, google) are deliberately excluded because they cover bare ALB / EC2 / Cloud-LB origins that legitimately serve the application; trusting those would suppress real findings on Cloudflare-fronted apps whose origin happens to be on AWS.

  2. Response fingerprint. Each remaining IP probe is inspected for edge markers: headers cf-ray, cf-cache-status, x-amz-cf-id, x-served-by: cache-..., Server: cloudflare, and body matching Error 1003 / Direct IP access not allowed / Attention Required! | Cloudflare. Match suppresses the finding.

  3. Bare-origin comparison test. The IP response is compared against the response of every hostname that resolves to it. If any hostname and the IP return the same status code, the same Server header, similar content size (within 10% or 500 bytes), and neither carries CDN/WAF markers, the IP is the bare public origin and there is no protection layer in between, so the finding is suppressed. If the hostname carries CDN markers (cf-ray, Server: cloudflare, ...) absent from the IP response, that is a real WAF bypass and the finding fires.
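
A sketch of the bare-origin comparison and the WAF-bypass decision, assuming each probe has already been reduced to a small dict of status, Server header, body size, and detected CDN/WAF markers (the field names are assumptions, not the module's actual data model):

```python
def looks_like_bare_origin(ip_resp, host_resp, tolerance_ratio=0.10, tolerance_bytes=500):
    """Suppress the finding when the IP is just the un-fronted public origin:
    same status, same Server header, similar size, no CDN/WAF markers on either side."""
    same_status = ip_resp["status"] == host_resp["status"]
    same_server = ip_resp.get("server", "").lower() == host_resp.get("server", "").lower()
    size_delta = abs(ip_resp["size"] - host_resp["size"])
    similar_size = (size_delta <= tolerance_bytes or
                    size_delta <= tolerance_ratio * max(host_resp["size"], 1))
    no_edge_markers = not ip_resp["cdn_markers"] and not host_resp["cdn_markers"]
    return same_status and same_server and similar_size and no_edge_markers

def is_waf_bypass(ip_resp, host_resp):
    """The opposite case: the hostname goes through a CDN/WAF, the raw IP does not."""
    return bool(host_resp["cdn_markers"]) and not ip_resp["cdn_markers"]
```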

check_ip_api_exposed runs the same prefilter-and-fingerprint flow against API-shaped paths (/api, /graphql, /v1, ...). check_waf_bypass is the consolidated higher-severity finding produced when the comparison test concludes that the hostname goes through a different stack from the IP.

A Direct IP Access finding therefore means one of:

  • Hostname goes through a CDN / WAF, IP does not: real origin leak
  • IP serves materially different content from the hostname (different status, different Server, large size delta): misconfigured exposure
  • No hostnames resolved to the IP, comparison could not run, IP still responds: bare-IP scan target, informational
  • IP 30x-redirects to a hostname: legacy info-severity finding (the IP responds at all, even though it enforces hostname-based access)

It does NOT fire for:

  • Cloudflare / CloudFront / Akamai / Fastly edge IPs (pre-filtered)
  • AWS ALB / Azure Front Door / GCP LB origins where the IP IS the public endpoint by design (suppressed by the bare-origin comparison)
  • IPs that respond with CDN error templates (response fingerprint match)
  • IPs that are unreachable or return 5xx

TLS / Certificate — get_ssl_certificate opens a TLS handshake (ssl.create_default_context, verify_mode=CERT_NONE to allow self-signed) and pulls the cert. parse_cert_date parses the validity range, check_tls_expiring_soon computes days-to-expiry against the configurable threshold (default 30 days). Note that verify_mode=CERT_NONE is intentional: scanning self-signed staging hosts is a common need; cert chain verification is a separate concern handled by the broader Certificate node analysis.
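
A minimal sketch of the expiry computation. It combines the get_ssl_certificate / check_tls_expiring_soon steps into one hypothetical helper and assumes the cryptography package is available for parsing the DER certificate (getpeercert() returns an empty dict when verification is disabled, so the raw binary form is fetched instead).

```python
import socket
import ssl
from datetime import datetime, timezone
from cryptography import x509   # assumed available in the recon container

def tls_days_to_expiry(host, port=443, timeout=10):
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE          # intentional: accept self-signed staging certs
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)   # raw cert is available even unverified
    cert = x509.load_der_x509_certificate(der)
    not_after = getattr(cert, "not_valid_after_utc", None) \
        or cert.not_valid_after.replace(tzinfo=timezone.utc)
    return (not_after - datetime.now(timezone.utc)).days

# expiring_soon = tls_days_to_expiry("example.com") <= 30   # TLS Expiry Days threshold
```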

Security Headers — check_security_headers makes a single HEAD (or GET if HEAD is rejected) request and checks for the presence of: Referrer-Policy, Permissions-Policy, Cross-Origin-Opener-Policy (COOP), Cross-Origin-Resource-Policy (CORP), Cross-Origin-Embedder-Policy (COEP). check_cache_control_missing is split out because it's about caching/sensitive-data leakage, not the cross-origin posture. CSP Unsafe Inline parses the Content-Security-Policy header and looks for 'unsafe-inline' directives that defeat XSS protections.

Authentication — check_login_no_https walks every BaseURL looking for forms with <input type="password"> and verifies the form action submits to HTTPS (forms posting passwords to HTTP is a credential-exposure issue). check_session_cookies parses Set-Cookie headers from BaseURL responses and flags missing Secure and HttpOnly flags on session-shaped cookies. check_basic_auth_no_tls looks for WWW-Authenticate: Basic responses on HTTP-only endpoints.

DNS Security — uses dns.resolver to query each domain's DNS records:

  • SPF Missing — no v=spf1 ... TXT record
  • DMARC Missing — no v=DMARC1 record at _dmarc.<domain>
  • DNSSEC Missing — no DS/DNSKEY records
  • Zone Transfer — attempts AXFR against the apex's nameservers; if the transfer succeeds, the entire zone is publicly disclosed (high-severity, classic misconfig)
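
As a sketch, the SPF, DMARC, and zone-transfer checks map onto dnspython roughly like this. Helper names are illustrative, and the DNSSEC check follows the same pattern with DS/DNSKEY lookups:

```python
import dns.resolver
import dns.query
import dns.zone


def spf_missing(domain: str) -> bool:
    try:
        answers = dns.resolver.resolve(domain, "TXT")
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return True
    return not any(b"v=spf1" in b"".join(r.strings) for r in answers)


def dmarc_missing(domain: str) -> bool:
    try:
        answers = dns.resolver.resolve(f"_dmarc.{domain}", "TXT")
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return True
    return not any(b"v=DMARC1" in b"".join(r.strings) for r in answers)


def zone_transfer_allowed(domain: str) -> bool:
    """Attempt AXFR against each authoritative nameserver of the apex."""
    try:
        nameservers = [r.target.to_text() for r in dns.resolver.resolve(domain, "NS")]
    except Exception:
        return False
    for ns in nameservers:
        try:
            ns_ip = dns.resolver.resolve(ns, "A")[0].to_text()
            dns.zone.from_xfr(dns.query.xfr(ns_ip, domain, timeout=10))
            return True                       # transfer completed: entire zone disclosed
        except Exception:
            continue                          # refused or timed out: the expected case
    return False
```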

Exposed Services — uses the existing port-scan output to check for common misconfigured services:

  • Admin Port Exposed — port 22 (SSH), 23 (Telnet), 3389 (RDP), 5985/5986 (WinRM), 8089 (Splunk), … reachable from the public internet
  • Database Exposed — 3306 (MySQL), 5432 (Postgres), 27017 (MongoDB), 6379 (Redis), 9200 (Elasticsearch), 11211 (Memcached), 1521 (Oracle) reachable
  • Redis No Auth — opens a TCP connection to Redis port and sends INFO\r\n — if the response includes redis_version: (rather than NOAUTH), Redis is unauthenticated
  • Kubernetes API Exposed — checks 6443 (kube-apiserver) and 10250 (kubelet) for unauthenticated /healthz and /api responses
  • SMTP Open Relay — connects to port 25 and runs the classic relay test (MAIL FROM: <off-domain> followed by RCPT TO: <off-domain> — if the server accepts both, it's an open relay)
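
The Redis probe in particular is small enough to sketch in full (names illustrative):

```python
import socket


def redis_unauthenticated(ip: str, port: int = 6379, timeout: int = 10) -> bool:
    """Send INFO over a raw TCP connection and look for a real reply rather than NOAUTH."""
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            sock.sendall(b"INFO\r\n")
            reply = sock.recv(4096).decode(errors="replace")
    except OSError:
        return False                        # closed or filtered: no finding
    if "NOAUTH" in reply or "DENIED" in reply:
        return False                        # auth required or protected mode
    return "redis_version:" in reply        # unauthenticated Redis: immediate finding
```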

Application — Insecure Form Action looks for forms with action="http://..." (not just login forms). No Rate Limiting opens a tight loop of 100 requests against a representative endpoint over ~10 seconds and checks whether any 429 / Retry-After response was returned — absence of throttling is a finding.

Why these are split into a dedicated module rather than rolled into Nuclei: Nuclei is template-based and excels at known-CVE matching, but most of these checks are configuration-state questions that don't have a CVE attached. Implementing them as Python functions in the recon container is faster, more testable, and integrates more cleanly with the graph than maintaining custom Nuclei templates would. They're also high-signal findings the AI agent uses to prioritize early-stage exploit attempts (a missing-Secure-cookie finding is a fast lead to session hijacking; an exposed Redis is immediate code-execution).

Global Settings:

Parameter Default Description
Enable Security Checks true Master toggle for all checks
Timeout 10 Per-check timeout (seconds)
Max Workers 10 Concurrent check threads

Network Exposure:

Check Default Description
Direct IP HTTP true HTTP accessible via IP address
Direct IP HTTPS true HTTPS accessible via IP address
IP API Exposed true API endpoints accessible via IP
WAF Bypass true WAF can be bypassed via direct IP

TLS/Certificate:

Check Default Description
TLS Expiring Soon true Certificate expires within configurable days
TLS Expiry Days 30 Days before expiry to trigger warning

Security Headers:

Check Default Description
Missing Referrer-Policy true No Referrer-Policy header
Missing Permissions-Policy true No Permissions-Policy header
Missing COOP true No Cross-Origin-Opener-Policy
Missing CORP true No Cross-Origin-Resource-Policy
Missing COEP true No Cross-Origin-Embedder-Policy
Cache-Control Missing true No Cache-Control header
CSP Unsafe Inline true Content-Security-Policy allows unsafe-inline

Authentication:

Check Default Description
Login No HTTPS true Login form served over HTTP
Session No Secure true Session cookie missing Secure flag
Session No HttpOnly true Session cookie missing HttpOnly flag
Basic Auth No TLS true Basic Authentication without TLS

DNS Security:

Check Default Description
SPF Missing true No SPF record for the domain
DMARC Missing true No DMARC record
DNSSEC Missing true DNSSEC not configured
Zone Transfer true DNS zone transfer allowed

Exposed Services:

Check Default Description
Admin Port Exposed true Administrative ports publicly accessible
Database Exposed true Database ports publicly accessible
Redis No Auth true Redis accessible without authentication
Kubernetes API Exposed true Kubernetes API publicly accessible
SMTP Open Relay true SMTP server allows open relay

Application:

Check Default Description
Insecure Form Action true Form submits over HTTP
No Rate Limiting true No rate limiting detected on endpoints

GVM Vulnerability Scan

Configure GVM/OpenVAS network-level scanning.

Graph nodes — consumes: IP, Port, Subdomain, Domain | produces: Vulnerability, Technology, Traceroute, Certificate, ExploitGvm, CVE, MitreData, Capec

How it works

GVM (Greenbone Vulnerability Management, the OpenVAS suite) is a separate on-demand scanning pipeline rather than part of the main recon flow — start it from the Red Zone toolbar after recon completes. It operates at the network layer where Nuclei doesn't reach: SMB/NetBIOS misconfigs, FTP/SMTP/POP/IMAP weaknesses, SSH cipher audits, SNMP defaults, exposed RPC, Telnet, and deep CVE matching against Nmap-style service fingerprints — all driven by 170,000+ Network Vulnerability Tests (NVTs) maintained by Greenbone.

Architecture: GVM runs as its own dockerized service stack (gvm_scan/) with the Greenbone Community Feed pulling NVT updates daily. The recon container talks to GVM via GMP (the Greenbone Management Protocol) over its native socket. The protocol is XML-based and stateful — every scan is a sequence of GMP commands: create target, create task, start task, poll for status, retrieve report, cleanup.

Scan flow for a project:

  1. Target preparation — depending on Targets Strategy (both / ips_only / hostnames_only), the module pulls IPs and/or hostnames from the recon graph and constructs a GMP <create_target> command. Target lists that exceed GVM's per-target host limit are split into multiple GVM target objects.
  2. Task creation — <create_task> with the chosen Scan Profile config UUID. The seven profiles (Full and fast, Full and very deep, Full and very deep ultimate, Discovery, Host discovery, System discovery, Empty) trade speed against thoroughness — Full and fast runs ~50k NVTs in 30-60min on a typical target, Full and very deep ultimate runs ~150k in 4-6 hours.
  3. Task start — <start_task>. GVM begins running NVTs against each host in the target. NVTs are written in NASL (Nessus Attack Scripting Language) and run as a forked process per check; GVM internally parallelizes within the configured scanner profile.
  4. Status polling loop — every Poll Interval seconds (default 5, range 5-300), <get_tasks> is queried for the task's progress percentage. The loop continues until the task reports Done status or Task Timeout seconds have elapsed (default 14400 = 4 hours; 0 = unlimited).
  5. Report extraction — <get_reports> retrieves the structured XML report. Each finding has CVE IDs, CVSS metrics, NVT family, severity, and remediation guidance. The XML is parsed and converted into Vulnerability + CVE + ExploitGvm graph nodes (ExploitGvm is GVM's exploit-availability indicator distinct from regular CVEs).
  6. Cleanup — when Cleanup After Scan is on, <delete_target> and <delete_task> are called to remove the GVM-side artifacts. Without this, GVM's database accumulates targets/tasks across scans, eventually slowing the management UI.
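
A minimal sketch of this create, start, poll, report, cleanup sequence using the python-gvm client library. The socket path, credentials, and UUIDs below are stock-install defaults and placeholders; the module's actual wrapper code differs:

```python
import time

from gvm.connections import UnixSocketConnection
from gvm.protocols.gmp import Gmp
from gvm.transforms import EtreeTransform

FULL_AND_FAST = "daba56c8-73ec-11df-a475-002264764cea"        # stock "Full and fast" config
OPENVAS_SCANNER = "08b69003-5fc2-4037-a479-93b440211c73"      # stock OpenVAS scanner

connection = UnixSocketConnection(path="/run/gvmd/gvmd.sock")
with Gmp(connection, transform=EtreeTransform()) as gmp:
    gmp.authenticate("admin", "admin")                         # placeholder credentials

    target = gmp.create_target(name="redamon-scan", hosts=["198.51.100.10", "198.51.100.11"])
    target_id = target.get("id")

    task = gmp.create_task(name="redamon-task", config_id=FULL_AND_FAST,
                           target_id=target_id, scanner_id=OPENVAS_SCANNER)
    task_id = task.get("id")
    gmp.start_task(task_id)

    # Poll every Poll Interval seconds until the task reports Done (timeout handling omitted)
    while gmp.get_task(task_id).find(".//status").text != "Done":
        time.sleep(5)

    report_id = gmp.get_task(task_id).find(".//report").get("id")
    report_xml = gmp.get_report(report_id)        # parsed into Vulnerability/CVE/ExploitGvm nodes

    gmp.delete_task(task_id)                      # Cleanup After Scan
    gmp.delete_target(target_id)
```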

Why GVM scans are slow: each NVT is a separate forked process running its own protocol implementation. A single SMB-vulnerabilities NVT might take 30-60 seconds per host as it negotiates dialects, attempts authentication, queries shares, etc. Across 170k NVTs and N hosts, the wall-clock can easily exceed a working day. The trade-off vs Nuclei: GVM has dramatically broader and deeper coverage of network services but is impractical for fast iteration.

Output enrichment: the parsed Vulnerability nodes get the same MITRE enrichment pass as recon CVEs (controlled by Enrich GVM CVEs in the MITRE Mapping section) — so GVM findings show up in the Insights dashboard's CWE/CAPEC views with the same depth as Nuclei findings.

Scan Configuration:

Parameter Default Description
Scan Profile Full and fast GVM scan preset — see GVM Vulnerability Scanning for all 7 profiles
Scan Targets Strategy both both (IPs + hostnames), ips_only, or hostnames_only

Timeouts & Polling:

Parameter Default Description
Task Timeout 14400 Maximum seconds per scan task (4 hours). 0 = unlimited
Poll Interval 5 Seconds between status checks (5-300)

Post-Scan:

Parameter Default Description
Cleanup After Scan true Remove targets/tasks from GVM after results are extracted

Subdomain Discovery

Configure passive and active subdomain enumeration. Located in the Discovery & OSINT tab.

Graph nodes — consumes: Domain | produces: Domain, Subdomain, IP, DNSRecord

Each passive source has an enabled toggle and a max results cap. All sources run in parallel and results are merged and deduplicated. After merging, Puredns validates the combined list against public DNS resolvers to remove wildcard and DNS-poisoned entries before DNS resolution proceeds.

How it works

The module fans out the apex domain across five enumeration engines that run concurrently in a ThreadPoolExecutor, then folds their outputs into a single deduplicated set. Each engine has its own discovery strategy:

Engine Source How
crt.sh Certificate Transparency logs HTTPS query against crt.sh?q=%25.<domain>&output=json — extracts every CN/SAN ever issued for the apex. Picks up any subdomain for which a TLS certificate was ever requested (which is most of them)
HackerTarget Passive DNS database HTTPS query against api.hackertarget.com/hostsearch/?q=<domain> — returns historical DNS-resolved hostnames. Free tier: 50 queries/day
Subfinder 50+ passive sources Runs projectdiscovery/subfinder:latest Docker image with a 720-second timeout. Aggregates results from CT logs, DNS databases (SecurityTrails, BinaryEdge), web archives, and search engines (Bing, DNSDumpster). Typically yields the highest subdomain count of any single engine
Amass 50+ data sources Runs caffix/amass:latest Docker image. Passive mode by default; optional active mode enables zone transfers and certificate name grabs (forced off in stealth). Optional bruteforce mode runs a DNS brute-force after passive enumeration (forced off in stealth, significantly slower)
Knockpy Wordlist-based Runs as a subprocess (no Docker). Active brute-forcing against a built-in wordlist. With Use Bruteforce off it falls back to passive mode

After the five engines complete, results are unioned and deduplicated into a single candidate set. Puredns then runs as a Docker sidecar (frost19k/puredns:latest, 600-second timeout) and validates each candidate against public DNS resolvers — its job is to strip three classes of noise that would otherwise pollute downstream modules:

  1. Wildcard records — a domain like *.<apex> resolves to a single IP for every imaginable hostname, generating thousands of false positives. Puredns identifies the wildcard signature by querying random nonsense subdomains and learning the wildcard answer set, then removes any candidate whose answer matches it.
  2. DNS-poisoned entries — open-resolver poisoning attacks return injected IPs for unrelated domains. Puredns cross-validates against multiple resolvers and discards inconsistent answers.
  3. Stale NXDOMAIN entries — hostnames that some passive sources still report but that no longer resolve.

After Puredns, the survivor list is passed to a parallel DNS resolution pass (ThreadPoolExecutor with up to DNS Max Workers threads — default 50, max 200). Each subdomain is queried for all 7 record types simultaneously (A, AAAA, MX, NS, TXT, SOA, CNAME) using a per-hostname inner thread pool when DNS Record Parallelism is on. Each record-type query has its own retry budget (default 3) with exponential backoff.
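
A sketch of that two-level pool, with the outer pool sized by DNS Max Workers and an inner pool firing all seven record types at once (function names are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor

import dns.exception
import dns.resolver

RECORD_TYPES = ("A", "AAAA", "MX", "NS", "TXT", "SOA", "CNAME")


def query_with_retries(hostname: str, rtype: str, retries: int = 3) -> list[str]:
    for attempt in range(retries):
        try:
            return [r.to_text() for r in dns.resolver.resolve(hostname, rtype)]
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            return []                          # definitive empty answer: no retry needed
        except dns.exception.Timeout:
            time.sleep(2 ** attempt)           # exponential backoff before the next attempt
    return []


def resolve_all_records(hostname: str) -> dict:
    # Inner per-hostname pool: all 7 record types queried simultaneously
    with ThreadPoolExecutor(max_workers=len(RECORD_TYPES)) as pool:
        futures = {rt: pool.submit(query_with_retries, hostname, rt) for rt in RECORD_TYPES}
        return {rt: f.result() for rt, f in futures.items()}


def resolve_survivors(subdomains: list[str], max_workers: int = 50) -> dict:
    # Outer pool sized by DNS Max Workers (default 50, max 200)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(subdomains, pool.map(resolve_all_records, subdomains)))
```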

Tor / proxychains support: when anonymous mode is on, every requests-based call goes through a Tor SOCKS session and every Docker-based call is wrapped in a proxychains prefix that funnels traffic through Tor. Puredns and Amass active mode are both forced off under Tor since their high-rate DNS queries would burn the circuit.

Domain ownership verification: a verify_domain_ownership helper publishes a TXT record under _redamon-verify.<domain> and queries for it — used during scan setup to confirm the operator actually controls the target apex before active mode is allowed.

Why five engines instead of one? Each source has a different blind spot. CT logs miss internal-only subdomains never issued a public cert. Passive DNS databases miss recently-deployed names. Search engines miss subdomains never indexed. Wordlist brute-force misses anything not in the dictionary. Running them in parallel and merging gets coverage that no single tool can match.

Parameter Default Description
crt.sh enabled, max 5000 Certificate Transparency log queries for subdomain discovery
HackerTarget enabled, max 5000 Passive DNS lookup database
Subfinder enabled, max 5000 Passive enumeration using 50+ online sources (CT logs, DNS databases, web archives). Runs via Docker (projectdiscovery/subfinder). No API key required
Amass disabled, max 5000 OWASP Amass subdomain enumeration using 50+ data sources (certificate logs, DNS databases, web archives, WHOIS). Runs via Docker (caffix/amass). No API key required for passive mode
Amass Timeout 10 Enumeration timeout in minutes (1-120)
Amass Active Mode false Enable zone transfers and certificate name grabs — sends DNS queries directly to target. Forced off in stealth mode
Amass Bruteforce false DNS brute forcing after passive enumeration — significantly increases scan time. Forced off in stealth mode
Knockpy Recon enabled, max 5000 Passive wordlist-based subdomain enumeration
Use Bruteforce true Enable Knockpy active subdomain brute-forcing. Domain mode only
Puredns Wildcard Filtering enabled Validates discovered subdomains against public DNS resolvers and removes wildcard entries and DNS-poisoned results. Runs after all discovery tools complete, before DNS resolution. Active tool — sends DNS queries. Runs via Docker (frost19k/puredns). Disabled in stealth mode
Puredns Threads 0 Parallel resolution threads (0 = auto-detect)
Puredns Rate Limit 0 DNS queries per second (0 = unlimited). Capped by RoE global rate limit when enabled
WHOIS Max Retries 3 Retry attempts for WHOIS lookups
DNS Max Retries 3 Retry attempts for DNS resolution
DNS Max Workers 50 Parallel DNS resolution worker threads (was hardcoded at 20) (1-200)
DNS Record Parallelism Enabled Query all 7 DNS record types (A, AAAA, MX, NS, TXT, SOA, CNAME) in parallel per hostname

URLScan.io Enrichment

Passive OSINT enrichment using URLScan.io historical scan data. Runs in the recon pipeline after domain discovery and before port scanning. Located in the Discovery & OSINT tab.

Parameter Default Description
URLScan Enabled false Master toggle for URLScan.io enrichment
Max Results 500 Maximum scan results to fetch per domain (1-10000)

API Key: Optional. Configure in Global Settings → API Keys. Without an API key, only public scan results are available with lower rate limits. With a key, you get access to private scans and higher rate limits.

Graph nodes — consumes: Domain, BaseURL | produces: Domain, Subdomain, ExternalDomain, IP, Endpoint, Parameter. URL paths from historical scans are parsed into Endpoint and Parameter nodes (only when a matching BaseURL already exists from httpx). External domains encountered in scans are tracked as ExternalDomain nodes for situational awareness.

GAU deduplication: When URLScan enrichment runs successfully, the urlscan provider is automatically removed from GAU's data sources to avoid redundant API calls.

How it works

The module hits urlscan.io/api/v1/search/ with the query domain:<apex> OR page.domain:<apex> and paginates through results until either Max Results is reached or the API runs out (each page returns up to 100 records, with a 60-second per-request timeout). For each historical scan record returned, four data extraction passes run on the JSON envelope:

  1. Subdomain harvesting — every task.url and page.url is parsed; anything ending in .<apex> becomes a Subdomain candidate. Anything not ending in .<apex> becomes an ExternalDomain (third-party assets the target loaded — useful for supply-chain mapping).
  2. IP enrichment — page.ip and task.ip are extracted and merged with existing IP nodes; new IPs trigger downstream port-scan eligibility.
  3. Endpoint reconstruction — the URL path + query string is split into a relative endpoint and individual parameters. These are only attached when a matching BaseURL already exists in the graph (i.e. httpx already confirmed the host serves HTTP) — otherwise they're held aside to avoid orphan endpoints. Each parameter gets its own Parameter node deduplicated against future Arjun/Katana findings.
  4. Metadata pulls — ASN, country, server header, technology fingerprints from page.server and the Wappalyzer rollup are merged into existing nodes. Screenshot URLs (screenshot field) are stored on the BaseURL for later display in the Insights dashboard.
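
A minimal sketch of the search-and-paginate loop. Pagination uses the search_after cursor, which is how URLScan's search API pages; the response field names here are assumptions based on its documented shape:

```python
import requests


def urlscan_search(apex: str, api_key: str | None = None, max_results: int = 500) -> list[dict]:
    url = "https://urlscan.io/api/v1/search/"
    headers = {"API-Key": api_key} if api_key else {}          # key is optional (public results only)
    query = f"domain:{apex} OR page.domain:{apex}"
    results, search_after = [], None

    while len(results) < max_results:
        params = {"q": query, "size": 100}
        if search_after:
            params["search_after"] = search_after
        resp = requests.get(url, headers=headers, params=params, timeout=60)
        resp.raise_for_status()
        page = resp.json()
        batch = page.get("results", [])
        if not batch:
            break                                              # API ran out of records
        results.extend(batch)
        # The "sort" value of the last record is the cursor for the next page
        search_after = ",".join(str(v) for v in batch[-1]["sort"])
        if not page.get("has_more", False):
            break
    return results[:max_results]


for record in urlscan_search("example.com"):
    hostname = record.get("page", {}).get("domain", "")        # subdomain harvesting pass
    ip = record.get("page", {}).get("ip")                      # IP enrichment pass
```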

Key rotation is supported transparently: if multiple URLScan keys are set in Global Settings, requests round-robin across them automatically — useful for large target sets that would otherwise hit per-key quotas mid-scan.

Why URLScan is run before port scanning: the IPs and subdomains it surfaces feed into the port-scan target list, expanding coverage to assets that passive DNS/CT logs missed but that someone (the public, a researcher, an automated scan) once submitted to URLScan. This is especially valuable for catching CDN-fronted hosts and short-lived staging environments.


Shodan OSINT Enrichment

Passive internet-wide OSINT enrichment using the Shodan REST API. Runs in the recon pipeline after domain/IP discovery and before port scanning. Located in the Discovery & OSINT tab. Each feature is independently toggled and all require a Shodan API key set in Global Settings.

API Key Required: All toggles are disabled until a Shodan API key is configured in Global Settings. Host Lookup, Reverse DNS, and Passive CVEs automatically fall back to the free InternetDB API when the paid Shodan API returns 403. Domain DNS requires a paid Shodan plan (no free fallback).

Parameter Default Description
Host Lookup false Query each discovered IP for OS, ISP, organization, geolocation, and known vulnerabilities. Uses /shodan/host/{ip} (paid plan: full banners, geo, services) or falls back to InternetDB (free: ports, hostnames, CPEs, CVEs, tags — no geo or banners)
Reverse DNS false Discover hostnames for known IPs. Uses /dns/reverse (paid) or falls back to InternetDB hostnames (free). Can reveal subdomains missed by standard enumeration
Domain DNS false Subdomain enumeration and DNS records via /dns/domain/{domain}. Requires paid Shodan plan — no free fallback. Domain mode only (skipped in IP mode)
Passive CVEs false Extract known CVEs associated with discovered IPs. Reuses Host Lookup data if available; otherwise queries InternetDB directly (free, no key needed)
Workers 5 Parallel IP lookup workers for Shodan/InternetDB queries (1-20)

Graph nodes — consumes: IP, Subdomain, Domain | produces: IP, Port, Service, Subdomain, ExternalDomain, DNSRecord, Vulnerability, CVE. All use MERGE-based deduplication — data from Shodan is automatically merged with findings from Naabu, Nuclei, and other tools.

How it works

The module pulls every IP that prior subdomain discovery has resolved into the graph (_extract_ips_from_recon) and feeds the list to four independent enrichment passes. Each pass owns its own ThreadPoolExecutor (sized by Workers) plus a custom _RateLimiter that paces requests to avoid burning through the daily quota:

  1. Host Lookup — GET /shodan/host/{ip}. Paid plan returns the full host record: open ports with raw service banners, OS guess, ISP, organization, ASN, geolocation (country/city/lat-lon), domain associations, and the Shodan-CVE list per detected service. On HTTP 403 (free key trying paid endpoint) or 404 (no record), the request falls back to https://internetdb.shodan.io/{ip} — this returns a stripped-down record (ports, CPE strings, CVE list, hostnames, tags) with no rate limit and no key required. The result is unified into a common shape so downstream code doesn't care which source answered.
  2. Reverse DNS — GET /dns/reverse?ips=<list>. Paid plan only. The free InternetDB fallback uses the hostnames field from the host lookup instead. Hostnames returned here that match the apex pattern feed back into the Subdomain set — Shodan often surfaces internal-naming-convention hosts (db-prod-1.<apex>, staging-eu-west-2.<apex>) that no passive DNS database knows about.
  3. Domain DNS — GET /dns/domain/{domain}. Paid plan only. Returns Shodan's view of every DNS record for the apex plus every subdomain Shodan has ever indexed. Often the single highest-yield enrichment pass for paid plans — surfaces hundreds of subdomains in one request. Skipped silently in IP mode.
  4. Passive CVEs — extracts CVE IDs from the Host Lookup response when available; otherwise issues a separate InternetDB query per IP. CVEs are matched by Shodan against the service version banners — they're advisory until validated by Nuclei or Nmap NSE, but they sharply prioritize which hosts get aggressive vulnerability scanning later.
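
A sketch of the Host Lookup pass with the InternetDB fallback. Response field names are assumptions based on the two APIs' documented shapes; the module's unification code differs:

```python
import requests


def shodan_host_lookup(ip: str, api_key: str | None) -> dict:
    """Try the paid host endpoint first, fall back to InternetDB on 403/404 (or when no key is set)."""
    if api_key:
        resp = requests.get(f"https://api.shodan.io/shodan/host/{ip}",
                            params={"key": api_key}, timeout=30)
        if resp.status_code == 200:
            data = resp.json()
            return {"ports": data.get("ports", []),
                    "hostnames": data.get("hostnames", []),
                    "cves": data.get("vulns", []),
                    "source": "shodan"}
        if resp.status_code not in (403, 404):
            resp.raise_for_status()

    # Free fallback: stripped-down record, no key and no rate limit
    resp = requests.get(f"https://internetdb.shodan.io/{ip}", timeout=30)
    if resp.status_code == 404:                               # Shodan has never seen this IP
        return {"ports": [], "hostnames": [], "cves": [], "source": "internetdb"}
    resp.raise_for_status()
    data = resp.json()
    return {"ports": data.get("ports", []),
            "hostnames": data.get("hostnames", []),
            "cves": data.get("vulns", []),
            "source": "internetdb"}
```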

Key rotation: when multiple Shodan keys are set in Global Settings, every API call rotates through the pool — useful for large IP sets where a single key would hit the 1 query/sec or daily-credit limit. Failed requests on one key automatically retry on the next.

Why Shodan runs before active port scanning: the open ports it returns expand the port-scan target list. If Shodan already knows 198.51.100.42:8080 is open, Naabu/Masscan will probe that port even if it's not in the configured Top Ports range — so the active scan doesn't miss services on weird ports that Shodan has already cataloged.


Uncover Multi-Engine Search

ProjectDiscovery's uncover queries up to 13 search engines simultaneously to discover exposed hosts, IPs, and endpoints associated with the target. Runs before port scanning so discovered assets are processed by all downstream modules.

Parameter Default Description
Uncover Enabled false Enable/disable multi-engine target expansion
Uncover Max Results 500 Maximum results to collect across all engines (1-10,000)
Uncover Docker Image projectdiscovery/uncover:latest Docker image for the uncover container

Key configuration: Uncover automatically reuses API keys configured for standalone OSINT tools (Shodan, Censys, FOFA, ZoomEye, Netlas, CriminalIP). Additional engines require their own keys configured in Global Settings > API Keys under the "Uncover (Multi-Engine Search)" group: Quake, Hunter, PublicWWW, HunterHow, Google Custom Search (key + CX), Onyphe, Driftnet.

IP filtering: All discovered IPs pass through centralized filtering (ip_filter.py) that removes non-routable addresses (RFC 1918, CGNAT, loopback, reserved) and CDN IPs (detected by Naabu/httpx) before entering the pipeline. This prevents wasting API credits on downstream enrichment.

How it works

The module first walks the configured key set in _build_provider_config and assembles two outputs: a list of engines that have valid keys (skipping any silently when no key is set) and the corresponding env-var injection for the uncover container (SHODAN_API_KEY, CENSYS_API_TOKEN, FOFA_KEY, etc.). Engines without keys are simply omitted — the module never fails just because one source is unconfigured.

Next, _build_queries constructs search-engine-specific query strings for the apex domain. Each engine has its own DSL — Shodan uses hostname:/ssl.cert.subject.cn:, Censys uses parsed.names:, FOFA uses domain="...", ZoomEye uses hostname:, etc. — so uncover translates a single conceptual query ("everything related to <apex>") into the right syntax per source. Custom queries can be added through Global Settings.

The container is invoked via docker run --rm projectdiscovery/uncover:latest -e <engines> -q <queries> -limit <max> with all keys piped in as environment variables. uncover internally fans out across the configured engines in parallel and streams JSON results to stdout. The module captures stdout with a configurable timeout, then runs _deduplicate_results to fold identical host:port pairs across engines (one host can appear from Shodan, Censys, and FOFA simultaneously).
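
A rough sketch of the container invocation and host:port deduplication. The -json flag and the JSON field names are assumptions about uncover's line-delimited output; the module's real command builder differs:

```python
import json
import subprocess


def run_uncover(engines: list[str], queries: list[str], keys: dict[str, str],
                limit: int = 500, timeout: int = 600) -> list[dict]:
    """Invoke the uncover container and return deduplicated parsed results."""
    cmd = ["docker", "run", "--rm"]
    for var, value in keys.items():                    # SHODAN_API_KEY, CENSYS_API_TOKEN, FOFA_KEY, ...
        cmd += ["-e", f"{var}={value}"]
    cmd += ["projectdiscovery/uncover:latest",
            "-e", ",".join(engines),
            "-q", ",".join(queries),
            "-limit", str(limit),
            "-json"]                                   # assumed: JSON-lines output to stdout
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)

    results, seen = [], set()
    for line in proc.stdout.splitlines():              # one JSON object per line
        try:
            item = json.loads(line)
        except json.JSONDecodeError:
            continue                                   # skip banner / log lines
        key = (item.get("ip") or item.get("host"), item.get("port"))
        if key not in seen:                            # fold duplicates across engines
            seen.add(key)
            results.append(item)
    return results
```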

_extract_hosts_and_ips then splits each result into structured fields:

  • IPs are validated as IPv4/IPv6 via _is_valid_ip and pushed through the central ip_filter — anything in RFC 1918, CGNAT (100.64.0.0/10), loopback (127/8), link-local (169.254/16), or marked as CDN by Naabu/httpx in the existing graph is dropped before merging.
  • Hostnames are extracted from result URLs via _extract_hostname_from_url and matched against the apex; matches become Subdomain candidates, others become ExternalDomains.
  • Ports discovered alongside the hosts are queued for the same kind of MERGE that Naabu/Masscan does, expanding the port-scan starting set.

merge_uncover_into_pipeline writes everything into the combined recon result so downstream modules see uncover's findings as if they came from the standard discovery path. The merge is idempotent — re-running uncover doesn't duplicate.

Why uncover instead of querying each engine directly? Each engine has its own auth flow, response schema, pagination quirks, and rate limits. uncover normalizes all 13 into one CLI surface and one JSON output schema. The trade-off is that it runs as a single container so retries are at the container level — if your Shodan key 429s mid-run, the whole uncover invocation has to be retried, not just that engine. For finer-grained control over a specific engine, use the dedicated OSINT enrichment modules (Shodan, Censys, FOFA, OTX, Netlas, VirusTotal, ZoomEye, CriminalIP) which all have their own retry, rate-limit, and key-rotation logic.


Threat Intelligence Enrichment (7 OSINT Tools)

Seven passive threat intelligence enrichment tools that run concurrently with port scanning. All tools query external intelligence platforms using IPs and domains discovered during subdomain enumeration. Located in the Discovery & OSINT tab.

API Keys: All API keys are stored in Global Settings > API Keys (user-scoped, not per-project). Project settings contain only enable/disable toggles and optional limits. Enable a tool here, then add its key in Global Settings.

OTX Exception: OTX is enabled by default and works without an API key (anonymous requests, 1,000 req/hr).

Key Rotation: FOFA, OTX, Netlas, VirusTotal, ZoomEye, and CriminalIP support automatic round-robin key rotation — configure extra keys in Global Settings to avoid rate limiting mid-scan.

Graph nodes — consumes: IP, Domain, Subdomain | produces: threat intelligence properties stored on existing IP and Domain nodes (no new node types). Results also written to recon_domain.json under per-tool keys.

How it works (shared mechanics)

All seven enrichment tools follow an identical engineering pattern — only the upstream API and response parser change. Understanding the shared mechanics applies to every tool below:

Target extraction: each tool calls _extract_ips_from_recon(combined_result) to walk the in-progress recon JSON and pull every IP that subdomain discovery has produced so far. Some tools also enrich domains directly (VirusTotal, OTX) using the apex + every discovered Subdomain.

Worker pool + rate limiter: every tool spawns a ThreadPoolExecutor sized by its Workers setting and pairs it with a custom _RateLimiter class. The rate limiter uses a simple time.monotonic()-based interval gate — before each request, the worker calls rate_limiter.wait() which sleeps just long enough to enforce the configured req/sec ceiling. This pacing is per-tool, so one tool hitting its limit doesn't block the others.

Key rotation: when multiple keys are configured in Global Settings, each tool calls _effective_key(settings, key_rotator) (or _otx_effective_key, etc.) to pick the next key from a round-robin rotator. On HTTP 429 the worker either retries with the next key or stops querying and logs the limit-hit — preventing one bad key from poisoning the whole run.
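
A minimal sketch of the pacing-and-rotation pattern the two paragraphs above describe. Class names here are illustrative stand-ins for the internal _RateLimiter and key rotator:

```python
import itertools
import threading
import time


class RateLimiter:
    """Monotonic-clock interval gate: at most one call per (1 / rate_per_sec) seconds."""
    def __init__(self, rate_per_sec: float):
        self.interval = 1.0 / rate_per_sec
        self._lock = threading.Lock()
        self._next_allowed = time.monotonic()

    def wait(self) -> None:
        with self._lock:
            now = time.monotonic()
            if now < self._next_allowed:
                sleep_for = self._next_allowed - now
                self._next_allowed += self.interval
            else:
                sleep_for = 0.0
                self._next_allowed = now + self.interval
        if sleep_for:
            time.sleep(sleep_for)


class KeyRotator:
    """Round-robin over the keys configured in Global Settings."""
    def __init__(self, keys: list[str]):
        self._cycle = itertools.cycle(keys)
        self._lock = threading.Lock()

    def next_key(self) -> str:
        with self._lock:
            return next(self._cycle)


limiter = RateLimiter(rate_per_sec=1.0)   # e.g. Shodan's 1 query/sec ceiling
rotator = KeyRotator(["key-a", "key-b"])
limiter.wait()                            # each worker calls this before issuing a request
api_key = rotator.next_key()
```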

Failure isolation: each tool wraps its API calls in try/except requests.RequestException and degrades gracefully — a single 5xx or timeout from one tool never aborts the others. Tools that hit a hard rate limit (429) return early with whatever they collected before the limit.

Stop-on-rate-limit: most tools log "[!][TOOL] Rate limit hit — stopping for this run" and return their partial results rather than retry-loop. This is deliberate: rate limits usually mean the daily quota is exhausted, and burning a retry loop just delays the rest of the pipeline without producing new data.

Why all run after subdomain discovery but before vuln scanning: the IPs discovered here feed into the port-scan target list, the threat-intelligence flags (VPN/Tor/proxy/scanner) feed into the AI agent's host-prioritization heuristic, and the malware/pulse data feeds into the Insights dashboard's threat-intelligence view.

Censys

Parameter Default Description
Enabled false Enable Censys host intelligence enrichment. Requires both Censys API ID and API Secret in Global Settings
Workers 5 Parallel IP enrichment workers for Censys (1-20)

Queries /v2/hosts/{ip} for each discovered IP. Returns open ports, running services + banners, TLS certificate chains, geolocation, ASN, and OS fingerprint. On HTTP 429 (rate limit), stops querying and logs the limit.

FOFA

Parameter Default Description
Enabled false Enable FOFA internet asset search enrichment. Requires FOFA API Key in Global Settings
Max Results 1000 Maximum rows to fetch per query (hard cap: 10,000)
Workers 5 Parallel IP enrichment workers for FOFA (1-20)

Queries the FOFA API using base64-encoded syntax (domain="<domain>" or per-IP queries). Returns IP:port pairs, HTTP titles, server headers, geolocation, certificate info, and protocol details. Supports both legacy (email:key) and modern (key-only) authentication formats.

OTX (AlienVault Open Threat Exchange)

Parameter Default Description
Enabled true Enable OTX threat intelligence enrichment. Works without an API key (anonymous). Add OTX API Key in Global Settings for higher rate limits
Workers 5 Parallel IP enrichment workers for AlienVault OTX (1-20)

Queries the OTX Indicators API v1 for each IP and domain. Returns threat reputation, pulse count, associated malware families, MITRE ATT&CK attack IDs, passive DNS records (first/last seen), and individual pulse details (adversaries, TLP, tags). Anonymous rate limit: 1,000 req/hr. With API key: 10,000 req/hr.

OTX is the only enrichment tool enabled by default. It requires no API key to function, making it active in every scan out of the box.

Netlas

Parameter Default Description
Enabled false Enable Netlas internet intelligence enrichment. Requires Netlas API Key in Global Settings
Max Results 1000 Maximum items to fetch per query (hard cap: 1,000)
Workers 5 Parallel IP enrichment workers for Netlas (1-20)

Queries the Netlas Responses API (host:{domain} or host:{ip}). Returns port/service data, HTTP response headers and body snippets, geolocation (country, city, latitude/longitude, timezone), TLS certificate details, DNS records, and WHOIS data.

VirusTotal

Parameter Default Description
Enabled false Enable VirusTotal reputation enrichment. Requires VirusTotal API Key in Global Settings
Rate Limit 4 Requests per minute (free-tier limit). Increase for paid plans. On 429, the pipeline automatically waits 65 seconds and retries once
Max Targets 20 Maximum number of domains + IPs to query per scan (caps API usage for large target sets)
Workers 3 Parallel IP enrichment workers for VirusTotal (1-10, lower due to strict rate limits)

Queries VirusTotal API v3 for each discovered domain (/v3/domains/{domain}) and IP (/v3/ip_addresses/{ip}). Returns reputation score, last analysis stats (malicious/suspicious/undetected AV engine counts), categories, tags, JARM fingerprint, registrar, total votes, and last analysis date.

ZoomEye

Parameter Default Description
Enabled false Enable ZoomEye host search enrichment. Requires ZoomEye API Key in Global Settings
Max Results 1000 Maximum items to fetch per query
Workers 5 Parallel IP enrichment workers for ZoomEye (1-20)

Queries the ZoomEye API for hostname and IP searches. Returns open ports, service banners, device type/OS, web application fingerprints, geolocation (country, city, lat/lon, timezone), ASN, ISP, and SSL certificate details.

CriminalIP

Parameter Default Description
Enabled false Enable Criminal IP threat intelligence enrichment. Requires CriminalIP API Key in Global Settings
Workers 5 Parallel IP enrichment workers for CriminalIP (1-20)

Queries the Criminal IP API v1 for each IP (/v1/ip/data?full=true) and domain (/v1/domain/data). Returns IP risk score, threat tags (VPN, cloud, Tor, proxy, hosting, mobile, darkweb, scanner, Snort IDS), geolocation, ISP, hosted services, and abuse history. On HTTP 429, automatically waits 2 seconds and retries once.


GitHub Secret Hunting

How it works

GitHub Secret Hunting is an orchestrated dorking module that searches public GitHub repositories for leaked credentials, API keys, hostnames, and config files referencing the target. It runs as a separate dockerized scanner (github_secret_hunt/) similar to GVM — invoked from the Red Zone toolbar rather than as part of the main recon flow.

Authentication is mandatory: the module uses GitHub's Code Search API (https://api.github.com/search/code) which strictly requires an authenticated request. Without a Personal Access Token (PAT), API access is rate-limited to 10 req/hr and the module is disabled in the UI. With a PAT, the limit jumps to 30 req/min — enough to run a meaningful dorking session.

Dork strategy: the module assembles a curated set of search dorks combining the target apex with secret-shaped tokens. Examples:

  • "<domain>" password
  • "<domain>" api_key
  • "<domain>" AKIA (AWS access key prefix)
  • "<domain>" -----BEGIN (PEM-formatted keys)
  • "<domain>" Bearer
  • extension:env "<domain>" (.env files referencing the domain)
  • extension:json "<domain>" "password"
  • path:.aws/credentials "<domain>"

Each dork is run in turn and paginated through its results; for each hit, the file blob is downloaded via https://api.github.com/repos/<owner>/<repo>/contents/<path> and run through the same secret-detection regex bank used by JS Recon and TruffleHog. This separates regex-only false positives (random base64 strings that aren't actually secrets) from real exposures.
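
A minimal sketch of the dork loop against the Code Search API. The dork set is truncated, and error handling plus the blob-download step are omitted for brevity:

```python
import requests

DORKS = ['"{d}" password', '"{d}" api_key', '"{d}" AKIA', 'extension:env "{d}"']


def search_code(domain: str, token: str, per_page: int = 100) -> list[dict]:
    """Run each dork through GitHub's Code Search API and collect file hits."""
    headers = {"Authorization": f"token {token}",          # a PAT is mandatory for code search
               "Accept": "application/vnd.github+json"}
    hits = []
    for dork in DORKS:
        page = 1
        while True:
            resp = requests.get("https://api.github.com/search/code",
                                headers=headers,
                                params={"q": dork.format(d=domain),
                                        "per_page": per_page, "page": page},
                                timeout=30)
            if resp.status_code == 403:                     # rate limit hit: stop this dork
                break
            resp.raise_for_status()
            items = resp.json().get("items", [])
            if not items:
                break
            for item in items:
                hits.append({"repo": item["repository"]["full_name"],
                             "path": item["path"],
                             "html_url": item["html_url"]})
            page += 1
    return hits
```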

Scope handling: hits are returned with repo, path, and html_url — recorded as Secret nodes attached to a synthetic source identifier so the AI agent and the Insights dashboard can distinguish "secret found in a public repo" from "secret found on the target's site."

False-positive control: GitHub's search index includes a lot of noise — sample code, security-research repos, documentation files, user-credential dumps unrelated to the target. The module applies several filters: skip repos in known-noise orgs (security blog repos that quote secret patterns), skip files matching dump-shaped paths (leaks/, dumps/, pastebin/), skip files where the secret is also accompanied by [REDACTED] or EXAMPLE markers. These filters cut ~80% of the noise without losing real findings.

Pair with TruffleHog: GitHub Hunting is the public-internet sweep — it finds leaks where someone (employee, contractor, AI tool) committed a target-related secret to a public repo. TruffleHog is the targeted deep-dive — it clones a specific known org/repo list and walks every commit (including deleted/orphan branches) for secrets that aren't in the search index. Use both: GitHub Hunting for breadth, TruffleHog for depth on specific targets.

Configure GitHub repository scanning for leaked credentials.

Graph nodes — consumes: Domain | produces: GithubHunt, GithubRepository, GithubPath, GithubSecret, GithubSensitiveFile

Parameter Default Description
GitHub Access Token Personal Access Token (ghp_...)
Target Organization GitHub org or username to scan
Target Repositories (all) Comma-separated repo names to limit scope
Scan Member Repositories false Include individual member repos
Scan Gists false Search gists for secrets
Scan Commits false Examine git history for removed secrets
Max Commits to Scan 100 Max commits per repo (1-1000)
Output as JSON false Save results as downloadable JSON

See GitHub Secret Hunting for a step-by-step setup guide including how to create a GitHub Personal Access Token.


TruffleHog Secret Scanning

How it works

TruffleHog is the deep companion to GitHub Hunting. Where GitHub Hunting greps the GitHub Code Search API (which only sees what's currently in the default branch), TruffleHog clones the entire repository locally and walks every commit on every branch, including deleted blobs and orphan refs. The result: secrets that were committed and reverted, or rotated and force-pushed-over, are still recoverable because git's content-addressable model keeps the old blobs alive in the object database.

Why this matters: the canonical secret-leakage pattern is "developer commits AWS key → tests CI → realizes they leaked → reverts → force-pushes." The current branch tip looks clean, but the original commit's blob is still in the repo's object database, reachable via reflog or via cloning. GitHub Code Search misses this entirely. TruffleHog finds it.

Detection engine: 700+ regex-based detectors covering AWS, GCP, Azure, Slack, Stripe, Twilio, Okta, GitHub PATs, JWTs, SSH keys, npm tokens, Datadog, Heroku, Mailgun, SendGrid, PagerDuty, OpenAI API keys, Anthropic keys, Google service accounts, plus generic high-entropy patterns (Shannon entropy threshold + character-class checks). Each detector is a self-contained Go module with its own regex and post-match validation.

Live verification mode (the killer feature, toggleable): for each candidate secret, TruffleHog hits the corresponding provider's API with the secret as auth. AWS keys go to https://sts.amazonaws.com/?Action=GetCallerIdentity (a free no-side-effect call); GitHub PATs go to https://api.github.com/user; Slack tokens go to https://slack.com/api/auth.test. If the API call returns a 200 with valid identity info, the secret is verified (high confidence, active, critical severity). If the API returns 401/403/invalid, the secret is unverified (regex matched but the secret is rotated/expired/example — informational).
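
As an illustration of the verification concept only (TruffleHog's detectors are Go modules, not this code), here is how a candidate AWS key can be checked with the same no-side-effect STS call:

```python
import boto3
from botocore.exceptions import ClientError


def verify_aws_key(access_key_id: str, secret_access_key: str) -> bool:
    """GetCallerIdentity succeeds only for live credentials and changes nothing in the account."""
    sts = boto3.client("sts",
                       aws_access_key_id=access_key_id,
                       aws_secret_access_key=secret_access_key)
    try:
        identity = sts.get_caller_identity()
        print("VERIFIED:", identity["Arn"])    # live key: critical, actionable finding
        return True
    except ClientError:
        return False                            # rotated / expired / example: informational
```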

This eliminates the regex-only false-positive problem that plagues less-sophisticated secret scanners. A typical scan returns 100 regex matches but only 5-10 verified secrets — those 5-10 are the actionable findings. The unverified ones still get recorded for completeness but are deprioritized in the Insights dashboard.

Targeting strategy: TruffleHog operates on either a GitHub org/user (clones every public repo for that org) or a specific repository list. The org option is broad and slow (cloning 100 repos can take 30+ minutes over slow connections); the repo list option is fast and focused (use it when GitHub Hunting has already surfaced specific repos of interest, then deep-scan just those).

Authentication: TruffleHog uses the same GitHub PAT as GitHub Hunting (configured in Global Settings → API Keys). The PAT is needed for cloning private repos when scanning your own org's private repository list, and for higher API rate limits when cloning many public repos.

Output: every verified or unverified secret becomes a Secret node in the graph with the detector name, the verified status, and a redacted snippet of the file context. The AI agent reads verified secrets as immediate exploitation leads (a verified AWS key with iam:* permissions is game-over for the cloud account); unverified ones go to manual review.

Configure TruffleHog secret scanning with 700+ detectors and optional live API verification.

Graph nodes — consumes: Domain | produces: TrufflehogScan, TrufflehogRepository, TrufflehogFinding

Parameter Default Description
Target Organization GitHub org or username to scan
Target Repositories (all) Comma-separated repo names to limit scope
Only Verified false Only report findings verified as active against live APIs
No Verification false Skip all API verification — faster but unconfirmed
Concurrency 8 Concurrent scanning workers (1-20)
Include Detectors (all) Comma-separated detector names to include
Exclude Detectors (none) Comma-separated detector names to exclude

Note: TruffleHog uses the GitHub Access Token from Global Settings > API Keys (shared with GitHub Secret Hunt). See TruffleHog Secret Scanning for a step-by-step setup guide.
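
For reference, a sketch of how these parameters might map onto a trufflehog v3 command line. Flag names follow trufflehog's documented CLI; the wrapper itself is illustrative:

```python
import subprocess


def build_trufflehog_cmd(org: str, token: str, settings: dict) -> list[str]:
    cmd = ["trufflehog", "github", f"--org={org}", f"--token={token}", "--json"]
    if settings.get("only_verified"):
        cmd.append("--only-verified")                  # report only live-verified secrets
    if settings.get("no_verification"):
        cmd.append("--no-verification")                # skip provider API calls entirely
    cmd.append(f"--concurrency={settings.get('concurrency', 8)}")
    if settings.get("include_detectors"):
        cmd.append(f"--include-detectors={settings['include_detectors']}")
    if settings.get("exclude_detectors"):
        cmd.append(f"--exclude-detectors={settings['exclude_detectors']}")
    return cmd


cmd = build_trufflehog_cmd("target-org", "ghp_placeholder", {"only_verified": True})
subprocess.run(cmd, capture_output=True, text=True)    # one JSON finding per stdout line
```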


Agent Behavior

Configure the AI agent orchestrator for autonomous pentesting.

Agent Behaviour Settings

LLM & Phase Configuration:

Parameter Default Description
Guardrail Enabled true Enable/disable the LLM-based scope guardrail that verifies the target on agent startup. When disabled, the agent skips scope verification. Fail-closed: if the check itself fails, the agent is blocked
LLM Model claude-opus-4-6 AI model for the agent. 400+ models from 5 providers — see AI Model Providers
Deep Think true When enabled, the agent performs an explicit deep reasoning step at key decision points (start of session, phase transitions, failure loops) to plan multi-step attack strategies before acting. Adds ~1 extra LLM call at these moments. Recommended for complex targets with multiple services.
Post-Exploitation Type statefull statefull (Meterpreter sessions) or stateless (one-shot commands)
Activate Post-Exploitation Phase true Whether post-exploitation is available
Informational Phase System Prompt Custom instructions for the informational phase
Exploitation Phase System Prompt Custom instructions for the exploitation phase
Post-Exploitation Phase System Prompt Custom instructions for the post-exploitation phase

Payload Direction:

Parameter Default Description
Tunnel Provider None Dropdown: None (manual LHOST/LPORT), ngrok (single port — free, no VPS), or chisel (multi-port — requires VPS). Only one tunnel can be active at a time. ngrok tunnels port 4444 only, requires the ngrok authtoken configured in Global Settings → Tunneling, auto-detects LHOST/LPORT from the ngrok public URL, stageless payloads only. Requires identity verification on your ngrok account (free). chisel tunnels ports 4444 + 8080, requires Chisel Server URL (and optionally Chisel Auth) configured in Global Settings → Tunneling, enables web delivery and HTA delivery (which need two ports), stageless payloads required (staged payloads fail through the tunnel). Requires a VPS running chisel server -p 9090 --reverse. See AI Agent Guide — Tunnel Providers for setup instructions.
LHOST (Attacker IP) Your IP for reverse shell callbacks. Leave empty for bind mode. Hidden when a tunnel provider is enabled.
LPORT Listening port for reverse shells. Leave empty for bind mode. Hidden when a tunnel provider is enabled.
Bind Port on Target Port the target opens for bind shell payloads
Payload Use HTTPS false Use reverse_https instead of reverse_tcp

Agent Limits:

Parameter Default Description
Max Iterations 100 Maximum LLM reasoning-action loops per objective
Trace Memory Steps 100 Past steps kept in agent's working context
Tool Output Max Chars 20000 Truncation limit for tool output (min: 1000)

Approval Gates:

Parameter Default Description
Require Approval for Exploitation true User confirmation before exploitation phase
Require Approval for Post-Exploitation true User confirmation before post-exploitation phase

Kali Shell — Library Installation:

Parameter Default Description
Allow Library Installation false Let the agent install packages (pip/apt) via kali_shell at runtime. Prompt-based control only — no server-side enforcement. Installed packages are ephemeral (lost on container restart).
Authorized Packages Comma-separated whitelist. If non-empty, only these packages may be installed.
Forbidden Packages Comma-separated blacklist. These packages must never be installed.

Retries, Logging & Debug:

Parameter Default Description
Cypher Max Retries 3 Neo4j query retry attempts (0-10)
Log Max MB 10 Maximum log file size before rotation
Log Backups 5 Number of rotated log backups
Create Graph Image on Init false Generate a LangGraph visualization on startup

Cross-Site Scripting (XSS)

Configure the XSS attack skill (reflected, stored, DOM-based, blind, WAF/CSP bypass).

Parameter Default Description
dalfox WAF Evasion Enabled true Allow dalfox automated scanning + WAF bypass when manual context-aware payloads fail. Runs in background mode (--silence --waf-evasion --deep-domxss --mining-dom)
Blind Callback Enabled false Allow interactsh-client OOB callbacks for blind/stored XSS detection. Opt-in — when enabled, the agent may send document.cookie and other browser data to a third-party callback domain (oast.fun). Disabled by default
CSP Bypass Guidance true Include the CSP bypass reference table in the workflow prompt (covers unsafe-inline, unsafe-eval, JSONP gadgets, nonce reuse, AngularJS template injection, <base> hijack)

See Agent Skills > Cross-Site Scripting (XSS) for the full 8-step workflow documentation.


Hydra Credential Testing

Configure THC Hydra password cracking (50+ protocols: SSH, FTP, RDP, SMB, HTTP forms, databases, etc.).

Agent Skills Settings

Parameter Default Description
Hydra Enabled true Enable/disable Hydra brute force
Threads (-t) 16 Parallel connections per target. Protocol limits: SSH max 4, RDP max 1, VNC max 4
Wait Between Connections (-W) 0 Seconds between each connection. 0 = no delay
Connection Timeout (-w) 32 Max seconds to wait for a response
Stop On First Found (-f) true Stop when valid credentials are found
Extra Password Checks (-e) nsr Additional checks: n=null, s=username-as-password, r=reversed username
Verbose Output (-V) true Show each login attempt
Max Wordlist Attempts 3 Wordlist strategies to try before giving up (1-10)
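
The flags in the table are standard Hydra switches; a sketch of how the agent might assemble a run from these settings (wordlists and target are placeholders):

```python
import subprocess

PROTOCOL_THREAD_CAPS = {"ssh": 4, "rdp": 1, "vnc": 4}   # per the limits noted above


def build_hydra_cmd(target: str, service: str, userlist: str, wordlist: str,
                    settings: dict) -> list[str]:
    threads = min(settings.get("threads", 16), PROTOCOL_THREAD_CAPS.get(service, 64))
    cmd = ["hydra",
           "-L", userlist,                               # username wordlist
           "-P", wordlist,                               # password wordlist
           "-t", str(threads),
           "-W", str(settings.get("wait", 0)),           # delay between connections
           "-w", str(settings.get("timeout", 32)),       # connection timeout
           "-e", settings.get("extra_checks", "nsr")]    # null / same-as-user / reversed
    if settings.get("stop_on_first", True):
        cmd.append("-f")
    if settings.get("verbose", True):
        cmd.append("-V")
    cmd.append(f"{service}://{target}")
    return cmd


subprocess.run(build_hydra_cmd("198.51.100.7", "ssh", "users.txt", "rockyou.txt", {}),
               capture_output=True, text=True)
```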

Social Engineering Simulation

Configure SMTP settings for the phishing agent skill email delivery capability. The agent reads this configuration when the phishing_social_engineering agent skill is active and the user requests email delivery.

Parameter Default Description
SMTP Configuration (empty) Free-text SMTP settings for email delivery. The agent parses this naturally when sending phishing emails via Python smtplib

Example configuration:

SMTP_HOST: smtp.gmail.com
SMTP_PORT: 587
SMTP_USER: pentest@gmail.com
SMTP_PASS: abcd efgh ijkl mnop
SMTP_FROM: it-support@company.com
USE_TLS: true

If left empty, the agent asks the user at runtime for SMTP credentials when email delivery is requested. The agent never attempts to send email without proper SMTP configuration.
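
A minimal sketch of what that delivery looks like with smtplib, using the example values above (all addresses and credentials are placeholders):

```python
import smtplib
from email.message import EmailMessage

cfg = {"SMTP_HOST": "smtp.gmail.com", "SMTP_PORT": 587,
       "SMTP_USER": "pentest@gmail.com", "SMTP_PASS": "abcd efgh ijkl mnop",
       "SMTP_FROM": "it-support@company.com", "USE_TLS": True}

msg = EmailMessage()
msg["Subject"] = "Password expiry notice (authorized phishing simulation)"
msg["From"] = cfg["SMTP_FROM"]
msg["To"] = "employee@company.com"
msg.set_content("Your password expires today. Reset it via the link provided by the assessment team.")

with smtplib.SMTP(cfg["SMTP_HOST"], cfg["SMTP_PORT"]) as smtp:
    if cfg["USE_TLS"]:
        smtp.starttls()                     # upgrade the connection to TLS on port 587
    smtp.login(cfg["SMTP_USER"], cfg["SMTP_PASS"])
    smtp.send_message(msg)
```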

See Agent Skills > Social Engineering Simulation for the full phishing workflow documentation.


CypherFix Configuration

Configure CypherFix automated vulnerability remediation. These settings control how the CodeFix agent interacts with your GitHub repository.

CypherFix Settings

Parameter Default Description
GitHub Token (CypherFix) Personal Access Token with repo scope for cloning, pushing, and creating PRs
Default Repository Target repository in owner/repo format (e.g., redis/redis)
Default Branch main Base branch for creating fix branches
Branch Prefix cypherfix/ Prefix for auto-created fix branches (e.g., cypherfix/fix-sqli-42)
Require Approval true Pause before each code edit for human review. When disabled, pending edits are auto-accepted after a 5-minute wait
LLM Model Override (Agent default) Use a specific model for CodeFix instead of the model configured in Agent Behaviour

See CypherFix — Automated Remediation for the full usage guide.


Tool Phase Restrictions

A matrix controlling which tools the agent can use in each operational phase. Each tool can be independently enabled/disabled per phase. Tools that require an external API key (web_search, shodan, google_dork) display a warning with a quick-add modal when enabled without a key configured in Global Settings.

Tool Informational Exploitation Post-Exploitation
query_graph
web_search
shodan --
google_dork -- --
execute_curl
execute_httpx --
execute_naabu --
execute_subfinder --
execute_gau --
execute_nmap
execute_nuclei --
execute_wpscan --
execute_jsluice --
execute_amass --
execute_katana --
execute_arjun --
execute_ffuf --
kali_shell
execute_code --
execute_playwright
execute_hydra --
metasploit_console --
msf_restart --

This matrix is configurable per project in the dedicated Tool Matrix tab of the project settings form (under the AI Agent tab group).

User MCP Tool Plugins also surface here: any Model-Context-Protocol server you add as a tool plugin via Global Settings → MCP Tool Plugins appears in this same Tool Matrix as a separate "MCP Tool Plugins" group below the built-ins, with the same 3-phase checkboxes per tool. New tools default to all three phases enabled. See the MCP Tool Plugins wiki page for the full operator manual.
