SpiderFoot + Claude Code: OSINT Automation with AI

A Docker-based lab environment for automating OSINT reconnaissance with SpiderFoot and interpreting results with AI coding assistants like Claude Code.

What This Does

SpiderFoot automates open-source intelligence gathering across 200+ modules. This lab provides:

Docker Compose environment with SpiderFoot and realistic test targets
Shadow IT simulation with 15+ subdomains mimicking common attack surface patterns
Mock breach API for credential exposure testing without external dependencies
CLI tool (sf-cli) for programmatic scan control
AI integration pattern for piping results to Claude Code for interpretation

Instead of manually correlating findings across dozens of browser tabs, you run a scan and ask Claude to analyze the results:

# SCAN_ID holds the scan's ID, as reported by `uv run sf-cli list`
uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "Analyze these OSINT findings. Identify the top 3 risks and recommend actions."

Quick Start

Prerequisites

Docker and Docker Compose
Python 3.10+ with uv (recommended) or pip
Claude Code CLI (optional, for AI interpretation)

Setup

# Clone the repository
git clone https://github.com/drbothen/spiderfoot-claude-code.git
cd spiderfoot-claude-code

# Start the lab (first run builds SpiderFoot image, takes 2-3 min)
./lab.sh up

# Install the CLI tool
uv sync

Run Your First Scan

# Scan the local test target
uv run sf-cli scan --target web-target --profile footprint --name "test-scan" --wait

# Get scan results
uv run sf-cli list
uv run sf-cli results --scan-id <SCAN_ID> --format json

Interpret with Claude Code

# Pipe results to Claude for analysis
uv run sf-cli results --scan-id <SCAN_ID> --format json | \
  claude -p "Analyze these SpiderFoot OSINT results. Provide:
    1) Executive summary
    2) Top 3 risks
    3) Recommended actions"
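
The same pipeline can also be driven from a script. The following is a minimal sketch, assuming `uv`, `sf-cli`, and the `claude` CLI are all on your PATH. The helper names (`results_command`, `claude_command`, `analyze_scan`) are illustrative and are not part of this repository.

```python
import subprocess


def results_command(scan_id: str) -> list[str]:
    """Build the sf-cli command that exports one scan as JSON."""
    return ["uv", "run", "sf-cli", "results", "--scan-id", scan_id, "--format", "json"]


def claude_command(prompt: str) -> list[str]:
    """Build the Claude Code command that reads the findings from stdin."""
    return ["claude", "-p", prompt]


def analyze_scan(scan_id: str, prompt: str) -> str:
    """Export the scan results and pipe them to Claude; return Claude's reply."""
    results = subprocess.run(
        results_command(scan_id), capture_output=True, text=True, check=True
    )
    reply = subprocess.run(
        claude_command(prompt),
        input=results.stdout,  # equivalent to the shell pipe shown above
        capture_output=True,
        text=True,
        check=True,
    )
    return reply.stdout
```

Passing the commands as argument lists avoids shell quoting problems when the prompt spans several lines.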

Lab Services

Core Infrastructure

Service	URL	Description
SpiderFoot	http://localhost:5001	OSINT automation web UI
DNS Server	localhost:5353	dnsmasq for *.acme-corp.lab resolution
Breach API	http://localhost:5050	Mock HIBP-style breach database

Acme Corp Attack Surface (acme-corp.lab)

The lab simulates a realistic company attack surface with shadow IT patterns commonly found during real engagements:

Subdomain	Host Port	IP Address	What It Simulates
www.acme-corp.lab	8080	172.28.0.10	Production website
intranet.acme-corp.lab	8080	172.28.0.10	Internal portal (exposed)
dev.acme-corp.lab	8082	172.28.0.11	Dev server with debug enabled
test.acme-corp.lab	8082	172.28.0.11	Test environment (debug mode)
jenkins.acme-corp.lab	8082	172.28.0.11	CI/CD server (unauthenticated)
staging.acme-corp.lab	8083	172.28.0.12	Forgotten WordPress 4.9.8
admin.acme-corp.lab	8083	172.28.0.12	Admin panel (default creds)
api.acme-corp.lab	8084	172.28.0.13	Exposed Swagger documentation
grafana.acme-corp.lab	8084	172.28.0.13	Monitoring dashboard
old.acme-corp.lab	8085	172.28.0.14	Legacy server (PHP 5.4)
ftp.acme-corp.lab	8085	172.28.0.14	FTP server nobody remembers
files.acme-corp.lab	8086	172.28.0.15	File server with exposed .git
backup.acme-corp.lab	8086	172.28.0.15	Backup server (directory listing)
vpn.acme-corp.lab	8087	172.28.0.16	VPN portal with version disclosure
shop.acme-corp.lab	3000	172.28.0.30	E-commerce (Juice Shop)
dvwa.acme-corp.lab	8081	172.28.0.31	Training app (DVWA)
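
Several subdomains in the table share a container IP, which mirrors the virtual-hosting patterns a real scan tends to surface. The sketch below groups the table's rows by IP to make that overlap visible. The data is copied from the table, and the function name `group_by_ip` is illustrative.

```python
from collections import defaultdict

# (subdomain label, container IP) pairs copied from the table above
HOSTS = [
    ("www", "172.28.0.10"), ("intranet", "172.28.0.10"),
    ("dev", "172.28.0.11"), ("test", "172.28.0.11"), ("jenkins", "172.28.0.11"),
    ("staging", "172.28.0.12"), ("admin", "172.28.0.12"),
    ("api", "172.28.0.13"), ("grafana", "172.28.0.13"),
    ("old", "172.28.0.14"), ("ftp", "172.28.0.14"),
    ("files", "172.28.0.15"), ("backup", "172.28.0.15"),
    ("vpn", "172.28.0.16"), ("shop", "172.28.0.30"), ("dvwa", "172.28.0.31"),
]


def group_by_ip(hosts):
    """Map each IP address to the list of lab hostnames served from it."""
    groups = defaultdict(list)
    for label, ip in hosts:
        groups[ip].append(f"{label}.acme-corp.lab")
    return dict(groups)


if __name__ == "__main__":
    for ip, names in group_by_ip(HOSTS).items():
        print(ip, ", ".join(names))
```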

Vulnerable Web Apps

Service	URL	Description
Juice Shop	http://localhost:3000	OWASP vulnerable web app
DVWA	http://localhost:8081	Damn Vulnerable Web App

DNS Configuration

The lab includes a dnsmasq server for resolving *.acme-corp.lab subdomains. To use subdomain resolution:

Option 1: Query the Lab DNS Directly

# Resolve subdomains via lab DNS
dig @localhost -p 5353 dev.acme-corp.lab
dig @localhost -p 5353 api.acme-corp.lab
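
If `dig` is not available, the same lookup can be approximated with the Python standard library. This is a rough sketch, assuming the lab DNS answers plain UDP queries on 127.0.0.1:5353 as the `dig` commands above imply. It handles only A records, and all function names are illustrative.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a standard recursive A-record query for `name`."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet: bytes, pos: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[pos]
        if length == 0:
            return pos + 1
        if (length & 0xC0) == 0xC0:  # a compression pointer occupies two bytes
            return pos + 2
        pos += 1 + length


def parse_a_records(packet: bytes) -> list[str]:
    """Extract the IPv4 addresses from the answer section of a response."""
    _, _, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", packet[:12])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(packet, pos) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        pos = _skip_name(packet, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", packet[pos:pos + 10])
        pos += 10
        if rtype == 1 and rdlength == 4:  # A record with a 4-byte address
            addresses.append(socket.inet_ntoa(packet[pos:pos + 4]))
        pos += rdlength
    return addresses


def resolve(name: str, server: str = "127.0.0.1", port: int = 5353) -> list[str]:
    """Send one UDP query to the lab DNS server and return the A records."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(2.0)
        sock.sendto(build_query(name), (server, port))
        response, _ = sock.recvfrom(512)
    return parse_a_records(response)
```

With the lab running, `resolve("dev.acme-corp.lab")` should return the same address that `dig` reports.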

Option 2: Configure Host DNS (macOS/Linux)

Add the lab DNS as a resolver for the .lab TLD:

# macOS
sudo mkdir -p /etc/resolver
printf 'nameserver 127.0.0.1\nport 5353\n' | sudo tee /etc/resolver/lab

# Linux (systemd-resolved)
# Create /etc/systemd/resolved.conf.d/lab.conf with the contents below,
# then apply it with: sudo systemctl restart systemd-resolved
[Resolve]
DNS=127.0.0.1:5353
Domains=~lab

Option 3: Configure SpiderFoot to Use Lab DNS

SpiderFoot is already configured to use the lab's DNS server (172.28.0.2) for internal resolution.

Mock Breach API

The lab includes a local breach database API that simulates Have I Been Pwned functionality. This allows credential exposure demos without external API keys.

Endpoints

# Check a specific email
curl http://localhost:5050/breaches/email/jsmith@acme-corp.lab

# Check all emails for a domain
curl http://localhost:5050/breaches/domain/acme-corp.lab

# Health check
curl http://localhost:5050/health
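
The endpoints can also be called from Python. The sketch below builds the same URLs as the `curl` commands above. It assumes the API returns JSON, which this README does not spell out, and the helper names `breach_url` and `fetch_breaches` are illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:5050"


def breach_url(kind: str, value: str, base_url: str = BASE_URL) -> str:
    """Build a lookup URL; `kind` is 'email' or 'domain'."""
    if kind not in ("email", "domain"):
        raise ValueError(f"unsupported lookup kind: {kind!r}")
    # Keep '@' literal so the URL matches the curl examples
    return f"{base_url}/breaches/{kind}/{quote(value, safe='@')}"


def fetch_breaches(kind: str, value: str, timeout: float = 5.0):
    """Query the mock breach API and decode the body (assumed to be JSON)."""
    with urlopen(breach_url(kind, value), timeout=timeout) as response:
        return json.load(response)
```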

Exposed Emails in the Mock Database

Email	Breaches
jsmith@acme-corp.lab	AcmeDataLeak2023, LegacySystemLeak
it.admin@acme-corp.lab	LegacySystemLeak, PhishingCampaign2024
sarah.ops@acme-corp.lab	AcmeDataLeak2023
bob.developer@acme-corp.lab	GitHubTokenLeak2024, AcmeDataLeak2023
hr@acme-corp.lab	PhishingCampaign2024
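
Inverting this table shows which breach touches the most accounts, which is often the first question in a credential-exposure review. The sketch below uses the rows above as plain data. It does not call the API, and `accounts_per_breach` is an illustrative name.

```python
from collections import defaultdict

# Email -> breaches, copied from the table above
EXPOSURES = {
    "jsmith@acme-corp.lab": ["AcmeDataLeak2023", "LegacySystemLeak"],
    "it.admin@acme-corp.lab": ["LegacySystemLeak", "PhishingCampaign2024"],
    "sarah.ops@acme-corp.lab": ["AcmeDataLeak2023"],
    "bob.developer@acme-corp.lab": ["GitHubTokenLeak2024", "AcmeDataLeak2023"],
    "hr@acme-corp.lab": ["PhishingCampaign2024"],
}


def accounts_per_breach(exposures):
    """Map each breach name to the list of affected email addresses."""
    index = defaultdict(list)
    for email, breaches in exposures.items():
        for breach in breaches:
            index[breach].append(email)
    return dict(index)


if __name__ == "__main__":
    ranked = sorted(accounts_per_breach(EXPOSURES).items(), key=lambda kv: -len(kv[1]))
    for breach, emails in ranked:
        print(f"{breach}: {len(emails)} account(s)")
```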

Using with SpiderFoot

The lab includes custom SpiderFoot modules that integrate with the breach API automatically:

Module	Purpose
`sfp_breach_api`	Checks discovered emails against the local breach database
`sfp_email_lab`	Email extractor that accepts `.lab` TLD (standard module rejects non-internet TLDs)

These modules are included in the footprint and investigate scan profiles. When you scan acme-corp.lab, SpiderFoot will:

1. Crawl web pages and extract email addresses (sfp_email_lab)
2. Check each email against the breach database (sfp_breach_api)
3. Produce EMAILADDR_COMPROMISED events for matches

# Scan and check for breached credentials
uv run sf-cli scan --target acme-corp.lab --profile footprint --wait

# View compromised emails
uv run sf-cli results --scan-id $SCAN_ID --type EMAILADDR_COMPROMISED
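
For large scans it can help to summarize the JSON export before handing it to Claude. The sketch below assumes the export is a JSON array of objects with `type` and `data` keys. That schema is a guess for illustration, so check the real output of `sf-cli results --format json` and adjust the key names. The function names are also illustrative.

```python
import json
from collections import Counter


def load_events(raw: str) -> list[dict]:
    """Parse the exported JSON; the array-of-objects shape is an assumption."""
    events = json.loads(raw)
    if not isinstance(events, list):
        raise ValueError("expected a JSON array of events")
    return events


def count_by_type(events: list[dict]) -> Counter:
    """Count events per type (the 'type' key is hypothetical)."""
    return Counter(event.get("type", "UNKNOWN") for event in events)


def compromised_emails(events: list[dict]) -> list[str]:
    """Return the sorted, de-duplicated data of EMAILADDR_COMPROMISED events."""
    return sorted(
        {
            event["data"]
            for event in events
            if event.get("type") == "EMAILADDR_COMPROMISED" and "data" in event
        }
    )
```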

CLI Reference

Scan Commands

# Start a scan with different profiles
uv run sf-cli scan --target example.com --profile footprint
uv run sf-cli scan --target example.com --profile passive   # Stealth mode
uv run sf-cli scan --target example.com --profile investigate  # Include threat intel

# Wait for completion
uv run sf-cli scan --target example.com --profile footprint --wait

Results Commands

# List all scans
uv run sf-cli list

# Get scan status
uv run sf-cli status --scan-id <ID>

# Get detailed status (active modules, discovered IPs/domains)
uv run sf-cli status --scan-id <ID> --detailed

# Get results (JSON for AI processing)
uv run sf-cli results --scan-id <ID> --format json

# Get summary
uv run sf-cli summary --scan-id <ID>

Management Commands

# Stop a running scan
uv run sf-cli stop --scan-id <ID>

# Delete a scan
uv run sf-cli delete --scan-id <ID>

# List available modules
uv run sf-cli modules

Scan Profiles

Profile	Use Case	Modules
`passive`	Stealth reconnaissance	DNS, WHOIS, archive.org (no direct contact)
`footprint`	Attack surface mapping	Above + port scan, web analysis, email discovery
`investigate`	Threat intelligence	Above + SSL certs, Shodan, threat feeds
`all`	Maximum coverage	All 200+ modules (slow)
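
Reading "Above +" as cumulative, each profile covers everything in the profiles listed before it. The sketch below expands a profile name into its full coverage. The entries are the categories from the table, not SpiderFoot module names, and `all` is left out because the table describes it only as every module.

```python
# Profiles in the order the table lists them; each adds to the one before
PROFILE_ORDER = ["passive", "footprint", "investigate"]

# Categories each profile adds, copied from the table above
PROFILE_ADDS = {
    "passive": ["DNS", "WHOIS", "archive.org"],
    "footprint": ["port scan", "web analysis", "email discovery"],
    "investigate": ["SSL certs", "Shodan", "threat feeds"],
}


def coverage(profile: str) -> list[str]:
    """Return every category a profile covers, including inherited ones."""
    if profile not in PROFILE_ORDER:
        raise ValueError(f"unknown or non-cumulative profile: {profile!r}")
    last = PROFILE_ORDER.index(profile)
    covered = []
    for name in PROFILE_ORDER[: last + 1]:
        covered.extend(PROFILE_ADDS[name])
    return covered


if __name__ == "__main__":
    for name in PROFILE_ORDER:
        print(f"{name}: {', '.join(coverage(name))}")
```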

Lab Management

./lab.sh up       # Start all services
./lab.sh down     # Stop (preserves data)
./lab.sh reset    # Full reset (deletes scan data)
./lab.sh status   # Show container status
./lab.sh logs     # Follow logs
./lab.sh shell    # Shell into SpiderFoot container
./lab.sh urls     # Show service URLs

Example Workflows

Attack Surface Discovery

uv run sf-cli scan --target yourcompany.com --profile footprint --wait

uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "Identify all discovered subdomains, IPs, and web apps.
    Flag any that appear to be:
    - Development or staging environments
    - Unmaintained or outdated
    - Running vulnerable software"

IOC Enrichment (Incident Response)

uv run sf-cli scan --target 203.0.113.42 --profile investigate --wait

uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "This IP appeared in our security logs. Tell me:
    - Is it associated with known threat actors?
    - What infrastructure is connected?
    - Should we block it?"

Third-Party Risk Assessment

uv run sf-cli scan --target vendor.com --profile passive --wait

uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "Assess this vendor's external security posture.
    Evaluate SSL config, credential exposures, tech stack.
    Produce a risk score (Low/Medium/High) with justification."

Shadow IT Discovery (Lab)

Use the lab to practice finding forgotten infrastructure:

# Scan the lab domain
uv run sf-cli scan --target acme-corp.lab --profile footprint --wait

# Analyze discovered subdomains
uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "Analyze this attack surface scan. Identify:
    1) Shadow IT patterns (dev, staging, test servers)
    2) Exposed internal tools (Jenkins, GitLab, Grafana)
    3) Legacy/forgotten systems
    4) Information disclosure risks
    Prioritize findings by exploitability."

Credential Exposure Check (Lab)

Test credential exposure workflows with the mock breach API:

# Check breach database for the domain
curl http://localhost:5050/breaches/domain/acme-corp.lab | \
  claude -p "Analyze these breach exposures. For each affected user:
    1) Assess risk based on breach types
    2) Recommend immediate actions
    3) Identify patterns (are admins or devs more exposed?)"

External Legal Targets

These external services explicitly permit security scanning for educational purposes:

Target	URL	What You Can Practice
ScanMe Nmap	scanme.nmap.org	Port scanning, service detection
TestPHP Vulnweb	testphp.vulnweb.com	Web app vuln scanning (Acunetix)
TestHTML5 Vulnweb	testhtml5.vulnweb.com	Modern web app scanning
TestASPNET Vulnweb	testaspnet.vulnweb.com	ASP.NET vulnerability scanning
HackTheBox	*.hackthebox.com	CTF-style scanning of assigned lab machines only (requires account)
TryHackMe	*.tryhackme.com	CTF-style scanning of assigned lab machines only (requires account)

Usage Notes

Always verify current terms: Check each site's scanning policy before use
Rate limiting: Be respectful of shared resources and don't flood targets with requests
Passive preferred: Start with passive scans before active enumeration
Educational only: These are for learning, not offensive operations

Example: External Passive Scan

# Passive scan of a legal target (no direct probing)
uv run sf-cli scan --target testphp.vulnweb.com --profile passive --wait

# Analyze external reconnaissance
uv run sf-cli results --scan-id $SCAN_ID --format json | \
  claude -p "Analyze this passive reconnaissance of a legal test target.
    Summarize what can be learned without touching the target directly."

API Keys for Enhanced Scanning

Many SpiderFoot modules require API keys from third-party services. You can configure these in two ways:

Option 1: Environment File (Recommended)

Pre-configure API keys before starting the lab:

# Copy the example file
cp .env.example .env

# Edit .env and add your API keys
# Only add keys for services you have accounts with

The .env file is gitignored, so your keys stay private. API keys are automatically imported on container startup - no manual configuration required.

Example .env entries:

# API Keys
SFP_SHODAN_API_KEY=your_shodan_key_here
SFP_VIRUSTOTAL_API_KEY=your_virustotal_key_here
SFP_HUNTER_API_KEY=your_hunter_key_here

# Module Options (SFOPT_<MODULE>_<OPTION>=value)
SFOPT__STOR_DB_MAXSTORAGE=0    # Store full web content (required for email extraction)
SFOPT_SPIDER_MAXPAGES=100      # Limit pages crawled per domain

Module Options

Beyond API keys, you can configure SpiderFoot module options via environment variables:

SFOPT_<MODULE>_<OPTION>=value

Format rules:

Module names are uppercased without the sfp_ prefix
Modules with double underscores (like sfp__stor_db) use a leading underscore

Examples:

Environment Variable	SpiderFoot Setting
`SFOPT_SPIDER_MAXPAGES=100`	`sfp_spider:maxpages=100`
`SFOPT__STOR_DB_MAXSTORAGE=0`	`sfp__stor_db:maxstorage=0`
`SFOPT_PORTSCAN_TCP_PORTS=22,80,443`	`sfp_portscan_tcp:ports=22,80,443`

Important: The SFOPT__STOR_DB_MAXSTORAGE=0 setting is required for email extraction to work. The default (1024 bytes) truncates web content before emails can be found.
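
The naming rules can be expressed as a small conversion function. This sketch is not the repository's import code. It assumes the option name is the last underscore-separated segment, which holds for the three examples in the table but would break for an option whose own name contains an underscore. `sfopt_to_setting` is an illustrative name.

```python
def sfopt_to_setting(assignment: str) -> str:
    """Convert 'SFOPT_<MODULE>_<OPTION>=value' to 'sfp_<module>:<option>=value'."""
    name, sep, value = assignment.partition("=")
    prefix = "SFOPT_"
    if not sep or not name.startswith(prefix):
        raise ValueError(f"not an SFOPT assignment: {assignment!r}")
    # Assumption: the option is everything after the final underscore.
    module, _, option = name[len(prefix):].rpartition("_")
    if not module or not option:
        raise ValueError(f"cannot split module and option in {name!r}")
    # A leading underscore in the module part yields the double underscore
    # in names such as sfp__stor_db.
    return f"sfp_{module.lower()}:{option.lower()}={value}"


if __name__ == "__main__":
    for line in (
        "SFOPT_SPIDER_MAXPAGES=100",
        "SFOPT__STOR_DB_MAXSTORAGE=0",
        "SFOPT_PORTSCAN_TCP_PORTS=22,80,443",
    ):
        print(line, "->", sfopt_to_setting(line))
```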

Manual import (if needed):

# Re-import API keys without restarting
uv run sf-cli import-keys --env-file .env

Option 2: Web UI

Configure API keys through SpiderFoot's web interface:

1. Open http://localhost:5001
2. Navigate to Settings
3. Find modules with a padlock icon (requires API key)
4. Enter your API key and click Save

Key Services for Enhanced Scanning

Service	Module	Use Case	Free Tier
Shodan	sfp_shodan	Device discovery, exposed services	Yes
VirusTotal	sfp_virustotal	Malware/threat analysis	Yes
Have I Been Pwned	sfp_haveibeenpwned	Credential breach checking	Paid
Hunter.io	sfp_hunter	Corporate email discovery	Yes
AlienVault OTX	sfp_alienvault	Threat intelligence	Yes
SecurityTrails	sfp_securitytrails	Passive DNS, historical data	Yes
Censys	sfp_censys	Certificate/host search	Yes
BuiltWith	sfp_builtwith	Technology profiling	Yes

See .env.example for the complete list of supported API keys.

Legal Notice

Always obtain authorization before scanning external targets.

This lab includes local test targets for learning SpiderFoot mechanics. For external reconnaissance:

Only scan domains you own or have written permission to assess
Respect rate limits and terms of service
Be aware of privacy regulations (GDPR, CCPA)
Passive-only scans draw on public information, but they should still be covered by an authorized assessment

License

MIT License - See LICENSE for details.
