A Docker-based lab environment for automating OSINT reconnaissance with SpiderFoot and interpreting results with AI coding assistants like Claude Code.
SpiderFoot automates open-source intelligence gathering across 200+ modules. This lab provides:
- Docker Compose environment with SpiderFoot and realistic test targets
- Shadow IT simulation with 15+ subdomains mimicking common attack surface patterns
- Mock breach API for credential exposure testing without external dependencies
- CLI tool (`sf-cli`) for programmatic scan control
- AI integration pattern for piping results to Claude Code for interpretation
Instead of manually correlating findings across dozens of browser tabs, you run a scan and ask Claude to analyze the results:
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "Analyze these OSINT findings. Identify the top 3 risks and recommend actions."

- Docker and Docker Compose
- Python 3.10+ with uv (recommended) or pip
- Claude Code CLI (optional, for AI interpretation)
# Clone the repository
git clone https://github.com/drbothen/spiderfoot-claude-code.git
cd spiderfoot-claude-code
# Start the lab (first run builds SpiderFoot image, takes 2-3 min)
./lab.sh up
# Install the CLI tool
uv sync

# Scan the local test target
uv run sf-cli scan --target web-target --profile footprint --name "test-scan" --wait
# Get scan results
uv run sf-cli list
uv run sf-cli results --scan-id <SCAN_ID> --format json

# Pipe results to Claude for analysis
uv run sf-cli results --scan-id <SCAN_ID> --format json | \
claude -p "Analyze these SpiderFoot OSINT results. Provide:
1) Executive summary
2) Top 3 risks
3) Recommended actions"

| Service | URL | Description |
|---|---|---|
| SpiderFoot | http://localhost:5001 | OSINT automation web UI |
| DNS Server | localhost:5353 | dnsmasq for *.acme-corp.lab resolution |
| Breach API | http://localhost:5050 | Mock HIBP-style breach database |
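The quick-start pipeline above (`sf-cli results` piped into `claude -p`) can also be driven from a script. The following is a minimal Python sketch, assuming `uv`, `sf-cli`, and `claude` are all on your `PATH`; the function names `results_command` and `analyze_scan` are illustrative and not part of the lab.

```python
import subprocess


def results_command(scan_id: str) -> list:
    """Build the `sf-cli results` invocation shown in the quick start."""
    return ["uv", "run", "sf-cli", "results", "--scan-id", scan_id, "--format", "json"]


def analyze_scan(scan_id: str, prompt: str) -> str:
    """Export a scan as JSON and pass it to `claude -p` on stdin.

    Raises subprocess.CalledProcessError if either command exits non-zero.
    """
    results = subprocess.run(
        results_command(scan_id), capture_output=True, text=True, check=True
    )
    answer = subprocess.run(
        ["claude", "-p", prompt],
        input=results.stdout,  # equivalent to the shell pipe
        capture_output=True,
        text=True,
        check=True,
    )
    return answer.stdout
```

Calling `analyze_scan("<SCAN_ID>", "Analyze these OSINT findings.")` requires a running lab and a finished scan; `results_command` on its own has no side effects.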
The lab simulates a realistic company attack surface with shadow IT patterns commonly found during real engagements:
| Subdomain | Host Port | IP Address | What It Simulates |
|---|---|---|---|
| www.acme-corp.lab | 8080 | 172.28.0.10 | Production website |
| intranet.acme-corp.lab | 8080 | 172.28.0.10 | Internal portal (exposed) |
| dev.acme-corp.lab | 8082 | 172.28.0.11 | Dev server with debug enabled |
| test.acme-corp.lab | 8082 | 172.28.0.11 | Test environment (debug mode) |
| jenkins.acme-corp.lab | 8082 | 172.28.0.11 | CI/CD server (unauthenticated) |
| staging.acme-corp.lab | 8083 | 172.28.0.12 | Forgotten WordPress 4.9.8 |
| admin.acme-corp.lab | 8083 | 172.28.0.12 | Admin panel (default creds) |
| api.acme-corp.lab | 8084 | 172.28.0.13 | Exposed Swagger documentation |
| grafana.acme-corp.lab | 8084 | 172.28.0.13 | Monitoring dashboard |
| old.acme-corp.lab | 8085 | 172.28.0.14 | Legacy server (PHP 5.4) |
| ftp.acme-corp.lab | 8085 | 172.28.0.14 | FTP server nobody remembers |
| files.acme-corp.lab | 8086 | 172.28.0.15 | File server with exposed .git |
| backup.acme-corp.lab | 8086 | 172.28.0.15 | Backup server (directory listing) |
| vpn.acme-corp.lab | 8087 | 172.28.0.16 | VPN portal with version disclosure |
| shop.acme-corp.lab | 3000 | 172.28.0.30 | E-commerce (Juice Shop) |
| dvwa.acme-corp.lab | 8081 | 172.28.0.31 | Training app (DVWA) |
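Several hostnames in the table share one container IP, which is a pattern worth spotting in real scan output too. As a rough sketch, the snippet below groups a handful of rows copied from the table by IP address; the helper names are illustrative and the lab does not ship this code.

```python
from collections import defaultdict

# (subdomain, container IP) pairs copied from a few rows of the table above.
ROWS = [
    ("www.acme-corp.lab", "172.28.0.10"),
    ("intranet.acme-corp.lab", "172.28.0.10"),
    ("dev.acme-corp.lab", "172.28.0.11"),
    ("test.acme-corp.lab", "172.28.0.11"),
    ("jenkins.acme-corp.lab", "172.28.0.11"),
    ("shop.acme-corp.lab", "172.28.0.30"),
]


def hosts_by_ip(rows):
    """Map each IP address to the hostnames that resolve to it."""
    grouped = defaultdict(list)
    for host, ip in rows:
        grouped[ip].append(host)
    return dict(grouped)


def shared_ips(rows):
    """Keep only the IPs that serve more than one hostname."""
    return {ip: hosts for ip, hosts in hosts_by_ip(rows).items() if len(hosts) > 1}


if __name__ == "__main__":
    for ip, hosts in shared_ips(ROWS).items():
        print(ip, ", ".join(hosts))
```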

The two intentionally vulnerable applications are also reachable directly on the host:

| Service | URL | Description |
|---|---|---|
| Juice Shop | http://localhost:3000 | OWASP vulnerable web app |
| DVWA | http://localhost:8081 | Damn Vulnerable Web App |
The lab includes a dnsmasq server for resolving *.acme-corp.lab subdomains. To use subdomain resolution:
# Resolve subdomains via lab DNS
dig @localhost -p 5353 dev.acme-corp.lab
dig @localhost -p 5353 api.acme-corp.lab

Add the lab DNS as a resolver for the .lab TLD:
# macOS
sudo mkdir -p /etc/resolver
printf "nameserver 127.0.0.1\nport 5353\n" | sudo tee /etc/resolver/lab
# Linux (systemd-resolved)
# Add to /etc/systemd/resolved.conf.d/lab.conf
[Resolve]
DNS=127.0.0.1:5353
Domains=~lab

SpiderFoot is already configured to use the lab's DNS server (172.28.0.2) for internal resolution.
The lab includes a local breach database API that simulates Have I Been Pwned functionality. This allows credential exposure demos without external API keys.
# Check a specific email
curl http://localhost:5050/breaches/email/jsmith@acme-corp.lab
# Check all emails for a domain
curl http://localhost:5050/breaches/domain/acme-corp.lab
# Health check
curl http://localhost:5050/health

| Email | Breaches |
|---|---|
| jsmith@acme-corp.lab | AcmeDataLeak2023, LegacySystemLeak |
| it.admin@acme-corp.lab | LegacySystemLeak, PhishingCampaign2024 |
| sarah.ops@acme-corp.lab | AcmeDataLeak2023 |
| bob.developer@acme-corp.lab | GitHubTokenLeak2024, AcmeDataLeak2023 |
| hr@acme-corp.lab | PhishingCampaign2024 |
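To get a feel for the seeded data before running a scan, the table can be inverted to show which breach touches the most accounts. The sketch below hard-codes the rows from the table rather than calling the API, because the API's response schema is not shown here; the function names are illustrative.

```python
from collections import Counter

# Email -> breaches, copied from the table above.
EXPOSURES = {
    "jsmith@acme-corp.lab": ["AcmeDataLeak2023", "LegacySystemLeak"],
    "it.admin@acme-corp.lab": ["LegacySystemLeak", "PhishingCampaign2024"],
    "sarah.ops@acme-corp.lab": ["AcmeDataLeak2023"],
    "bob.developer@acme-corp.lab": ["GitHubTokenLeak2024", "AcmeDataLeak2023"],
    "hr@acme-corp.lab": ["PhishingCampaign2024"],
}


def breach_counts(exposures):
    """Count how many accounts appear in each breach."""
    return Counter(b for breaches in exposures.values() for b in breaches)


def most_exposed(exposures):
    """Return the accounts that appear in the largest number of breaches."""
    top = max(len(breaches) for breaches in exposures.values())
    return sorted(email for email, b in exposures.items() if len(b) == top)


if __name__ == "__main__":
    print(breach_counts(EXPOSURES).most_common())
    print(most_exposed(EXPOSURES))
```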
The lab includes custom SpiderFoot modules that integrate with the breach API automatically:
| Module | Purpose |
|---|---|
| `sfp_breach_api` | Checks discovered emails against the local breach database |
| `sfp_email_lab` | Email extractor that accepts .lab TLD (standard module rejects non-internet TLDs) |
These modules are included in the footprint and investigate scan profiles. When you scan acme-corp.lab, SpiderFoot will:
- Crawl web pages and extract email addresses (`sfp_email_lab`)
- Check each email against the breach database (`sfp_breach_api`)
- Produce `EMAILADDR_COMPROMISED` events for matches

# Scan and check for breached credentials
uv run sf-cli scan --target acme-corp.lab --profile footprint --wait
# View compromised emails
uv run sf-cli results --scan-id $SCAN_ID --type EMAILADDR_COMPROMISED

# Start a scan with different profiles
uv run sf-cli scan --target example.com --profile footprint
uv run sf-cli scan --target example.com --profile passive # Stealth mode
uv run sf-cli scan --target example.com --profile investigate # Include threat intel
# Wait for completion
uv run sf-cli scan --target example.com --profile footprint --wait

# List all scans
uv run sf-cli list
# Get scan status
uv run sf-cli status --scan-id <ID>
# Get detailed status (active modules, discovered IPs/domains)
uv run sf-cli status --scan-id <ID> --detailed
# Get results (JSON for AI processing)
uv run sf-cli results --scan-id <ID> --format json
# Get summary
uv run sf-cli summary --scan-id <ID>

# Stop a running scan
uv run sf-cli stop --scan-id <ID>
# Delete a scan
uv run sf-cli delete --scan-id <ID>
# List available modules
uv run sf-cli modules

| Profile | Use Case | Modules |
|---|---|---|
| `passive` | Stealth reconnaissance | DNS, WHOIS, archive.org (no direct contact) |
| `footprint` | Attack surface mapping | Above + port scan, web analysis, email discovery |
| `investigate` | Threat intelligence | Above + SSL certs, Shodan, threat feeds |
| `all` | Maximum coverage | All 200+ modules (slow) |

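When scripting scans, it can help to reject an unknown profile before invoking the CLI. The following is a small sketch that builds the `sf-cli scan` argument list using only the flags shown in this README (`--target`, `--profile`, `--name`, `--wait`); `scan_command` is a hypothetical helper, not part of `sf-cli`.

```python
# Profile names taken from the table above.
PROFILES = ("passive", "footprint", "investigate", "all")


def scan_command(target, profile="footprint", name=None, wait=False):
    """Return the argv list for `uv run sf-cli scan`, validating the profile first."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile {profile!r}; expected one of {PROFILES}")
    cmd = ["uv", "run", "sf-cli", "scan", "--target", target, "--profile", profile]
    if name:
        cmd += ["--name", name]
    if wait:
        cmd.append("--wait")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe without the lab.
    print(" ".join(scan_command("web-target", name="test-scan", wait=True)))
```

Passing the returned list to `subprocess.run(..., check=True)` would start the scan; printing it first is a cheap way to confirm what will be executed.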
./lab.sh up # Start all services
./lab.sh down # Stop (preserves data)
./lab.sh reset # Full reset (deletes scan data)
./lab.sh status # Show container status
./lab.sh logs # Follow logs
./lab.sh shell # Shell into SpiderFoot container
./lab.sh urls # Show service URLs

uv run sf-cli scan --target yourcompany.com --profile footprint --wait
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "Identify all discovered subdomains, IPs, and web apps.
Flag any that appear to be:
- Development or staging environments
- Unmaintained or outdated
- Running vulnerable software"

uv run sf-cli scan --target 203.0.113.42 --profile investigate --wait
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "This IP appeared in our security logs. Tell me:
- Is it associated with known threat actors?
- What infrastructure is connected?
- Should we block it?"

uv run sf-cli scan --target vendor.com --profile passive --wait
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "Assess this vendor's external security posture.
Evaluate SSL config, credential exposures, tech stack.
Produce a risk score (Low/Medium/High) with justification."

Use the lab to practice finding forgotten infrastructure:
# Scan the lab domain
uv run sf-cli scan --target acme-corp.lab --profile footprint --wait
# Analyze discovered subdomains
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "Analyze this attack surface scan. Identify:
1) Shadow IT patterns (dev, staging, test servers)
2) Exposed internal tools (Jenkins, GitLab, Grafana)
3) Legacy/forgotten systems
4) Information disclosure risks
Prioritize findings by exploitability."

Test credential exposure workflows with the mock breach API:
# Check breach database for the domain
curl http://localhost:5050/breaches/domain/acme-corp.lab | \
claude -p "Analyze these breach exposures. For each affected user:
1) Assess risk based on breach types
2) Recommend immediate actions
3) Identify patterns (are admins or devs more exposed?)"

These external services explicitly permit security scanning for educational purposes:
| Target | URL | What You Can Practice |
|---|---|---|
| ScanMe Nmap | scanme.nmap.org | Port scanning, service detection |
| TestPHP Vulnweb | testphp.vulnweb.com | Web app vuln scanning (Acunetix) |
| TestHTML5 Vulnweb | testhtml5.vulnweb.com | Modern web app scanning |
| TestASPNET Vulnweb | testaspnet.vulnweb.com | ASP.NET vulnerability scanning |
| HackTheBox | *.hackthebox.com | CTF-style scanning (requires account) |
| TryHackMe | *.tryhackme.com | CTF-style scanning (requires account) |
- Always verify current terms: Check each site's scanning policy before use
- Rate limiting: Be respectful of resources; don't flood targets with requests
- Passive preferred: Start with passive scans before active enumeration
- Educational only: These are for learning, not offensive operations
# Passive scan of a legal target (no direct probing)
uv run sf-cli scan --target testphp.vulnweb.com --profile passive --wait
# Analyze external reconnaissance
uv run sf-cli results --scan-id $SCAN_ID --format json | \
claude -p "Analyze this passive reconnaissance of a legal test target.
Summarize what can be learned without touching the target directly."

Many SpiderFoot modules require API keys from third-party services. You can configure these in two ways:
Pre-configure API keys before starting the lab:
# Copy the example file
cp .env.example .env
# Edit .env and add your API keys
# Only add keys for services you have accounts with

The .env file is gitignored, so your keys stay private. API keys are automatically imported on container startup - no manual configuration required.
Example .env entries:
# API Keys
SFP_SHODAN_API_KEY=your_shodan_key_here
SFP_VIRUSTOTAL_API_KEY=your_virustotal_key_here
SFP_HUNTER_API_KEY=your_hunter_key_here
# Module Options (SFOPT_<MODULE>_<OPTION>=value)
SFOPT__STOR_DB_MAXSTORAGE=0 # Store full web content (required for email extraction)
SFOPT_SPIDER_MAXPAGES=100 # Limit pages crawled per domain

Beyond API keys, you can configure SpiderFoot module options via environment variables:
SFOPT_<MODULE>_<OPTION>=value
Format rules:
- Module names are uppercased without the `sfp_` prefix
- Modules with double underscores (like `sfp__stor_db`) use a leading underscore

Examples:
| Environment Variable | SpiderFoot Setting |
|---|---|
| `SFOPT_SPIDER_MAXPAGES=100` | `sfp_spider:maxpages=100` |
| `SFOPT__STOR_DB_MAXSTORAGE=0` | `sfp__stor_db:maxstorage=0` |
| `SFOPT_PORTSCAN_TCP_PORTS=22,80,443` | `sfp_portscan_tcp:ports=22,80,443` |

Important: The SFOPT__STOR_DB_MAXSTORAGE=0 setting is required for email extraction to work. The default (1024 bytes) truncates web content before emails can be found.
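Because module names can themselves contain underscores (`sfp_portscan_tcp`), an `SFOPT_` variable name cannot be split into module and option without knowing which modules exist. The sketch below shows one way the naming rules could be applied, assuming a known set of module names and assuming option names are lowercased as in the examples; it is an illustration, not the lab's actual importer.

```python
from typing import Optional

# Module names taken from the examples table; a real list would come from SpiderFoot.
KNOWN_MODULES = {"sfp_spider", "sfp__stor_db", "sfp_portscan_tcp"}


def sfopt_to_setting(name: str, value: str, modules=KNOWN_MODULES) -> Optional[str]:
    """Translate SFOPT_<MODULE>_<OPTION>=value into 'module:option=value'.

    Returns None when the variable is not an SFOPT_ variable or no module matches.
    """
    prefix = "SFOPT_"
    if not name.startswith(prefix):
        return None
    rest = name[len(prefix):].lower()  # e.g. "_stor_db_maxstorage"
    # Try longer module names first so "portscan_tcp" wins over a shorter prefix.
    for module in sorted(modules, key=len, reverse=True):
        short = module[len("sfp_"):]  # "sfp__stor_db" -> "_stor_db"
        if rest.startswith(short + "_"):
            option = rest[len(short) + 1:]
            return f"{module}:{option}={value}"
    return None


if __name__ == "__main__":
    print(sfopt_to_setting("SFOPT_PORTSCAN_TCP_PORTS", "22,80,443"))
```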
Manual import (if needed):
# Re-import API keys without restarting
uv run sf-cli import-keys --env-file .env

Configure API keys through SpiderFoot's web interface:
- Open http://localhost:5001
- Navigate to Settings
- Find modules with a padlock icon (requires API key)
- Enter your API key and click Save
| Service | Module | Use Case | Free Tier |
|---|---|---|---|
| Shodan | sfp_shodan | Device discovery, exposed services | Yes |
| VirusTotal | sfp_virustotal | Malware/threat analysis | Yes |
| Have I Been Pwned | sfp_haveibeenpwned | Credential breach checking | Paid |
| Hunter.io | sfp_hunter | Corporate email discovery | Yes |
| AlienVault OTX | sfp_alienvault | Threat intelligence | Yes |
| SecurityTrails | sfp_securitytrails | Passive DNS, historical data | Yes |
| Censys | sfp_censys | Certificate/host search | Yes |
| BuiltWith | sfp_builtwith | Technology profiling | Yes |
See .env.example for the complete list of supported API keys.
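Before starting the lab, it can be useful to check which API keys in `.env` still hold placeholder values. The sketch below is a simplified `.env` reader that assumes plain `KEY=value` lines with optional `#` comments and treats values starting with `your_` as unset, following the example entries above; how the lab itself parses the file may differ, and the sample key value is made up.

```python
SAMPLE = """# API Keys
SFP_SHODAN_API_KEY=abc123
SFP_VIRUSTOTAL_API_KEY=your_virustotal_key_here
SFOPT_SPIDER_MAXPAGES=100 # Limit pages crawled per domain
"""


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and comment lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "100 # Limit pages".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def configured_api_keys(env: dict) -> list:
    """List SFP_*_API_KEY variables that hold a non-placeholder value."""
    return sorted(
        key
        for key, value in env.items()
        if key.startswith("SFP_")
        and key.endswith("_API_KEY")
        and value
        and not value.startswith("your_")
    )


if __name__ == "__main__":
    print(configured_api_keys(parse_env(SAMPLE)))
```

To check a real file, replace `SAMPLE` with the contents of `.env`, for example via `pathlib.Path(".env").read_text()`.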
Always obtain authorization before scanning external targets.
This lab includes local test targets for learning SpiderFoot mechanics. For external reconnaissance:
- Only scan domains you own or have written permission to assess
- Respect rate limits and terms of service
- Be aware of privacy regulations (GDPR, CCPA)
- Passive-only scans rely on public information, but they should still be run within an authorized scope
MIT License - See LICENSE for details.
- SpiderFoot - The OSINT automation tool
- Claude Code - AI coding assistant
- Article: Automating OSINT with Claude Code and SpiderFoot - Full walkthrough