
Open-Source Intelligence (OSINT) toolkit: modular collectors, enrichment pipeline, link analysis, risk scoring, and investigative workflow automation.


GPTI314/OSINT


OSINT Intelligence Platform

An Open-Source Intelligence (OSINT) platform for gathering, analyzing, and managing intelligence data on domains, IP addresses, email addresses, and web content.

🚀 Features

Core Capabilities

  • Web Scraping Engine - Static, dynamic (JavaScript), and API scraping with proxy rotation, rate limiting, and CAPTCHA solving
  • Web Crawling Engine - Intelligent web crawling with robots.txt compliance, duplicate detection, and politeness management
  • OSINT Intelligence Modules - Domain, IP, email intelligence gathering with risk scoring
  • Data Processing Pipeline - ETL pipeline for data extraction, transformation, and enrichment
  • Analysis Engine - Correlation, entity linking, and threat intelligence analysis
  • REST API - Complete FastAPI-based REST API with authentication and authorization
  • Task Queue - Distributed task processing with Celery and Redis
  • Real-time Processing - Async architecture for high-performance operations
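
The robots.txt compliance and politeness management listed for the crawling engine can be illustrated with Python's standard library. This is a minimal sketch under stated assumptions, not the platform's actual crawler: the `osint-bot` user agent, the sample rules, and the helper names `build_robots_parser` and `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; a real crawler would first
# download /robots.txt from the target host.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robots_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_robots_parser(EXAMPLE_ROBOTS)
print(is_allowed(parser, "osint-bot", "https://example.com/private/report"))
print(is_allowed(parser, "osint-bot", "https://example.com/blog"))
# The Crawl-delay value (in seconds) can drive the politeness delay between requests.
print(parser.crawl_delay("osint-bot"))
```

A crawler built along these lines would check `is_allowed` before each request and sleep for at least the advertised crawl delay between requests to the same host.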

Intelligence Gathering

  • Domain Intelligence - WHOIS, DNS, SSL certificates, subdomains, technology detection
  • IP Intelligence - Geolocation, ISP information, threat intelligence, Shodan integration
  • Email Intelligence - Validation, breach detection, domain verification
  • Social Media Intelligence - Profile discovery and analysis (planned)
  • Image Intelligence - Reverse image search, metadata extraction (planned)

Security & Authentication

  • JWT Authentication - Secure token-based authentication with refresh tokens
  • Role-Based Access Control (RBAC) - Admin, Analyst, and Viewer roles
  • API Key Management - Secure API key generation and validation
  • Audit Logging - Comprehensive audit trail for all operations
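
The three roles named above suggest a simple permission lookup. The following sketch shows how such a check might look. The action names (`read`, `write`, `manage_users`) and the mapping itself are assumptions for illustration; the README does not specify what each role may do.

```python
from enum import Enum


class Role(str, Enum):
    ADMIN = "admin"
    ANALYST = "analyst"
    VIEWER = "viewer"


# Hypothetical permission map; the platform's real policy may differ.
PERMISSIONS = {
    Role.ADMIN: {"read", "write", "manage_users"},
    Role.ANALYST: {"read", "write"},
    Role.VIEWER: {"read"},
}


def is_permitted(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())


print(is_permitted(Role.ANALYST, "write"))
print(is_permitted(Role.VIEWER, "write"))
```

Keeping the policy in one table makes it easy to audit and to extend when new roles or actions are added.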

Infrastructure

  • PostgreSQL - Relational data storage
  • MongoDB - Document storage for raw data
  • Redis - Caching and message brokering
  • Elasticsearch - Search and analytics (optional)
  • Docker - Containerized deployment
  • Celery - Distributed task queue

📦 Installation

Prerequisites

  • Python 3.11+
  • Docker & Docker Compose (for containerized deployment)
  • PostgreSQL 15+
  • Redis 7+
  • MongoDB 6+

Quick Start with Docker

  1. Clone the repository
git clone <repository-url>
cd OSINT
  2. Configure environment variables
cp .env.example .env
# Edit .env with your configuration
  3. Start the platform
docker-compose up -d
  4. Initialize the database
docker-compose exec api alembic upgrade head
  5. Access the platform

Manual Installation

  1. Install dependencies
pip install -r requirements.txt
  2. Configure environment
cp .env.example .env
# Edit .env with your configuration
  3. Initialize database
alembic upgrade head
  4. Run the API server
uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
  5. Run Celery workers
# In separate terminals
celery -A tasks.celery_app worker --loglevel=info -Q scraping
celery -A tasks.celery_app worker --loglevel=info -Q intelligence
celery -A tasks.celery_app beat --loglevel=info

🔧 Configuration

Key configuration options in .env:

# Application
APP_NAME="OSINT Intelligence Platform"
ENVIRONMENT=development
DEBUG=true

# Database
POSTGRES_HOST=localhost
POSTGRES_DB=osint_platform
POSTGRES_USER=osint_user
POSTGRES_PASSWORD=your_password

# Security
SECRET_KEY=your_secret_key
JWT_SECRET_KEY=your_jwt_secret

# Third-Party APIs
SHODAN_API_KEY=your_shodan_key
VIRUSTOTAL_API_KEY=your_vt_key
IPINFO_API_KEY=your_ipinfo_key
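
For readers who want to see how variables like these could be consumed, here is a minimal sketch that reads a few of them into a typed settings object using only the standard library. The `Settings` class, `load_settings`, `parse_bool`, and the default values are assumptions for illustration; the platform's own configuration loader is not shown in this README and may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    app_name: str
    environment: str
    debug: bool
    postgres_host: str
    postgres_db: str


def parse_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables.

    Defaults here are illustrative assumptions. POSTGRES_DB is treated
    as required and raises KeyError when it is missing.
    """
    return Settings(
        app_name=env.get("APP_NAME", "OSINT Intelligence Platform"),
        environment=env.get("ENVIRONMENT", "development"),
        debug=parse_bool(env.get("DEBUG", "false")),
        postgres_host=env.get("POSTGRES_HOST", "localhost"),
        postgres_db=env["POSTGRES_DB"],
    )


settings = load_settings({"POSTGRES_DB": "osint_platform", "DEBUG": "true"})
print(settings)
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real process environment.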

📖 API Documentation

Authentication

# Register a new user
curl -X POST http://localhost:8000/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","username":"user","password":"password"}'

# Login
curl -X POST http://localhost:8000/api/v1/auth/login \
  -d "username=user@example.com&password=password"

# Returns: {"access_token": "...", "refresh_token": "..."}
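
The login call above can also be made from Python. This sketch mirrors the curl command using only the standard library: it sends the credentials as a form-encoded body, as `curl -d` does. The helper names `build_login_request` and `login` are assumptions for illustration, and the response shape is taken from the comment above rather than verified against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000/api/v1"


def build_login_request(username: str, password: str) -> Request:
    """Build a form-encoded POST request for the login endpoint."""
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    return Request(f"{BASE_URL}/auth/login", data=body, method="POST")


def login(username: str, password: str) -> dict:
    """Send the login request and return the decoded JSON response.

    Requires a running API server; raises urllib.error.URLError otherwise.
    """
    with urlopen(build_login_request(username, password), timeout=10) as response:
        return json.load(response)


request = build_login_request("user@example.com", "password")
print(request.get_method(), request.full_url)
```

With the server running, `login("user@example.com", "password")["access_token"]` would give the bearer token used in the calls that follow, assuming the response shape noted above.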

Intelligence Gathering

# Gather domain intelligence
curl -X POST http://localhost:8000/api/v1/intelligence/domain \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domain":"example.com","include_subdomains":true}'

# Gather IP intelligence
curl -X POST http://localhost:8000/api/v1/intelligence/ip \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"ip_address":"8.8.8.8"}'

# Gather email intelligence
curl -X POST http://localhost:8000/api/v1/intelligence/email \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com"}'
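
The three intelligence calls share one shape: a JSON body posted with a bearer token. The sketch below captures that pattern in a single Python helper built on the standard library. The function name `build_intel_request` is an assumption for illustration; the endpoint paths and payload fields are copied from the curl examples above.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/v1"
INTEL_KINDS = {"domain", "ip", "email"}


def build_intel_request(kind: str, payload: dict, token: str) -> Request:
    """Build an authenticated JSON POST for /intelligence/<kind>."""
    if kind not in INTEL_KINDS:
        raise ValueError(f"unsupported intelligence kind: {kind!r}")
    return Request(
        f"{BASE_URL}/intelligence/{kind}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_intel_request(
    "domain", {"domain": "example.com", "include_subdomains": True}, "YOUR_TOKEN"
)
print(request.get_method(), request.full_url)
```

The returned object can be passed to `urllib.request.urlopen` once the API server is running and `YOUR_TOKEN` is replaced with a real access token.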

Investigation Management

# Create investigation
curl -X POST http://localhost:8000/api/v1/investigations \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"My Investigation","description":"Investigation details"}'

# List investigations
curl -X GET http://localhost:8000/api/v1/investigations \
  -H "Authorization: Bearer YOUR_TOKEN"

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                       Frontend (TBD)                        │
└─────────────────────┬───────────────────────────────────────┘
                      │
┌─────────────────────┴───────────────────────────────────────┐
│                      FastAPI REST API                       │
│  ┌──────────────┬──────────────┬──────────────┬────────┐    │
│  │ Auth         │ Intelligence │ Scraping     │ More   │    │
│  └──────────────┴──────────────┴──────────────┴────────┘    │
└─────────────────────┬───────────────────────────────────────┘
                      │
┌─────────────────────┴───────────────────────────────────────┐
│                        Core Engines                         │
│  ┌──────────────┬──────────────┬──────────────────────┐     │
│  │ Scraping     │ Crawling     │ OSINT Intelligence   │     │
│  └──────────────┴──────────────┴──────────────────────┘     │
└─────────────────────┬───────────────────────────────────────┘
                      │
┌─────────────────────┴───────────────────────────────────────┐
│                  Background Tasks (Celery)                  │
└─────────────────────┬───────────────────────────────────────┘
                      │
┌─────────────────────┴───────────────────────────────────────┐
│                     Data Storage Layer                      │
│  ┌──────────────┬──────────────┬──────────────┬────────┐    │
│  │ PostgreSQL   │ MongoDB      │ Redis        │ ES     │    │
│  └──────────────┴──────────────┴──────────────┴────────┘    │
└─────────────────────────────────────────────────────────────┘

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run specific test module
pytest tests/test_scraping.py

📊 Monitoring

  • Flower - Celery task monitoring (http://localhost:5555)
  • Prometheus - Metrics collection (optional)
  • Grafana - Metrics visualization (optional)
  • ELK Stack - Log aggregation (optional)

🔒 Security

  • JWT-based authentication
  • Role-based access control
  • API key management
  • Password hashing with bcrypt
  • SQL injection prevention
  • XSS protection
  • CORS configuration
  • Rate limiting
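
Rate limiting is commonly implemented with a token bucket. The sketch below shows that technique in plain Python. It is an illustration only: the README does not say which algorithm the platform uses, and the `TokenBucket` class and its parameters are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may proceed."""
        now = self._clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5.0, capacity=10)
print(bucket.allow())
```

The injectable `clock` argument exists so the limiter can be tested deterministically. In a multi-process deployment the counters would need to live in shared storage such as Redis rather than in process memory.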

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with FastAPI, Celery, and asyncio
  • Uses various OSINT tools and services
  • Community-driven development

📞 Support

  • Documentation: [Link to docs]
  • Issues: [GitHub Issues]
  • Discussions: [GitHub Discussions]

🗺️ Roadmap

  • Frontend React application
  • Social media intelligence modules
  • Image intelligence and reverse search
  • Threat intelligence integration
  • Dark web monitoring
  • Real-time alerts and notifications
  • Advanced visualization and reporting
  • Machine learning-based analysis
  • API rate limiting and quotas
  • Multi-tenant support

Built with ❤️ for the OSINT community
