Fraud Detection Engine

Real-time transaction fraud scoring with explainable risk assessment. Built for fintech applications requiring stateful fraud detection with velocity checks, geolocation analysis, and behavioral profiling.

🎯 Overview

This fraud detection engine analyzes financial transactions in real-time using rule-based pattern matching and behavioral analytics. Unlike simple threshold-based systems, it maintains user profiles, tracks device patterns, calculates geographic impossibilities using the Haversine formula, and provides transparent explainability through reason codes.

Key Features

✅ Real-time scoring - 0-100 fraud score with 4-tier risk levels
✅ Stateful detection - SQLite-backed transaction history and user profiling
✅ 8 fraud rules - Velocity, amount anomalies, location patterns, device fingerprinting
✅ Geospatial analysis - Impossible travel detection using Haversine distance
✅ Explainability - Reason codes and severity breakdown for every assessment
✅ REST API - Flask endpoints with webhook support for alerts
✅ Synthetic testing - Built-in fraud scenario generator
✅ Production-ready - Comprehensive test suite, JSON logging, input validation

🚀 Quick Start (macOS)

Prerequisites

# Python 3.8+
pip3 install -r requirements.txt

Run Examples

# See fraud detection in action
python3 fraud_detector.py

Output shows 4 scenarios:

✅ Legitimate transaction (score: 0)
⚠️ Velocity attack (score: 40)
⚠️ Large first transaction (score: 50)
🔶 Impossible travel (score: 60)

Run Tests

python3 test_fraud_detector.py

Runs 18 comprehensive tests covering rules, scenarios, and edge cases.

Start API Server

python3 api.py

Server runs on http://localhost:5000

Generate Test Data

python3 generate_test_data.py

Creates test_scenarios.json with 6 fraud scenarios.

📊 How It Works

Architecture

Transaction Input
       ↓
[Transaction Store] ← SQLite in-memory database
       ↓
[Rules Engine] ← 8 modular fraud rules
       ↓
[Fraud Detector] ← Aggregates scores + generates assessment
       ↓
Fraud Assessment Output (score, risk level, reasons, recommendations)

Fraud Rules

Rule	Description	Severity	Score Impact
Velocity (10min)	3+ transactions in 10 minutes	High	+40 pts
Velocity (60min)	10+ transactions in 1 hour	Medium	+25 pts
Large Transaction	>3x user's average amount	Medium	+25 pts
New Device	Transaction from unknown device	Medium	+20 pts
Device Velocity	Device used by 5+ accounts in 1hr	High	+35 pts
Impossible Travel	Requires >900 km/h travel speed	Critical	+60 pts
Round Dollar	Exact $500, $1000, etc (card testing)	Low	+10 pts
High-Risk Category	Gift cards, wire transfers, crypto	Medium	+15 pts

Risk Levels

LOW (0-25): Approve - Process normally
MEDIUM (26-50): Challenge - Require 2FA
HIGH (51-75): Review - Manual review required
CRITICAL (76-100): Block - Automatically decline

Feature Engineering

Transaction Store extracts:

User profile (lifetime spend, avg amount, known devices/locations)
Transaction velocity (count in time windows)
Device fingerprints
Location history

Rules Engine calculates:

Coefficient of variation (income stability)
Haversine distance (geographic movement)
Time-series patterns (rapid-fire detection)
Behavioral anomalies (deviation from norms)

🛠️ API Usage

Assess Single Transaction

curl -X POST http://localhost:5000/api/assess \
  -H "Content-Type: application/json" \
  -d '{
    "transaction_id": "txn_001",
    "user_id": "user_alice",
    "amount": 1500.00,
    "merchant": "Apple Store",
    "category": "Electronics",
    "timestamp": "2026-01-23T10:30:00Z",
    "location": {"lat": 37.7749, "lon": -122.4194},
    "device_id": "device_new_abc123",
    "ip_address": "192.168.1.100"
  }'

Response:

{
  "transaction_id": "txn_001",
  "fraud_score": 50,
  "risk_level": "medium",
  "triggered_rules": [
    "First large transaction",
    "Round dollar amount",
    "High-risk category"
  ],
  "reason_codes": [
    "First large transaction: $1500.00 transaction on new account",
    "Round dollar amount: Exact $1500.00 (common in card testing)",
    "High-risk category: Category 'Electronics' is high-risk"
  ],
  "recommended_action": "CHALLENGE: Require additional authentication (2FA, security questions).",
  "details": {
    "rules_evaluated": 8,
    "rules_triggered": 3,
    "severity_breakdown": {"low": 1, "medium": 2}
  }
}

Batch Assessment

curl -X POST http://localhost:5000/api/assess/batch \
  -H "Content-Type: application/json" \
  -d '{
    "transactions": [
      {transaction_1},
      {transaction_2},
      ...
    ]
  }'

Response includes summary:

{
  "results": [...],
  "summary": {
    "total": 100,
    "low_risk": 85,
    "medium_risk": 10,
    "high_risk": 4,
    "critical_risk": 1
  }
}

Register Webhook

curl -X POST http://localhost:5000/api/webhooks \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/fraud-alert",
    "events": ["high", "critical"]
  }'

Webhooks are triggered automatically for high-risk transactions.

Get User History

curl "http://localhost:5000/api/user/history?user_id=user_alice&time_window_minutes=60"

Returns transaction history and user profile.

📈 Example Scenarios

Scenario 1: Legitimate Transaction

Input:

{
  "amount": 45.50,
  "merchant": "Starbucks",
  "category": "Food & Dining"
}

Output:

Score: 0/100 (LOW)
Triggered Rules: None
Action: APPROVE

Scenario 2: Velocity Attack

Pattern: 5 transactions in 3 minutes

Output:

Score: 40/100 (MEDIUM)
Triggered: "Velocity: 3+ txns in 10min"
Action: CHALLENGE - Require 2FA

Scenario 3: Impossible Travel

Pattern: SF → NYC in 30 minutes (requires 8,258 km/h)

Output:

Score: 60/100 (HIGH)
Triggered: "Impossible travel: 4129km in 0.5h"
Action: REVIEW - Hold for manual verification

Scenario 4: Card Testing

Pattern: 10 small transactions ($1, $5, $10) in rapid succession

Output:

Score: 40-50/100 (MEDIUM)
Triggered: Velocity + Round amounts
Action: CHALLENGE

Scenario 5: Device Sharing Fraud

Pattern: 7 different accounts using same device in 1 hour

Output:

Score: 35+/100 (MEDIUM-HIGH)
Triggered: "Device velocity: 5+ accounts in 60min"
Action: REVIEW

🧪 Testing

Test Suite Coverage

18 comprehensive tests across 5 categories:

1. Transaction Store Tests (4 tests)

Adding transactions
Duplicate rejection
User profile creation
Time window queries

2. Fraud Rules Tests (5 tests)

Velocity detection
Large transaction detection
New device detection
Impossible travel calculation
Round dollar detection

3. Fraud Detector Tests (3 tests)

Legitimate transactions
High fraud scores
Risk level boundaries

4. Synthetic Scenarios Tests (3 tests)

Velocity attack detection
Impossible travel detection
Card testing patterns

5. Input Validation Tests (3 tests)

Invalid coordinates
Negative amounts
Missing required fields

Run Tests

python3 test_fraud_detector.py

Expected output:

Ran 18 tests in 0.013s
OK

✅ All tests passed!

🔧 Technical Implementation

New Technologies & Patterns

This project demonstrates technologies and patterns different from previous projects:

SQLite with indexes - Stateful in-memory database (vs PostgreSQL in Budget Buddy/Stress Simulator)
Geospatial calculations - Haversine formula for impossible travel detection
Dataclasses with validation - Python type-safe models (different approach than Pydantic)
JSON structured logging - Production event logging for fraud detection
Time-series velocity detection - Real-time pattern matching algorithms
Device fingerprinting - Security-focused identity tracking
Webhook notification system - Event-driven alerting architecture
Synthetic fraud scenarios - Automated test data generation for fraud patterns

Domain expertise: Real-time fraud detection and transaction security (vs consumer budgeting/planning tools)

Code Structure

fraud-detection-engine/
├── fraud_detector.py           # Main detection engine
├── transaction_store.py        # SQLite storage + user profiling
├── rules_engine.py             # Modular fraud rules
├── api.py                      # Flask REST API
├── test_fraud_detector.py      # Comprehensive test suite
├── generate_test_data.py       # Synthetic scenario generator
├── test_scenarios.json         # Pre-generated test data
├── requirements.txt            # Dependencies
├── .gitignore                  # Git ignore file
└── README.md                   # This file

Performance Characteristics

Scoring Speed: <2ms per transaction
API Latency: ~20ms (including network)
Batch Processing: 500+ transactions/second
Memory Footprint: ~30MB (in-memory DB)
Database: SQLite (in-memory for speed, persistent option available)

💡 Production Considerations

Scaling

Current (Demo):

In-memory SQLite
Synchronous processing
Single-threaded

Production Recommendations:

PostgreSQL/MySQL for persistence
Redis for caching + rate limiting
Async task queue (Celery/RabbitMQ)
Horizontal scaling with load balancer
Webhook retries with exponential backoff

Security

Rate limit API endpoints (100 req/min per IP)
Encrypt PII fields (IP addresses, device IDs)
Audit logs for all assessments
HTTPS only in production
API authentication (OAuth2/JWT)

Monitoring

Track fraud score distribution
Monitor false positive/negative rates
Alert on rule effectiveness degradation
Dashboard for real-time fraud activity
A/B test rule threshold adjustments

Compliance

PCI DSS: Never store card numbers
GDPR: Right to deletion, data minimization
Fair Lending: Avoid discriminatory patterns
Audit Trail: Log all decisions for review

🎨 Future Enhancements

📚 Use Cases

Fintech Applications

Payment Processors (Stripe, Square)
- Real-time transaction screening
- Chargeback prevention
- Merchant risk scoring
Neobanks (Chime, Current)
- Account takeover detection
- P2P fraud prevention
- New account monitoring
Buy Now Pay Later (Affirm, Klarna)
- First-party fraud detection
- Synthetic identity detection
- Checkout abuse prevention
Crypto Exchanges (Coinbase, Kraken)
- Withdrawal fraud prevention
- Account verification
- AML transaction monitoring
Marketplaces (eBay, Etsy)
- Seller fraud detection
- Buyer protection
- Dispute resolution

🔍 How This Differs from Creditworthiness Scorer

Both projects show fintech risk assessment, but focus on different domains:

Feature	Creditworthiness Scorer	Fraud Detector
Purpose	Lending decisioning	Transaction security
Timing	One-time (application)	Real-time (every txn)
Data	Historical cash flow	Current + historical txns
Storage	Stateless	Stateful (SQLite)
Features	DTI, income CV, buffer	Velocity, location, device
Output	Loan approval/denial	Approve/challenge/block
Domain	Underwriting	Fraud prevention

📝 License

MIT License - Free for commercial and personal use.

🙏 Acknowledgments

Built by Pelz as part of a fintech portfolio demonstrating:

Real-time risk assessment
Stateful pattern detection
Geospatial analytics
Production-quality code
Comprehensive testing

Note: This is a demonstration project for educational/portfolio purposes. For production fraud detection, consider:

Professional fraud services (Sift, Riskified, Stripe Radar)
Machine learning models trained on your data
Legal review for compliance
Insurance for fraud losses

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.gitignore		.gitignore
README.md		README.md
api.py		api.py
fraud_detector.py		fraud_detector.py
generate_test_data.py		generate_test_data.py
requirements.txt		requirements.txt
rules_engine.py		rules_engine.py
test_fraud_detector.py		test_fraud_detector.py
test_scenarios.json		test_scenarios.json
transaction_store.py		transaction_store.py

pelzade127/fraud-detector

Folders and files

Latest commit

History

Repository files navigation

Fraud Detection Engine

🎯 Overview

Key Features

🚀 Quick Start (macOS)

Prerequisites

Run Examples

Run Tests

Start API Server

Generate Test Data

📊 How It Works

Architecture

Fraud Rules

Risk Levels

Feature Engineering

🛠️ API Usage

Assess Single Transaction

Batch Assessment

Register Webhook

Get User History

📈 Example Scenarios

Scenario 1: Legitimate Transaction

Scenario 2: Velocity Attack

Scenario 3: Impossible Travel

Scenario 4: Card Testing

Scenario 5: Device Sharing Fraud

🧪 Testing

Test Suite Coverage

Run Tests

🔧 Technical Implementation

New Technologies & Patterns

Code Structure

Performance Characteristics

💡 Production Considerations

Scaling

Security

Monitoring

Compliance

🎨 Future Enhancements

📚 Use Cases

Fintech Applications

🔍 How This Differs from Creditworthiness Scorer

📝 License

🙏 Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages