debtector scans codebases specifically for AI-generated technical debt patterns, scoring and reporting them so teams can catch debt before it compounds.
Per the 2025 Stack Overflow Developer Survey, the #1 frustration for developers is "AI solutions that look correct but are slightly wrong." Ars Technica (Jan 2026) reports growing concern about AI coding agents "building up technical debt — making poor design choices early that snowball into worse problems over time."
The problem: AI code generators produce code that works but accumulates subtle debt patterns — copy-paste duplication with slight variations, inconsistent error handling, over-engineered abstractions, dead code, naming inconsistencies, and "looks right but isn't idiomatic" patterns. Existing linters catch syntax issues but miss these higher-level AI-specific debt patterns.
debtector addresses this gap by specifically scanning for AI-generated technical debt patterns, providing early warning before they compound.
- Dev teams using AI coding assistants (Copilot, Claude Code, Cursor, etc.)
- Tech leads doing code review on AI-assisted PRs
- Solo developers who want a second opinion on AI-generated code quality
```bash
# Install from source
git clone https://github.com/debtector/debtector.git
cd debtector
pip install -e .

# Or install from PyPI (when published)
pip install debtector
```

```bash
# Scan current directory
debtector scan .

# Scan with JSON output
debtector scan ./src --format json

# Fail CI if score > 60
debtector scan . --threshold 60

# Pretty terminal report with colors
debtector report .
```

```
🔍 DEBTECTOR REPORT

Overall Debt Score: 73.2/100

📊 Summary
  Files Analyzed: 12
  Total Issues: 28
  AI-Generated Files: 7

🏷️ Category Breakdown
  🔄 Duplication: 85.3 (HIGH)
  ❌ Error Handling: 67.1 (MEDIUM)
  🌀 Complexity: 45.8 (MEDIUM)
  💀 Dead Code: 23.4 (LOW)

🔥 Files Needing Attention
  📁 ai_service.py - Score: 89.2 | Issues: 8 | AI: 94%
    • HIGH: Functions 'process_user_data' and 'handle_user_data' are 91% similar
    • MEDIUM: Inconsistent error handling patterns across functions
```
- Near-duplicate functions: Functions that are 70-95% similar (AI loves generating slight variations instead of abstracting)

```python
# ❌ AI-generated debt pattern
def process_user_data(user_data):
    result = {}
    if user_data:
        result['name'] = user_data.get('name', '')
        result['processed'] = True
    return result

def process_admin_data(admin_data):  # 91% similar!
    result = {}
    if admin_data:
        result['name'] = admin_data.get('name', '')
        result['processed'] = True
    return result
```

- Copy-paste with mutations: Similar code blocks with small parameter changes
- Inconsistent error handling: Mix of try/catch styles; some functions handle errors, others don't
- Over-abstraction: Unnecessary wrapper functions, classes with single methods
- Generic naming: AI's love for `result`, `data`, `response`, `output` variables
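The "91% similar" figure above can be reproduced with a plain sequence comparison — a minimal sketch of near-duplicate scoring using only the standard library, not debtector's actual algorithm:

```python
from difflib import SequenceMatcher

a = '''\
def process_user_data(user_data):
    result = {}
    if user_data:
        result['name'] = user_data.get('name', '')
        result['processed'] = True
    return result
'''

b = '''\
def process_admin_data(admin_data):
    result = {}
    if admin_data:
        result['name'] = admin_data.get('name', '')
        result['processed'] = True
    return result
'''

# Character-level similarity of the two function bodies.
ratio = SequenceMatcher(None, a, b).ratio()
print(f"{ratio:.0%} similar")  # well above a 70% near-duplicate threshold
```

A real detector would compare every pair of functions and flag pairs in the 70-95% band, where abstraction (rather than outright deletion) is the likely fix.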
- 0-100 scale: Higher = more debt
- Category breakdown: Duplication, error handling, complexity, dead code, naming, AI markers
- Severity levels: Low/Medium/High/Critical
- AI confidence: Estimates likelihood code was AI-generated
- Python (full support)
- JavaScript/TypeScript (full support)
- Pluggable architecture for adding languages
```bash
# Watch mode - re-scan on changes
debtector watch . --interval 5

# Compare debt between commits
debtector diff HEAD~5

# CI/CD integration
debtector scan . --threshold 60 --format json

# Custom output file
debtector scan . --output report.json
```

Create `.debtector.yaml` in your project root:
```yaml
# Supported languages
languages: [python, javascript, typescript]

# Ignore patterns (glob-style)
ignore:
  - "tests/**"
  - "node_modules/**"
  - "*.min.js"

# Score thresholds per category
thresholds:
  duplication: 70
  error_handling: 60
  dead_code: 50
  naming: 40
  complexity: 80
  overall: 60

# Custom patterns (regex)
custom_patterns:
  - name: "hardcoded-urls"
    regex: "https?://[^\"'\\s]+"
    severity: medium
    message: "Hardcoded URL detected"

# File size limit (bytes)
max_file_size: 1048576  # 1MB

# AI confidence threshold
exclude_ai_confidence_threshold: 0.3
```

- Duplication: 25% (most important for AI-generated code)
- Complexity: 25% (over-engineering is common)
- Error Handling: 20% (inconsistency is a key AI pattern)
- Dead Code: 15% (AI often generates unused code)
- Naming: 10% (affects readability)
- AI Markers: 5% (meta-indicator)
- Critical: 10x weight
- High: 5x weight
- Medium: 2.5x weight
- Low: 1x weight
Files with >70% AI confidence receive a 20% score penalty (AI-generated code warrants extra scrutiny).
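Putting these pieces together — severity multipliers rolled up per category, a weighted average across categories, then the AI-confidence penalty — can be sketched as below. The normalization `scale`, and the naming and AI-marker scores used in the example, are illustrative choices, not debtector's exact formula:

```python
CATEGORY_WEIGHTS = {
    "duplication": 0.25, "complexity": 0.25, "error_handling": 0.20,
    "dead_code": 0.15, "naming": 0.10, "ai_markers": 0.05,
}
SEVERITY_WEIGHTS = {"critical": 10, "high": 5, "medium": 2.5, "low": 1}

def category_score(severities, scale=5.0):
    """Severity-weighted issue count mapped onto 0-100.
    `scale` is a hypothetical normalization factor."""
    return min(sum(SEVERITY_WEIGHTS[s] for s in severities) * scale, 100.0)

def overall_score(category_scores, ai_confidence):
    """Weighted average of per-category scores (0-100), with a 20%
    penalty when AI confidence exceeds 70%. Capped at 100."""
    score = sum(CATEGORY_WEIGHTS[c] * s for c, s in category_scores.items())
    if ai_confidence > 0.7:
        score *= 1.2
    return min(score, 100.0)

scores = {
    "duplication": 85.3, "complexity": 45.8, "error_handling": 67.1,
    "dead_code": 23.4, "naming": 30.0, "ai_markers": 50.0,
}
print(round(overall_score(scores, ai_confidence=0.94), 1))
```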
```yaml
name: Technical Debt Check
on: [push, pull_request]
jobs:
  debt-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Install debtector
        run: pip install debtector
      - name: Scan for debt
        run: debtector scan . --threshold 70 --format json
```

```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: debtector
        name: debtector debt scan
        entry: debtector scan --threshold 80
        language: system
        pass_filenames: false
```

```
debtector/
├── analyzers/            # Debt pattern detectors
│   ├── duplication.py    # Near-duplicate detection
│   ├── error_handling.py # Error pattern analysis
│   ├── dead_code.py      # Unused code detection
│   ├── naming.py         # Naming convention analysis
│   ├── complexity.py     # Over-abstraction detection
│   └── ai_markers.py     # AI-generated code heuristics
├── parsers/              # Language-specific parsers
│   ├── python_parser.py  # Python AST analysis
│   └── js_parser.py      # JavaScript parsing
├── reporters/            # Output formatters
│   ├── terminal.py       # Rich terminal output
│   ├── json_reporter.py  # JSON/CI output
│   └── diff_reporter.py  # Git diff analysis
├── scanner.py            # Main orchestrator
├── scoring.py            # Debt scoring engine
├── config.py             # Configuration management
└── history.py            # Trend tracking
```
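The flow this layout implies — parsers feed analyzers, analyzers emit issues, scoring folds them into a number — can be sketched in a few lines. Everything here (the `scan` signature, the toy `todo_analyzer`) is illustrative, not debtector's real API:

```python
import ast

def scan(files, analyzers, score):
    """files: {path: source}. Each analyzer gets (path, source, ast)
    and returns issue dicts; `score` folds the issue list into a number."""
    issues = []
    for path, source in files.items():
        tree = ast.parse(source)
        for analyzer in analyzers:
            issues.extend(analyzer(path, source, tree))
    return score(issues)

def todo_analyzer(path, source, tree):
    # Toy analyzer: flag every line containing a TODO marker.
    return [{"file": path, "severity": "low", "message": "TODO left in code"}
            for line in source.splitlines() if "TODO" in line]

report = scan({"app.py": "x = 1  # TODO: remove\n"}, [todo_analyzer], len)
print(report)  # prints 1
```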
- Near-duplicates (70-95% similar functions)
- Copy-paste mutations (same logic, different variable names)
- Structural similarity (same flow, different details)
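One way to catch "same logic, different variable names" is to compare name-blind AST skeletons: blank out every identifier, then compare structure. A sketch of the idea, not debtector's implementation:

```python
import ast

def skeleton(src: str) -> str:
    """AST dump with identifiers blanked out, so functions that differ
    only in names produce identical skeletons."""
    tree = ast.parse(src)
    for node in ast.walk(tree):
        for field in ("id", "name", "arg", "attr"):
            if hasattr(node, field):
                setattr(node, field, "_")
    return ast.dump(tree)

src_a = "def f(user):\n    total = user + 1\n    return total\n"
src_b = "def g(account):\n    amount = account + 1\n    return amount\n"

# Identical structure despite every name being different.
print(skeleton(src_a) == skeleton(src_b))  # prints True
```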
- Mixed patterns (try/catch + callbacks + promise.catch in same file)
- Missing error handling (risky operations without protection)
- Overly broad exceptions (bare except, catching Exception)
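Bare excepts and blanket `except Exception` handlers are easy to spot with Python's `ast` module — a minimal sketch of the check (debtector's analyzer may differ):

```python
import ast

def find_broad_handlers(src):
    """Return (line, kind) for bare excepts and `except Exception` handlers."""
    found = []
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.ExceptHandler):
            if node.type is None:
                found.append((node.lineno, "bare except"))
            elif isinstance(node.type, ast.Name) and node.type.id == "Exception":
                found.append((node.lineno, "catches Exception"))
    return found

SOURCE = """\
def load(path):
    try:
        return open(path).read()
    except:
        return None

def parse(text):
    try:
        return int(text)
    except Exception:
        return 0
"""
print(find_broad_handlers(SOURCE))  # prints [(4, 'bare except'), (10, 'catches Exception')]
```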
- God functions (doing too many different things)
- Single-method classes (unnecessary abstraction)
- Wrapper functions (just calling another function)
- Deep nesting (if/for/while pyramids)
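Nesting pyramids can be measured by walking the AST and tracking the deepest chain of control-flow nodes — a toy depth counter, not debtector's actual metric:

```python
import ast

def max_nesting(src: str) -> int:
    """Depth of the deepest if/for/while/try/with pyramid in the source."""
    tree = ast.parse(src)

    def depth(node, current=0):
        current += isinstance(node, (ast.If, ast.For, ast.While, ast.Try, ast.With))
        children = [depth(child, current) for child in ast.iter_child_nodes(node)]
        return max([current] + children)

    return depth(tree)

flat = "def f(x):\n    return x\n"
pyramid = (
    "def g(rows):\n"
    "    for row in rows:\n"
    "        if row:\n"
    "            for cell in row:\n"
    "                if cell:\n"
    "                    print(cell)\n"
)
print(max_nesting(flat), max_nesting(pyramid))  # prints: 0 4
```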
- Generic variable names (result, data, response, output)
- Verbose explanatory comments ("This function does...")
- Boilerplate-heavy structure (unnecessary initialization)
- Perfect but generic naming (user_input, api_response)
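The generic-naming heuristic can be approximated by measuring what fraction of assigned variables come from the usual suspects — a toy sketch where the `GENERIC` set is illustrative, not debtector's actual list:

```python
import ast

GENERIC = {"result", "data", "response", "output", "user_input", "api_response"}

def generic_name_ratio(src: str) -> float:
    """Fraction of assigned variable names drawn from AI-favorite generics."""
    names = [
        node.id
        for node in ast.walk(ast.parse(src))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    ]
    if not names:
        return 0.0
    return sum(n in GENERIC for n in names) / len(names)

src = "result = fetch()\ndata = result['items']\ncount = len(data)\n"
print(generic_name_ratio(src))  # two of the three assigned names are generic
```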
We welcome contributions! To add support for a new language:

- Create a parser in `parsers/new_language_parser.py`
- Update analyzers to support the new AST format
- Add the file extension to `config.py`
- Write tests with realistic fixtures

To add a new analyzer, inherit from `BaseAnalyzer`:

```python
from typing import List

from .base import BaseAnalyzer

class MyAnalyzer(BaseAnalyzer):
    def get_analyzer_name(self) -> str:
        return "my_pattern"

    def get_supported_extensions(self) -> List[str]:
        return [".py", ".js"]

    def analyze_file(self, file_path, content, parsed_ast):
        issues = []
        # Your detection logic here
        return issues
```

Then:

- Add it to the analyzer list in `scanner.py`
- Write comprehensive tests
- Update documentation
```bash
# Install development dependencies
pip install -e .[dev]

# Run all tests
pytest

# Run with coverage
pytest --cov=debtector --cov-report=html

# Run specific test
pytest tests/test_duplication.py -v
```

- Java/C# support
- IDE integrations (VS Code extension)
- Web dashboard for team debt tracking
- Advanced ML models for AI detection
- Automatic refactoring suggestions
- Integration with AI coding tools
- Team collaboration features
- Historical debt analysis
- Real-time scanning during development
- AI-powered code review assistant
- Cross-repository debt tracking
- Predictive debt modeling
Q: How is this different from a traditional linter?
A: Traditional linters catch syntax and style issues. debtector focuses on higher-level patterns specific to AI-generated code — things like near-duplicates, inconsistent abstractions, and "looks right but isn't idiomatic" patterns.

Q: Does debtector use AI or send my code anywhere?
A: No. All analysis is local and deterministic. No AI models or cloud services required.

Q: What is the performance impact?
A: Minimal. Scanning a 50K-line codebase takes ~10-30 seconds on modern hardware.

Q: Can I customize the scoring?
A: Yes, via .debtector.yaml. You can adjust category weights, severity thresholds, and add custom regex patterns.

Q: Does it work on human-written code too?
A: Yes, but it's optimized for AI-generated patterns. You may want to adjust thresholds for older codebases written entirely by humans.

Q: How accurate is the AI-confidence score?
A: The AI confidence score is heuristic-based (comment patterns, naming conventions, code structure). It's roughly 70-80% accurate for obvious AI-generated code, but should be treated as a hint rather than a definitive judgment.
MIT License. See LICENSE for details.
Created in response to growing concerns about AI-generated technical debt in software development. Special thanks to the Stack Overflow and Ars Technica reports that highlighted this problem.
Built with ❤️ for developers dealing with AI code debt.
Found a bug or have a suggestion? Open an issue or contribute!