# LLM Output Sanitization Checker for Python

Static analyzer that detects when LLM output is used in dangerous operations without proper validation or sanitization.
## Why

LLM responses are untrusted input. When your code does:

```python
response = client.chat.completions.create(...)
content = response.choices[0].message.content

# DANGER: LLM output flows to dangerous sinks
eval(content)                                                    # Code execution
cursor.execute(f"SELECT * FROM users WHERE name = '{content}'")  # SQL injection
os.system(content)                                               # Command injection
```

These are the same vulnerabilities we learned to avoid with user input, but developers often forget that LLM output is equally untrusted.
## How it works

llmoutput performs taint analysis:
- Identifies LLM API calls (OpenAI, Anthropic, LangChain, etc.)
- Tracks response variables through assignments and transformations
- Flags dangerous sinks where tainted data flows without sanitization
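The mechanics can be sketched with Python's `ast` module. This is a simplified illustration of the general technique, not llmoutput's actual implementation; the sink and marker sets below are invented for the example:

```python
import ast

SINKS = {"eval", "exec"}          # simplified sink list
LLM_ATTRS = {"create", "invoke"}  # simplified LLM-call markers

def find_unsafe_flows(source: str) -> list[int]:
    """Return line numbers where a tainted name reaches eval/exec."""
    tree = ast.parse(source)
    tainted: set[str] = set()
    hits: list[int] = []
    for node in ast.walk(tree):
        # Taint names assigned from calls like client...create(...)
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            func = node.value.func
            if isinstance(func, ast.Attribute) and func.attr in LLM_ATTRS:
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        tainted.add(target.id)
        # Flag sinks called with a tainted argument
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SINKS:
                for arg in node.args:
                    if isinstance(arg, ast.Name) and arg.id in tainted:
                        hits.append(node.lineno)
    return hits
```

A real analyzer additionally has to follow taint through attribute chains (`response.choices[0].message.content`), reassignments, and string transformations, which is where most of the work lies.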
## Install

```shell
pip install llmoutput
```

Or run directly (zero dependencies):
```shell
curl -O https://raw.githubusercontent.com/kriskimmerle/llmoutput/main/llmoutput.py
python llmoutput.py app.py
```

## Usage

```shell
# Scan a file
llmoutput app.py

# Scan a directory
llmoutput src/

# CI mode
llmoutput . --check --min-score 70

# JSON output
llmoutput . --format json
```

## Example output

```text
======================================================================
llmoutput - LLM Output Sanitization Checker
======================================================================
Files scanned: 3
LLM API calls found: 5
Unsafe output uses: 2

Security Score: 40/100 (Grade: D)

Findings by severity:
  CRITICAL: 2
----------------------------------------------------------------------

📄 app.py

  🔴 Line 28: [LO01] CRITICAL
     LLM output used in eval() - arbitrary code execution
     Source: Line 15: client.chat.completions.create
     Sink: eval
     Fix: Avoid using eval/exec on LLM output. Use ast.literal_eval() for data.

  🔴 Line 35: [LO02] CRITICAL
     LLM output in SQL f-string - SQL injection vulnerability
     Source: Line 15: client.chat.completions.create
     Sink: SQL query
     Fix: Use parameterized queries: cursor.execute('SELECT * FROM t WHERE x = ?', (llm_output,))

======================================================================
⚠️  LLM output used unsafely - sanitize before dangerous operations!
```
## Rules

| Rule | Severity | Description |
|---|---|---|
| LO01 | CRITICAL | LLM output in eval/exec/compile |
| LO02 | CRITICAL | LLM output in SQL queries |
| LO03 | CRITICAL/HIGH | LLM output in shell commands |
| LO04 | CRITICAL | LLM output in pickle/yaml deserialization |
| LO05 | HIGH/MEDIUM | LLM output in file paths |
| LO06 | HIGH/MEDIUM | LLM output in HTML templates |
| LO07 | LOW | LLM output parsed as JSON (informational) |
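As an illustration of LO03, the shell-command rule: the safe pattern is the argument-list form of `subprocess`, which never hands the string to a shell. The hostile value below is invented for the demo:

```python
import shlex
import subprocess

llm_output = "report.txt; rm -rf /"  # hostile value a model could return

# UNSAFE -- would be flagged as LO03:
#   os.system(f"cat {llm_output}")

# Safer: list form, so no shell ever parses the untrusted string.
result = subprocess.run(["echo", llm_output], capture_output=True, text=True)

# If a shell string is truly unavoidable, quote the untrusted part first.
quoted = shlex.quote(llm_output)  # "'report.txt; rm -rf /'"
```

With the list form, `; rm -rf /` is just part of one argument, not a second command.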
## Detected LLM APIs

- OpenAI: `openai.chat.completions.create`, `client.chat.completions.create`
- Anthropic: `anthropic.messages.create`, `client.messages.create`
- LangChain: `llm.invoke`, `chain.run`, `ChatOpenAI.invoke`
- LlamaIndex: `query_engine.query`
- Generic patterns: `generate_response`, `get_completion`, `ask_llm`
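Detection of the generic patterns is name-based, so a plain helper whose name matches counts as a taint source even without an SDK import. The stub below is hypothetical and deliberately benign; a real model could return arbitrary code, which is why `ast.literal_eval` is the suggested fix:

```python
import ast

def get_completion(prompt: str) -> str:
    # Stand-in for a real LLM call; the *name* alone matches
    # llmoutput's generic patterns, so the return value is tainted.
    return "2 + 2"

answer = get_completion("What is 2 + 2? Reply with an expression.")

# Would be flagged (LO01): eval on LLM output. Harmless here only
# because the stub is benign.
unsafe = eval(answer)

# Preferred: ast.literal_eval accepts literals only and raises on code.
safe = ast.literal_eval("[1, 2, 3]")
```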
## CLI reference

```text
usage: llmoutput [-h] [--format {text,json}] [--check] [--min-score MIN_SCORE]
                 [--exclude PATH] [--verbose] [--version]
                 [path]

positional arguments:
  path                  File or directory to scan (default: .)

options:
  -h, --help            show this help message and exit
  --format, -f {text,json}
                        Output format
  --check               Exit with code 1 if score below threshold
  --min-score MIN_SCORE Minimum score for --check (default: 60)
  --exclude, -e PATH    Paths to exclude (can repeat)
  --verbose, -v         Verbose output
  --version, -V         Show version
```
## CI integration

GitHub Actions:

```yaml
- name: Check LLM output handling
  run: |
    pip install llmoutput
    llmoutput . --check --min-score 70
```

pre-commit:

```yaml
repos:
  - repo: local
    hooks:
      - id: llmoutput
        name: LLM output safety check
        entry: llmoutput
        language: python
        types: [python]
        args: [--check]
```

## Recognized sanitization patterns

llmoutput recognizes sanitization patterns and won't flag properly validated code:
```python
# ✅ SAFE: Using ast.literal_eval for data parsing
import ast
data = ast.literal_eval(llm_response)

# ✅ SAFE: Parameterized SQL query
cursor.execute("SELECT * FROM users WHERE name = ?", (llm_response,))

# ✅ SAFE: HTML escaping
from html import escape
safe_html = escape(llm_response)

# ✅ SAFE: Path validation
import os.path
filename = os.path.basename(llm_response)  # Strips path traversal
```

## Comparison with runtime validators

| Tool | Type | When | What |
|---|---|---|---|
| Guardrails-AI | Runtime | Execution time | Validates actual LLM responses |
| llmoutput | Static | Build time | Checks if code handles output safely |

Use both: llmoutput in CI/CD to catch missing validation, Guardrails-AI at runtime to validate actual content.
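For the runtime half, even without Guardrails-AI a few lines of stdlib validation go a long way. A minimal sketch (the schema and function name are illustrative): parse the reply as JSON and reject anything outside the expected shape before it reaches application code:

```python
import json

def parse_llm_reply(raw: str) -> dict:
    """Parse an LLM reply that should be a JSON object with known keys."""
    data = json.loads(raw)              # raises ValueError on non-JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    allowed = {"title", "summary"}      # illustrative schema
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unexpected keys: {unknown}")
    if not all(isinstance(v, str) for v in data.values()):
        raise ValueError("all values must be strings")
    return data

reply = '{"title": "Q3 report", "summary": "Revenue grew 4%."}'
parsed = parse_llm_reply(reply)
```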
## Python API

```python
from pathlib import Path

from llmoutput import scan_file, scan_directory, calculate_score

# Scan a file
result = scan_file(Path("app.py"))
print(f"LLM calls found: {result.llm_calls_found}")
for finding in result.findings:
    print(f"{finding.rule}: {finding.message}")

# Scan a directory
results = scan_directory(Path("src/"))
all_findings = [f for r in results for f in r.findings]
score = calculate_score(all_findings)
print(f"Score: {score}/100")
```

## License

MIT License - see LICENSE for details.