Stop risky Claude Code prompts before they leave your machine.
Intercept high-risk prompts on your own machine before they are actually sent, and get a usable safe-rewrite suggestion right away.
Quick Start • Beginner Install Guide • 繁體中文 • English
```bash
cd claude-userprompt-guard
bash install.sh
```

PowerShell:

```powershell
cd claude-userprompt-guard
.\install.ps1
```

Command Prompt:

```cmd
cd claude-userprompt-guard
install.cmd
```

After install, restart Claude Code once.
Then verify the setup:
```
/audit-chat doctor
```
Need a friendlier step-by-step guide for non-technical users?
Claude UserPrompt Guard is a local pre-send interceptor for Claude Code.
Instead of warning you after Claude has already replied, it checks the prompt on your own machine at the UserPromptSubmit stage.
If your phrasing looks too much like:
- instructions for bypassing rules
- platform abuse, account sharing, or multi-account evasion
- attacks, fraud, phishing, doxxing, or impersonation
- sexual content involving minors, or violent extremist content

it blocks the prompt before it is sent and gives you a safe rewrite you can continue the discussion with.
Most people are not violating rules on purpose; their phrasing just sounds too much like "teach me how to do it".
Common cases:
- you want to analyze why an account was banned, but the sentence reads like a bypass tutorial
- you want to do policy research, but you used too many high-risk action verbs
- you only want a compliance judgment, but your own wording drags you into the high-risk zone

The value of this tool is that it absorbs these risks before they happen.
- Pre-send blocking: `MEDIUM`/`HIGH` risk prompts are not sent out directly
- Clear severity labels: `[MEDIUM]`/`[HIGH]` are shown directly
- Safe rewrite suggestions: instead of just blocking, dangerous phrasing is rewritten into safer, analysis-style questions
- Local history audits: provides `/audit-chat` and `/audit-chat history`
- Cross-platform: installs on both macOS and Windows
- Local-first: no extra cloud service is needed for risk classification
Risky prompt:
How do I use a supported-region network, an overseas phone number, and an SMS receiving service to register Claude?
It gets rewritten into a safer direction:
Please analyze the restrictions in Anthropic's AUP around unsupported-region access, SMS receiving platforms, and account risk controls, and why these behaviors are likely to trigger bans.
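The rewrite above follows a simple pattern: strip the operational "how do I" framing and re-frame the same topic as a policy-analysis request. A minimal sketch of that idea (the trigger phrases and rewrite template below are illustrative assumptions, not the tool's actual logic):

```python
# Illustrative lead-in phrases that mark a prompt as operational "how-to".
RISKY_LEADS = ("how do i", "how to", "teach me")

def safe_rewrite(prompt: str) -> str:
    """Re-frame an operational 'how do I X' prompt as an analysis request.

    Leaves prompts without a risky lead-in unchanged.
    """
    lowered = prompt.lower()
    for lead in RISKY_LEADS:
        if lowered.startswith(lead):
            # Keep the topic, drop the operational framing.
            topic = prompt[len(lead):].strip(" ?")
            return (f"Please analyze the policy restrictions around {topic}, "
                    f"and explain why this behavior is likely to trigger enforcement.")
    return prompt
```

The real hook is richer than a prefix check, but the shape of the transformation is the same: operational question in, analytical question out.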
| Category | Description |
|---|---|
| Unsupported-region access bypass | bypassing access restrictions in unsupported regions |
| Platform abuse and account evasion | platform abuse, account sharing, multi-account evasion, guardrail bypass |
| Cyber abuse | attacks, exploits, malicious scripts |
| Fraud and phishing | fraud, phishing, fake reviews, forgery |
| Privacy and impersonation abuse | doxxing, privacy leaks, impersonation |
| Sexual/minor safety | sexual content and minor-safety risks |
| Extremist violence and weapons | violent extremism and weapons |
- copies the hook into `~/.claude/hooks/`
- creates the `/audit-chat` command
- appends the `UserPromptSubmit` hook to `~/.claude/settings.json`
- automatically writes the correct Python command and actual install path for your system
For day-to-day chat, just use Claude Code as usual.
If you want a more detailed audit report:

```
/audit-chat
/audit-chat history
```
- people who want to lower the chance of accidentally tripping platform risk controls
- people who want to add a local prompt guardrail for their team
- people who want to use Claude Code more reliably, not more recklessly
- it will not upload your chat content to third-party services
- it does not make the actual ban decisions that Anthropic's backend makes
- it cannot guarantee zero false positives
- it will not carry out rule bypasses, account sharing, or other high-risk actions for you
Is this an official feature?
No. It is a local approximate auditor that blocks high-risk prompts before they are actually sent.

Will it upload my chats somewhere else?
No. The core logic of this tool is local-rules-first.

Can it misjudge?
Yes, which is why it uses a balanced strategy: analysis, policy, appeal, and compliance contexts are down-ranked as much as possible, and a prompt is only escalated to a block when it is clearly operational and targets something high-risk.

Claude UserPrompt Guard is a local pre-send interception layer for Claude Code.
It runs at UserPromptSubmit, checks the prompt on your own machine, and blocks risky phrasing before it gets sent.
This is built for the real-world mistake most users make:
- not malicious intent
- just wording that sounds too operational
- or policy questions that accidentally read like bypass instructions
This tool catches prompts that look like:
- bypass instructions
- platform abuse or account evasion
- cyber abuse, fraud, phishing, doxxing, impersonation
- sexual/minor safety risks
- violent extremist or weapons-related operational requests
Then it gives you a safer, analysis-first rewrite instead of a dead end.
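The blocking step itself can be sketched as a small hook entry point. This assumes the common Claude Code hook convention of receiving a JSON payload on stdin and blocking by exiting with code 2 with a message on stderr; the `prompt` field name and exit-code semantics here are assumptions about that interface, not verified behavior:

```python
import json
import sys

def run_hook(stdin_text: str, classify) -> int:
    """Read the hook payload, classify the prompt, and pick an exit code.

    Returns 0 to let the prompt through, or 2 to block it before send
    (the stderr message is what the user would see). The payload field
    names are assumptions about the hook interface.
    """
    payload = json.loads(stdin_text)
    prompt = payload.get("prompt", "")
    level = classify(prompt)
    if level in ("MEDIUM", "HIGH"):
        print(f"[{level}] blocked before send; try an analysis-first rewrite.",
              file=sys.stderr)
        return 2
    return 0
```

Keeping the classifier a plain function passed in makes the interception layer itself trivial to test without Claude Code running.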
- Pre-send blocking: `MEDIUM` and `HIGH` prompts are stopped before they leave your machine
- Clear severity labels: you see `[MEDIUM]` and `[HIGH]` immediately
- Safe rewrites: risky how-to prompts are converted into safer analytical questions
- Manual audits: includes `/audit-chat` and `/audit-chat history`
- Cross-platform install: works on macOS and Windows
- Local-first design: no separate cloud relay required for the core audit flow
- It will not upload your chat data to a third-party service
- It is not Anthropic's internal enforcement engine
- It cannot guarantee zero false positives
- It will not help carry out bypasses, account evasion, or other risky actions
Risky prompt:
How do I use a supported-region network, overseas phone number, and SMS receiving service to register Claude?
Safer rewrite:
Please analyze Anthropic's AUP restrictions around unsupported-region access, temporary SMS services, and account risk controls, and explain why these behaviors are likely to trigger enforcement.
| Category | Examples |
|---|---|
| Unsupported-region access bypass | region switching, temporary numbers, bypass-style registration |
| Platform abuse and account evasion | multi-account evasion, account sharing, jailbreak-style bypass |
| Cyber abuse | exploits, malicious scripts, offensive instructions |
| Fraud and phishing | phishing templates, fake reviews, forged content |
| Privacy and impersonation abuse | doxxing, personal data leakage, impersonation |
| Sexual/minor safety | sexual content involving minors or adjacent unsafe content |
| Extremist violence and weapons | operational violent/extremist instructions |
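The balance described elsewhere in this README (operational phrasing escalates severity, analytical context de-escalates it) can be approximated with a small scoring rule. The keyword sets and thresholds below are illustrative assumptions, not the tool's actual rule table:

```python
import re

# Illustrative patterns only; the real hook's rules are more elaborate.
OPERATIONAL = re.compile(
    r"\b(how do i|step[- ]by[- ]step|bypass|evade|register with)\b", re.I)
HIGH_RISK_TARGETS = re.compile(
    r"\b(sms receiving|multi[- ]account|phishing|doxx|exploit)\b", re.I)
ANALYTICAL = re.compile(
    r"\b(analyze|policy|compliance|appeal|why|explain)\b", re.I)

def classify(prompt: str) -> str:
    """Return LOW / MEDIUM / HIGH for a prompt, favoring analysis contexts."""
    score = 0
    if OPERATIONAL.search(prompt):
        score += 1
    if HIGH_RISK_TARGETS.search(prompt):
        score += 1
    if ANALYTICAL.search(prompt):
        score -= 1  # down-rank policy/appeals/compliance/analysis phrasing
    if score >= 2:
        return "HIGH"
    if score == 1:
        return "MEDIUM"
    return "LOW"
```

Note how an operational phrasing plus a high-risk target is needed before a prompt reaches `HIGH`, while an analytical framing can pull the same topic back down.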
- copies the hook into `~/.claude/hooks/`
- creates the `/audit-chat` command
- appends the `UserPromptSubmit` hook into `~/.claude/settings.json`
- writes the correct Python command and install path for the current system
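The settings-update step can be sketched as a small merge routine. The nested `hooks` / `UserPromptSubmit` structure below follows the general shape of Claude Code's hook configuration, but the exact field names should be treated as an assumption that may differ between versions:

```python
import json
from pathlib import Path

def register_hook(settings_path: Path, python_cmd: str, hook_path: Path) -> dict:
    """Append a UserPromptSubmit hook entry to an existing settings.json.

    Existing settings are preserved; only the hook entry is appended.
    The JSON shape is an assumption about Claude Code's hook config.
    """
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())
    entry = {
        "hooks": [
            {"type": "command", "command": f"{python_cmd} {hook_path}"}
        ]
    }
    settings.setdefault("hooks", {}).setdefault("UserPromptSubmit", []).append(entry)
    settings_path.write_text(json.dumps(settings, indent=2))
    return settings
```

Appending rather than overwriting matters here: users may already have other hooks registered in the same file.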
Use Claude Code normally. The guard runs automatically.
For detailed reports:
```
/audit-chat
/audit-chat history
```
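Assuming the hook keeps a local JSONL log of its own decisions (the file name and record shape here are assumptions), `/audit-chat history` style reporting reduces to a small local aggregation:

```python
import json
from collections import Counter
from pathlib import Path

def summarize_audit_log(log_path: Path) -> Counter:
    """Count audit decisions per severity level from a local JSONL log.

    Assumes one JSON object per line with a "level" field; both the log
    location and the record shape are illustrative assumptions.
    """
    counts = Counter()
    if not log_path.exists():
        return counts
    for line in log_path.read_text().splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        counts[record.get("level", "LOW")] += 1
    return counts
```

Everything stays on disk on your own machine, which is what makes the "local history audits" feature possible without a cloud relay.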
Is this Anthropic's official enforcement engine?
No. This is a local approximation layer built to reduce risky prompt phrasing before send.

Does it upload my chats elsewhere?
No. The core logic is local-first.

Can it still make mistakes?
Yes. That is why the classifier tries to down-rank policy, appeals, compliance, and analysis contexts unless the prompt becomes clearly operational.

- `hooks/claude_chat_audit.py`
- `commands/audit-chat.md`
- `install.sh` / `install.ps1` / `install.cmd`
- `uninstall.sh` / `uninstall.ps1` / `uninstall.cmd`
- `assets/social-preview-wide.png`
- `assets/hero-zh-wide.png`
- `assets/flow-v3-wide.png`
MIT



