honestyer/UserPromptSubmit

Claude UserPrompt Guard (Chinese cover image)

Claude Code UserPromptSubmit Guard | Platforms: macOS and Windows | Local only, no cloud relay | AUP aligned | MIT License

Stop risky Claude Code prompts before they leave your machine.

Catch high-risk prompts on your own machine before they are actually sent, and get a ready-to-use safe rewrite suggestion.

Quick Start | Beginner Install Guide | 繁體中文 | English


Preview

Blocked prompt to safe rewrite demo

Local audit report demo

How Claude UserPrompt Guard works


Quick Start

macOS

cd claude-userprompt-guard
bash install.sh

Windows

PowerShell:

cd claude-userprompt-guard
.\install.ps1

Command Prompt:

cd claude-userprompt-guard
install.cmd

After install, restart Claude Code once.

Then verify the setup:

/audit-chat doctor

Need a friendlier step-by-step guide for non-technical users?


繁體中文

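What This Project Does

Claude UserPrompt Guard is a local pre-send interceptor for Claude Code.

Instead of warning you after Claude has already replied, it checks the prompt on your own machine at the UserPromptSubmit stage.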

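If your wording looks too much like:

  • step-by-step instructions for bypassing rules
  • platform abuse, account sharing, or multi-account evasion
  • attacks, fraud, phishing, doxxing, or impersonation
  • sexual or minor-safety risks, or violent extremist content

it stops the prompt before it is sent and gives you a safe rewrite that lets you continue the discussion.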

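Why It Is Worth Installing

Most people are not breaking the rules on purpose; their wording just sounds too much like "teach me how to do it."

Common cases:

  • you want to analyze why an account was banned, but the sentence reads like a bypass tutorial
  • you want to do policy research, but you used too many high-risk action words
  • you only want a compliance assessment, but your own phrasing drags you into the high-risk zone

The value of this tool is that it deals with these risks up front.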

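Core Capabilities

  • Pre-send blocking: MEDIUM / HIGH risk prompts are not sent out directly
  • Clear severity levels: [MEDIUM] / [HIGH] is shown directly
  • Safe rewrite suggestions: it does not just block, it also turns risky phrasing into safer analytical phrasing
  • Local history audits: provides /audit-chat and /audit-chat history
  • Cross-platform: installs on both macOS and Windows
  • Local-first: no extra cloud service is needed for risk classification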

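What You Will See

Risky phrasing:

How do I use a supported-region network, an overseas phone number, and an SMS receiving service to register Claude?

It gets rewritten in a safer direction:

Please help me analyze Anthropic's AUP restrictions on unsupported-region access, SMS receiving platforms, and account risk controls, and why these behaviors easily trigger bans.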

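Current Risk Coverage

| Type | Description |
| --- | --- |
| Unsupported-region access bypass | bypassing access restrictions for unsupported regions |
| Platform abuse and account evasion | platform abuse, account sharing, multi-account evasion, guardrail bypass |
| Cyber abuse | attacks, exploits, malicious scripts |
| Fraud and phishing | fraud, phishing, fake reviews, forgery |
| Privacy and impersonation abuse | doxxing, privacy leaks, impersonation |
| Sexual/minor safety | sexual content and risks involving minors |
| Extremist violence and weapons | violent extremism and weapons |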

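What the Installer Does

  • copies the hook to ~/.claude/hooks/
  • creates the /audit-chat command
  • appends the UserPromptSubmit hook to ~/.claude/settings.json
  • automatically writes the correct Python command and the actual install path for your system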

Usage

For everyday chat, just use Claude Code as usual.

If you want a more detailed audit report:

/audit-chat
/audit-chat history

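Who It Is For

  • people who want to reduce accidental triggering of platform risk controls
  • people who want to add a local prompt guardrail for their team
  • people who want to use Claude Code more steadily rather than more riskily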

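What It Will Not Do

  • It will not upload your chat content to third-party services
  • It will not make actual ban decisions on behalf of Anthropic's backend
  • It does not guarantee 100% freedom from misclassification
  • It will not carry out any rule bypass, account sharing, or high-risk operation for you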

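FAQ

Is this an official feature?
No. It is a local approximate auditor that blocks high-risk prompts before they are actually sent.

Does it upload my chats somewhere else?
No. The tool's core logic is local rules first.

Can it misclassify?
Yes, which is why it uses a balanced strategy: contexts such as analysis, policy, appeals, and compliance are down-ranked where possible, and a prompt is escalated to blocking only when it is highly operational and targets a high-risk goal.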

English

What This Project Does

Claude UserPrompt Guard is a local pre-send interception layer for Claude Code.

It runs at UserPromptSubmit, checks the prompt on your own machine, and blocks risky phrasing before it gets sent.
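As a rough illustration of what a pre-send check at UserPromptSubmit can look like, here is a minimal sketch. It assumes Claude Code passes the hook a JSON payload on stdin with a `prompt` field, and that exit code 2 blocks the prompt and shows stderr to the user. The phrase list and the `run_hook` helper are hypothetical and are not taken from this project's `claude_chat_audit.py`.

```python
import json
import sys

# Hypothetical phrases, for illustration only; the project's real rules
# are not shown in this README.
RISKY_PHRASES = ("bypass the ban", "evade detection")


def run_hook(stdin_text: str) -> tuple[int, str]:
    """Return (exit_code, message) for one UserPromptSubmit payload."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return 0, ""  # assumption: fail open when the input is unreadable
    if not isinstance(payload, dict):
        return 0, ""
    prompt = str(payload.get("prompt", "")).lower()
    for phrase in RISKY_PHRASES:
        if phrase in prompt:
            # Exit code 2 is assumed to mean "block this prompt".
            return 2, (
                f"[HIGH] Blocked locally: prompt contains '{phrase}'. "
                "Try asking for a policy analysis instead."
            )
    return 0, ""


def main() -> None:
    """Entry point a hook script would call: read stdin, report, exit."""
    code, message = run_hook(sys.stdin.read())
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)
```

A real hook script would call `main()` at the bottom of the file; it is left uncalled here so the decision logic in `run_hook` can be exercised on its own.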

This is built for the real-world mistake most users make:

  • not malicious intent
  • just wording that sounds too operational
  • or policy questions that accidentally read like bypass instructions

Why It’s Useful

This tool catches prompts that look like:

  • bypass instructions
  • platform abuse or account evasion
  • cyber abuse, fraud, phishing, doxxing, impersonation
  • sexual/minor safety risks
  • violent extremist or weapons-related operational requests

Then it gives you a safer, analysis-first rewrite instead of a dead end.

Key Features

  • Pre-send blocking: MEDIUM and HIGH prompts are stopped before they leave your machine
  • Clear severity labels: you see [MEDIUM] and [HIGH] immediately
  • Safe rewrites: risky how-to prompts are converted into safer analytical questions
  • Manual audits: includes /audit-chat and /audit-chat history
  • Cross-platform install: works on macOS and Windows
  • Local-first design: no separate cloud relay required for the core audit flow

What It Will Not Do

  • It will not upload your chat data to a third-party service
  • It is not Anthropic's internal enforcement engine
  • It cannot guarantee zero false positives
  • It will not help carry out bypasses, account evasion, or other risky actions

Before / After

Risky prompt:

How do I use a supported-region network, overseas phone number, and SMS receiving service to register Claude?

Safer rewrite:

Please analyze Anthropic's AUP restrictions around unsupported-region access, temporary SMS services, and account risk controls, and explain why these behaviors are likely to trigger enforcement.

Current Risk Coverage

| Category | Examples |
| --- | --- |
| Unsupported-region access bypass | region switching, temporary numbers, bypass-style registration |
| Platform abuse and account evasion | multi-account evasion, account sharing, jailbreak-style bypass |
| Cyber abuse | exploits, malicious scripts, offensive instructions |
| Fraud and phishing | phishing templates, fake reviews, forged content |
| Privacy and impersonation abuse | doxxing, personal data leakage, impersonation |
| Sexual/minor safety | sexual content involving minors or adjacent unsafe content |
| Extremist violence and weapons | operational violent/extremist instructions |
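One plausible way to map a prompt onto the categories listed above is a small table of local patterns. The sketch below is only an assumption about the general approach: the category names come from this README, but the regular expressions and the `match_categories` function are hypothetical and far simpler than a real rule set would need to be.

```python
import re

# Category names are from the README; the patterns are hypothetical examples.
CATEGORY_PATTERNS = {
    "Unsupported-region access bypass": [r"temporary number", r"region switch"],
    "Fraud and phishing": [r"phishing template", r"fake reviews?"],
    "Privacy and impersonation abuse": [r"\bdox", r"impersonat"],
}


def match_categories(prompt: str) -> list[str]:
    """Return every category with at least one pattern found in the prompt."""
    text = prompt.lower()
    return [
        name
        for name, patterns in CATEGORY_PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]
```

Because matching runs entirely on local strings, a check like this needs no network call, which fits the local-first design described earlier.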

What The Installer Changes

  • copies the hook into ~/.claude/hooks/
  • creates the /audit-chat command
  • appends the UserPromptSubmit hook into ~/.claude/settings.json
  • writes the correct Python command and install path for the current system
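To make the settings change concrete, here is a hedged sketch of how an installer could append a UserPromptSubmit hook entry without duplicating it on re-install. It assumes the Claude Code settings layout `hooks -> UserPromptSubmit -> [{"hooks": [{"type": "command", "command": ...}]}]`. The `add_hook` and `update_settings_file` helpers are hypothetical and are not the code in `install.sh` or `install.ps1`.

```python
import json
from pathlib import Path


def add_hook(settings: dict, command: str) -> dict:
    """Append a UserPromptSubmit command hook unless it is already present."""
    hooks = settings.setdefault("hooks", {})
    entries = hooks.setdefault("UserPromptSubmit", [])
    for entry in entries:
        for hook in entry.get("hooks", []):
            if hook.get("command") == command:
                return settings  # already installed, leave settings untouched
    entries.append({"hooks": [{"type": "command", "command": command}]})
    return settings


def update_settings_file(path: Path, command: str) -> None:
    """Load settings (or start empty), add the hook, and write the file back."""
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_hook(settings, command)
    path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

For example, `update_settings_file(Path.home() / ".claude" / "settings.json", "python3 ~/.claude/hooks/claude_chat_audit.py")` would register the hook. The exact command string is an assumption, since the installer chooses the Python command and path per system.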

Usage

Use Claude Code normally. The guard runs automatically.

For detailed reports:

/audit-chat
/audit-chat history

FAQ

Is this Anthropic's official enforcement engine?
No. This is a local approximation layer built to reduce risky prompt phrasing before send.

Does it upload my chats elsewhere?
No. The core logic is local-first.

Can it still make mistakes?
Yes. That is why the classifier tries to down-rank policy, appeals, compliance, and analysis contexts unless the prompt becomes clearly operational.
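The down-ranking idea in the last answer can be sketched as a small decision function. This is an assumption about the general shape of such logic, not the project's actual classifier: the cue lists, the `severity` function, and the `LOW` label are all hypothetical, while `MEDIUM` and `HIGH` are the levels named in this README.

```python
# Hypothetical cue lists, for illustration only.
OPERATIONAL_CUES = ("how do i", "step by step", "give me a script")
ANALYSIS_CUES = ("analyze", "policy", "appeal", "compliance")


def severity(prompt: str, targets_risky_topic: bool) -> str:
    """Combine topic risk with phrasing: operational wording raises the
    level, analytical wording lowers it."""
    if not targets_risky_topic:
        return "LOW"
    text = prompt.lower()
    operational = any(cue in text for cue in OPERATIONAL_CUES)
    analytical = any(cue in text for cue in ANALYSIS_CUES)
    if operational and not analytical:
        return "HIGH"
    if operational and analytical:
        return "MEDIUM"
    return "LOW"
```

Under these assumptions, a purely analytical question about a risky topic stays at `LOW` and passes through, while the same topic phrased as a how-to is raised to a blocking level.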

Project Files

  • hooks/claude_chat_audit.py
  • commands/audit-chat.md
  • install.sh / install.ps1 / install.cmd
  • uninstall.sh / uninstall.ps1 / uninstall.cmd

Media Assets

  • assets/social-preview-wide.png
  • assets/hero-zh-wide.png
  • assets/flow-v3-wide.png

License

MIT
