feat(m6,ct): rename M6 to "Dark Web Account and Password Leaks" (暗網帳號密碼外洩) and add a similar-certificate statistics section #1
Open
like19970403 wants to merge 1 commit into astroicers:main from
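M6 rename
- module_name: 暗網憑證外洩 ("Dark Web Credential Leaks") → 暗網帳號密碼外洩 ("Dark Web Account and Password Leaks"), because the Chinese word 憑證 is easily confused with TLS certificates
- finding title: 外洩憑證 ("leaked credentials") → 外洩帳號密碼 ("leaked accounts and passwords")
- Updated scoring/weights.py, tests, CHANGELOG, ADR-002, and ADR-003 to match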
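M6 COMB query fix (ProxyNova falsely reporting 10000 hits)
- _check_credential_leaks now filters lines by email suffix instead of blindly trusting count
- It returns a 3-tuple (count, matched_lines, error) that carries out the raw email:password lines
- Finding.evidence now stores the leaked plaintext data for display in the report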
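The suffix filter described above can be sketched roughly as follows. This is a minimal illustration, not the code in darkweb.py: the helper name filter_lines_by_email_suffix is hypothetical, and it assumes each line has the form email:password and that only addresses ending in exactly @<domain> count as hits.

```python
from typing import List, Tuple


def filter_lines_by_email_suffix(lines: List[str], domain: str) -> Tuple[int, List[str]]:
    """Keep only lines whose email part ends with "@<domain>" (hypothetical helper)."""
    suffix = "@" + domain.lower()
    matched: List[str] = []
    for line in lines:
        # Assumed format: "email:password"; split on the first colon only,
        # because the password itself may contain colons.
        email = line.split(":", 1)[0].strip().lower()
        if email.endswith(suffix):
            matched.append(line)  # keep the original line untouched
    # The count is derived from the filtered lines, not from the API's count field.
    return len(matched), matched
```

With this shape, a line such as net-chinese.com.tw@gmail.com:abc contains the domain as a substring but is dropped, because its email part does not end with the @domain suffix.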
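New similar-certificate statistics section (informational, not counted toward the score)
- models: add a Certificate dataclass plus an Assets.certificates field and statistics properties
- discovery: add query_crtsh_certificates(), which fetches CT metadata (crt_id / time / CN / SANs / issuer)
- crt.sh dual-endpoint fallback: when JSON fails, fall back to scraping the HTML table, which fixes the case where an intermittent 502 from the crt.sh output=json backend left us with no data at all
- pipeline Step 1c puts the certificates into Assets
- report.html gains a "Similar Certificate Statistics (Certificate Transparency)" section: stat cards (total / wildcard / latest / earliest) + issuer distribution table + certificate detail table
- The section is placed after the "Asset Inventory" section as supporting attack-surface intelligence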
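To make the model change concrete, here is one way the dataclass and its statistics could look. It is only a sketch: the property names follow the ones listed in this PR (total_certificates, wildcard_certificates, latest_certificate, earliest_certificate, issuer_stats), while the field names, the is_wildcard helper, and the choice of not_before as the ordering timestamp are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class Certificate:
    # Field names are illustrative guesses at the CT metadata listed above.
    crt_id: int
    not_before: datetime
    common_name: str
    sans: List[str] = field(default_factory=list)
    issuer: str = ""

    @property
    def is_wildcard(self) -> bool:
        # Hypothetical helper: wildcard if the CN or any SAN starts with "*."
        return any(name.startswith("*.") for name in [self.common_name, *self.sans])


@dataclass
class Assets:
    certificates: List[Certificate] = field(default_factory=list)

    @property
    def total_certificates(self) -> int:
        return len(self.certificates)

    @property
    def wildcard_certificates(self) -> int:
        return sum(1 for cert in self.certificates if cert.is_wildcard)

    @property
    def latest_certificate(self) -> Optional[Certificate]:
        # None when no certificates were collected.
        return max(self.certificates, key=lambda c: c.not_before, default=None)

    @property
    def earliest_certificate(self) -> Optional[Certificate]:
        return min(self.certificates, key=lambda c: c.not_before, default=None)

    @property
    def issuer_stats(self) -> Dict[str, int]:
        # Issuer name -> number of certificates, most frequent first.
        return dict(Counter(c.issuer for c in self.certificates).most_common())
```

Keeping the statistics as read-only properties means the report template can read them directly from Assets without a separate aggregation step.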
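Report layout improvements
- A Finding's evidence that spans multiple lines (such as a leaked account/password list) is automatically rendered as an expandable <details> block, so customers can get the plaintext directly
- The asset inventory gains a disclaimer: passive intelligence sources may report hosts that are already offline, internal-only, or left over from migrations; a "-" in a column means no data was available for that item at scan time
- The certificate detail table now uses table-layout: fixed + word-break so narrow columns are not truncated in the PDF
- @media print switches the whole document to A4 landscape so everything prints in a uniform landscape layout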
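The multi-line evidence rule can be illustrated in plain Python. The report itself is an HTML template, so this is not the template code: render_evidence, the CSS class name, and the summary wording are all hypothetical. It only shows the branching (one line stays inline, several lines become a collapsible block) and the escaping of leaked plaintext before it reaches the page.

```python
import html


def render_evidence(evidence: str) -> str:
    """Render evidence inline, or as a collapsible block when it has several lines."""
    lines = evidence.splitlines()
    if len(lines) <= 1:
        # Single-line evidence stays inline.
        return f'<span class="evidence">{html.escape(evidence)}</span>'
    # Leaked passwords can contain <, > or &, so escape before embedding.
    body = html.escape("\n".join(lines))
    return (
        '<details class="evidence">'
        f"<summary>{len(lines)} lines</summary>"
        f"<pre>{body}</pre>"
        "</details>"
    )
```

Escaping matters here because the evidence is attacker-influenced data; without it, a password containing markup could break the report layout.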
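Tests
- Add test_html_fallback_when_json_502: verifies that the HTML parser correctly parses the crt.sh table when the JSON endpoint is down
- Update _mock_comb_response so that lines reflects the real post-filter count
- test_check_credential_leaks_filters_by_email_suffix verifies that substring false positives are filtered out and that matched_lines keeps the original lines
Full suite: 382 tests passed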
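The scenario exercised by test_html_fallback_when_json_502 can be sketched without any network access by injecting the two fetchers as callables. Everything here is an assumption for illustration: fetch_certificates, parse_html_rows, and the simplified three-column table (ID, common name, issuer) are hypothetical and do not reproduce the real crt.sh page layout or the project's parser.

```python
import json
from html.parser import HTMLParser
from typing import Callable, Dict, List, Optional


class _TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: Optional[List[str]] = None
        self._cell: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_html_rows(page: str) -> List[Dict[str, str]]:
    parser = _TableRows()
    parser.feed(page)
    # Assumed (simplified) column order: ID, common name, issuer.
    return [
        {"crt_id": r[0], "common_name": r[1], "issuer": r[2]}
        for r in parser.rows
        if len(r) >= 3
    ]


def fetch_certificates(fetch_json: Callable[[], str], fetch_html: Callable[[], str]):
    """Try the JSON source first; on a transport or decode error, scrape the HTML."""
    try:
        return json.loads(fetch_json())
    except (OSError, ValueError):
        return parse_html_rows(fetch_html())
```

A test can then pass a JSON fetcher that raises OSError("502 Bad Gateway") and an HTML fetcher that returns a canned table, and assert on the parsed rows.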
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
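Summary
This PR fixes two issues that could previously mislead customers and adds one informational section:
- ProxyNova's count covers the full set of substring matches (capped at 10000) and includes many false positives unrelated to the target domain. In a real test, net-chinese.com.tw showed a spurious 10000 while only 2 lines were genuine @domain hits. The check now filters lines by email suffix.
Changes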
M6 rename
- darkweb.py: module_name and finding title strings
- scoring/weights.py
- CHANGELOG.md, ADR-002, ADR-003, tests
M6 COMB filter fix
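- _check_credential_leaks now filters lines by email @domain suffix
- It returns a 3-tuple (count, matched_lines, error); the raw email:password is carried into Finding.evidence so that customers can get the plaintext directly and reset the affected credentials
Similar-certificate statistics (new feature)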
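- models/assets.py: add a Certificate dataclass plus an Assets.certificates field and statistics properties (total_certificates / wildcard_certificates / latest_certificate / earliest_certificate / issuer_stats)
- discovery/web_sources.py: add query_crtsh_certificates(), which returns CT metadata
- The crt.sh JSON endpoint (output=json) has proven to be an unstable source in this project's real-world use and often returns 502. An HTML table scrape fallback has been added (two HTML paths: Identity=%.<domain> and q=<domain>&group=none), so complete certificate data is still available when JSON fails
- discovery/pipeline.py: add Step 1c ct_certificates, which puts the data into Assets
Report layout improvements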
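- evidence that contains multiple lines is automatically rendered as an expandable <details> (red styling, suited to showing leaked plaintext)
- The certificate detail table uses table-layout: fixed + word-break so narrow columns are not cut off in the PDF
- @media print switches the whole document to A4 landscape, so customer PDFs are no longer cut off by the narrow portrait layout
Test plan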
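- [...] (test_html_fallback_when_json_502)
- [...] user@net-chinese.com.tw:password plaintext
Known follow-ups
- query_crtsh_certificates currently returns an empty result if both the JSON and HTML endpoints fail; there is no cross-scan cache fallback. A possible addition: "if this scan returns nothing, reuse the previous scan's result."
🤖 Generated with Claude Code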