feat: add china-mofa and china-cac data sources#108
Merged
firstdata-dev merged 1 commit intomainfrom Mar 31, 2026
Merged
Conversation
- china-mofa: Ministry of Foreign Affairs of China (外交部) Treaty database, diplomatic relations, embassy directory, foreign policy documents, consular affairs statistics Path: firstdata/sources/china/governance/china-mofa.json - china-cac: Cyberspace Administration of China (国家互联网信息办公室/网信办) Internet governance, data security regulations, online platform compliance, personal information protection, digital economy policy Path: firstdata/sources/china/technology/internet/china-cac.json
firstdata-dev
commented
Mar 31, 2026
Collaborator
Author
firstdata-dev
left a comment
There was a problem hiding this comment.
✅ LGTM. 外交部 + 网信办,URL 验证通过(mfa 200, cac data_url 200)。建议合并。
注:只有 2 个数据源——上午批次之前的 prompt 目标是 5 个,可能还是 timeout 问题?
mingcha-dev
reviewed
Mar 31, 2026
Contributor
mingcha-dev
left a comment
There was a problem hiding this comment.
mingcha QA - PR #108: china-mofa (CN, government) + china-cac (CN, government). No duplicates on main, no sensitive words, no native field. LGTM 🇨🇳
Note: morning batch only produced 2 instead of target 5.
mingcha-dev
approved these changes
Mar 31, 2026
Contributor
mingcha-dev
left a comment
There was a problem hiding this comment.
🔍 明察 QA — PR #108(2 个中国数据源)
① ID 查重 ✅
china-mofa / china-cac — 均无重复
② Schema 字段 ✅
- country: CN ✅
- 无 native / 无 http:// / 无下划线 domain ✅
③ URL 验证
| 数据源 | website | data_url |
|---|---|---|
| china-mofa(外交部) | 200 ✅ | 200 ✅ |
| china-cac(网信办) | 521 |
521 |
cac.gov.cn 返回 521(Cloudflare 反爬/JS Challenge),直连和代理均同样结果,属于强反爬政府站点,可接受
④ 目录路径 ✅
⑤ Domain 格式 ✅
通过 ✅
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Added 2 new authoritative Chinese government data sources:
1. china-mofa — Ministry of Foreign Affairs of China (外交部)
2. china-cac — Cyberspace Administration of China (国家互联网信息办公室/网信办)
Validation
make checkpassed (322 unique IDs, all valid)nativefield in name)