fix(security): rate-limit + validate POST /api/feedback (#96) by voidborne-d · Pull Request #149 · tanweai/pua

voidborne-d · 2026-04-24T19:42:08Z

Fixes #96.

Summary

Issue #96 reports that POST /api/feedback (landing/functions/api/feedback.ts) accepts arbitrary JSON with no authentication, rate limiting, or size caps, so a single script can spam the D1 feedback table and exhaust storage via oversized session_data.

This change keeps the endpoint anonymous (the stop-feedback.sh hook calls it from bash with no session) but closes the abuse vectors in the issue:

Per-IP rate limit via a new feedback_rate_limits table — fixed 1-hour buckets, 30 POST/hour per CF-Connecting-IP, 5/hour when the header is missing so a stripping proxy can't be used as a bypass. Responses carry X-RateLimit-Limit / X-RateLimit-Remaining + Retry-After. Old rows are swept lazily on ~1% of writes so the table stays bounded without a cron trigger.
Size caps on the full body (2 MiB), session_data (1 MiB, well below upload.ts's 50 MB path), and each short field (rating / pua_level / flavor ≤ 64 B, task_summary ≤ 2 KiB). Checked via TextEncoder byte length — JS .length over-counts CJK.
Type + range validation on every field. Non-integer or out-of-range numerics (negative, > 100 000) return 400 with the offending field name instead of being silently coerced to 0.
Content-Length above the body cap short-circuits to 413 before the body is read, so the Workers process never pays the cost of an oversize upload.

Why no CSRF tokens

CORS is * by design — the skill's bash hooks call the API from anywhere and no cookies are read, so CSRF protection would be cargo-culting. The concrete attacks in the issue (database abuse, storage exhaustion, data pollution) are all covered by rate limit + size caps.

Impact on existing callers

The real payload from hooks/stop-feedback.sh ({"rating":"...","pua_count":0,"flavor":"阿里","task_summary":"..."} plus the optional session_data from the sanitized JSONL session) fits comfortably inside every cap — a full session_data upload runs ≈ tens of KiB in practice, 4+ orders of magnitude under the 1 MiB cap. GET aggregate-stats path is unchanged.

Schema

New migration landing/migrations/0003_feedback_rate_limit.sql creates the rate-limit table (composite PK on (ip, window_start) so upserts land atomically under SQLite semantics).

Changes

landing/functions/api/_ratelimit.ts (new) — D1-backed fixed-window limiter, lazy sweep.
landing/functions/api/feedback.ts — rewritten validation path, preserves the GET fall-through for empty/stripped bodies (custom-domain quirk documented in upload.ts).
landing/migrations/0003_feedback_rate_limit.sql (new).
landing/functions/api/feedback.test.ts (new) — 20 tests.
landing/src/test/setup.ts (new, 1 line) — vitest.config.ts already points here via setupFiles, so adding it fixes a latent config hole.

Test plan

All 20 tests green via npx vitest run (≈ 16 ms):

 ✓ functions/api/feedback.test.ts (20 tests)
     ✓ accepts a minimal valid body
     ✓ rejects missing rating
     ✓ rejects wrong types
     ✓ rejects arrays disguised as objects at handler level
     ✓ rejects oversize session_data (byte-length)
     ✓ rejects oversize task_summary
     ✓ enforces byte-length for CJK text
     ✓ rejects negative and out-of-range numerics
     ✓ accepts realistic hook payload
     ✓ inserts a valid feedback row and returns 200
     ✓ returns 400 for missing rating
     ✓ returns 400 for oversize session_data
     ✓ returns 413 when Content-Length exceeds MAX_BODY_BYTES
     ✓ returns 400 for invalid JSON
     ✓ returns 400 for arrays and non-objects
     ✓ rate-limits after RATE_LIMIT_PER_HOUR POSTs from same IP
     ✓ applies a stricter limit when CF-Connecting-IP is missing
     ✓ returns Retry-After on 429
     ✓ GET returns aggregate stats (legacy)
     ✓ OPTIONS preflight returns 204 with CORS headers
 Test Files  1 passed (1)
      Tests  20 passed (20)

Tests cover:

validateFeedback (pure): minimal-valid body, missing rating, type mismatches, oversize session_data / task_summary, CJK byte-length enforcement (25× 好 = 75 B overruns a 16 B cap even though .length is 25), negative / non-integer / out-of-range numerics, realistic hook payload.
onRequest vs an in-memory FakeDB: 200 path, 400 on bad body / missing rating / invalid JSON / arrays & non-objects, 413 on oversize Content-Length, rate-limit cap enforcement on the named-IP path (exactly 30 allowed, 3 extra blocked) and unknown-IP path (5 allowed), Retry-After + X-RateLimit-Limit headers on 429, GET aggregate stats preserved, OPTIONS preflight 204.

Before / after

Repro from issue #96 (spam loop with 50 KB session_data) before:

$ for i in $(seq 1 1000); do curl -s -X POST .../api/feedback ... ; done
# all 1000 succeed, 1000 rows inserted, ~50 MB stored

After:

# first request: 400 "session_data exceeds 1048576 bytes"
# or with smaller payload:
# requests 1-30: 200 {ok: true}
# request 31+: 429 {"error":"rate limit exceeded — try again later"}
#              Retry-After: 2837

Other gates

npm run build — clean (tsc -b && vite build both succeed, bundle size unchanged).
npm run lint on my three new/changed files — clean. The 7 remaining lint errors (_sanitize.ts:107:43, auth/github.ts:5:63, leaderboard.ts:124:9, etc.) are all pre-existing on files not touched here.

AI agent disclosure

Generated with Claude Code. Reviewer-check targets: rate-limit windowing math (fixed buckets, not sliding — deliberate trade-off documented in _ratelimit.ts), the Content-Length short-circuit (lying Content-Length is caught by the second-stage byte-length check), and the CF-Connecting-IP bypass mitigation.

Issue tanweai#96 reports that `POST /api/feedback` accepts arbitrary JSON with no authentication, rate limiting, or size caps, so a single script can spam the D1 `feedback` table and exhaust storage via oversized `session_data`. This change keeps the endpoint anonymous (hooks call it from bash without a session) but closes the abuse vectors: 1. **Per-IP rate limit** via a new `feedback_rate_limits` table — fixed 1-hour buckets, 30 POST/hour per `CF-Connecting-IP`, 5/hour when the header is missing so a stripping proxy can't be used as a bypass. Responses carry `X-RateLimit-*` + `Retry-After`; old rows are swept lazily (~1% of writes) so the table stays bounded without a cron. 2. **Size caps** on the full body (2 MiB), `session_data` (1 MiB), and each short field (rating/pua_level/flavor ≤ 64 B, task_summary ≤ 2 KiB). Checked via `TextEncoder` byte length — JS `.length` overcounts CJK. 3. **Type + range validation** on every field. Non-integer or out-of-range numerics (negative, > 100k) return 400 with the offending field name instead of silently coercing. 4. `Content-Length` above the body cap short-circuits to 413 before the body is read. CSRF tokens are intentionally NOT added: CORS is `*` by design (the skill's bash hooks call the API from anywhere, no cookies are read), so CSRF protection would be cargo-culting. Rate-limit + size caps address the concrete abuse paths in the issue. Migration `0003_feedback_rate_limit.sql` creates the tracking table (composite PK on `(ip, window_start)` so upserts land atomically). ## Test plan New `landing/functions/api/feedback.test.ts` — 20 tests, all green via `npx vitest run` (~16 ms): - `validateFeedback` unit tests: minimal-valid body, missing rating, type mismatches, oversize `session_data` / `task_summary`, CJK byte-length enforcement, negative/non-integer/out-of-range numerics. - `onRequest` integration tests against an in-memory FakeDB: 200 path, 400 on bad body / missing rating / invalid JSON / arrays and non-objects, 413 on oversize Content-Length, rate-limit cap enforcement on both the named-IP path (30/hr) and unknown-IP path (5/hr), `Retry-After` header on 429, GET aggregate stats preserved, OPTIONS preflight 204. - `npm run build` — clean (`tsc -b && vite build` both succeed, bundle size unchanged). - `npm run lint` on the three new/changed files — clean (pre-existing errors on `_sanitize.ts` / `leaderboard.ts` / `auth/github.ts` are not from this PR). Reproduction from the issue (1000× spam loop with 50 KB `session_data`) now gets one 400 (session_data too large) from the first request, then 429 responses once the hour bucket fills. Generated with Claude Code.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): rate-limit + validate POST /api/feedback (#96)#149

fix(security): rate-limit + validate POST /api/feedback (#96)#149
voidborne-d wants to merge 1 commit intotanweai:mainfrom
voidborne-d:fix/feedback-security

voidborne-d commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

voidborne-d commented Apr 24, 2026

Summary

Why no CSRF tokens

Impact on existing callers

Schema

Changes

Test plan

Before / after

Other gates

AI agent disclosure

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant