
feat: Phase 3 — production hardening (closes all P1 audit items)#5

Merged
fbmoulin merged 5 commits into main from feat/phase3-prod-hardening
Apr 27, 2026

Conversation

@fbmoulin
Owner

@fbmoulin fbmoulin commented Apr 27, 2026

Summary

Phase 3 of ROADMAP. Closes all remaining P1 findings (E, F, I) plus 4 P2 items.

3 atomic commits:

Batch 1 (security): aa2fa2b

  • gunicorn 21.2 → 22.0 (CVE-2024-1135 HTTP request smuggling)
  • Drop force=True, return 415 on non-application/json content-type (P2-3)
  • _strip_query() removes ?… and #… from logged URLs before structlog (P1-I)
  • ANSI/control-char sanitization in _truncate (P2-4)
  • New pip-audit CI job (osv vuln service, soft gate)
  • +7 tests
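The content-type hardening in Batch 1 might look like this minimal Flask sketch. The handler name and route match the PR, but the body is an assumption reconstructed from the test list ("text/plain rejected, no-CT accepted, charset accepted"); the real app.py endpoint also applies rate limiting and log sanitization.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/client-errors", methods=["POST"])
def client_errors():
    # P2-3 sketch: reject explicit non-JSON content types with 415.
    # Requests with no Content-Type header are still accepted, matching
    # the PR's test list ("no-CT accepted", "charset accepted").
    ct = request.content_type or ""
    if ct and not ct.startswith("application/json"):
        return jsonify(error="unsupported media type"), 415
    # Parse leniently only when no Content-Type was sent at all
    # (assumption; the actual implementation may differ).
    payload = request.get_json(silent=True, force=not ct) or {}
    return jsonify(ok=True), 200
```

A `text/plain` POST now gets 415 instead of being silently parsed, while `application/json; charset=utf-8` still passes the prefix check.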

Batch 2 (DoS): 07fa6b7

  • Browser logger queue cap (200, drops oldest via shift) — L3
  • Three-pass scroll wall-clock budget KCD_MAX_SCROLL_S=120 (P2-2)
  • Asset disk caps KCD_MAX_TOTAL_MB=200 + KCD_MAX_ASSETS=500 (P1-E)
  • Manifest gains scroll_budget_exceeded, asset_caps_dropped, total_asset_bytes

Batch 3 (BS4 + rate limit): 12ebcbe

  • rewrite_html_assets refactored to BeautifulSoup walker — only URL-bearing attrs + url() in styles, never script/comment text (P1-F). +5 regression tests.
  • Flask-Limiter on /api/client-errors @ 60/min/IP, RATE_LIMIT_STORAGE_URI for redis (P2-5)
  • RATELIMIT_ENABLED=False in test fixtures
  • Docs: AUDIT marks all P1 ✅ RESOLVED; CLAUDE collapses known-issues

Status of audit

| Severity | Total | Resolved | Open |
|---|---|---|---|
| P0 | 0 | | |
| P1 | 9 | 9 (all) | 0 |
| P2 | 12 | 4 (3, 4, 5, 6) | 8 |
| P3 | ~13 | | ~13 |

Test plan

  • pytest -q → 74 passed in 1.10s (was 62, +12 across 3 batches)
  • ruff check + format --check clean
  • gunicorn==22.0.0 resolves; flask-limiter==4.1.1 installs
  • CI green (Lint + Smoke + pytest + pip-audit)
  • Bot reviews
  • Manual: rate-limit smoke (for i in {1..70}; do curl -X POST /api/client-errors ...; done; expect 429 after the 60th request)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added rate limiting to the error reporting API
    • Implemented global asset capture limits (count and disk storage)
    • Added page capture time limit (120 seconds default)
    • Added browser log queue size cap
  • Bug Fixes & Improvements

    • Enhanced privacy: query strings and fragments now stripped from logged URLs
    • Added control character sanitization in logs
    • Stricter content-type validation for error reports (rejects non-JSON)
    • Improved HTML asset rewriting to avoid corrupting scripts and comments
  • Documentation

    • Updated audit documentation; all Phase 3 hardening items completed

…I strip + ANSI sanitize

Closes audit P2-3, P2-4, P2-6, P1-I.

- pyproject.toml: gunicorn 21.2.0 → 22.0.0 (CVE-2024-1135 HTTP request
  smuggling). Add flask-limiter dep for batch 3.
- app.py:client_errors:
  - P2-3: drop force=True; reject non-application/json content-types with
    415 (was: text/plain bypass for CORS preflight).
  - P1-I: strip query string + fragment from logged URL via _strip_query()
    helper. Avoids LGPD/GDPR-relevant token leakage to log aggregators.
  - P2-4: regex-strip control chars (\x00-\x08\x0b-\x1f\x7f) in _truncate.
    Prevents ANSI escape injection (\x1b[2J\x1b[H clears dev terminal).
- tests/test_client_errors.py: +7 cases (text/plain rejected, no-CT
  accepted, charset accepted, URL with query handled, _strip_query unit,
  ANSI sanitized, _truncate strips control chars).
- .github/workflows/ci.yml: new pip-audit job (osv vulnerability service,
  soft gate). pytest job updated to install flask-limiter.

Tests 62 → 69 passing in 0.71s. ruff clean.
Closes audit P1-E, P2-2, L3.

- templates/index.html: hard cap 200 entries on browser logger queue. If
  flush() never runs (extension blocks setInterval) or page generates errors
  faster than ship rate, oldest entries are dropped via shift(). Bounded
  memory growth.
- kratos_clone/capture.py CaptureConfig:
  - KCD_MAX_SCROLL_S (default 120) — wall-clock budget for 3-pass scroll
  - KCD_MAX_TOTAL_MB (default 200) — cumulative asset bytes cap
  - KCD_MAX_ASSETS (default 500) — per-capture file-count cap
- kratos_clone/capture.py _three_pass_scroll: check budget before each
  iteration; break early; emit scroll_budget_exceeded=true in manifest.
- kratos_clone/capture.py _on_response: enforce both caps before fetching
  body and before write_bytes; increment _asset_count_dropped counter.
- Manifest now includes scroll_budget_exceeded, asset_caps_dropped,
  total_asset_bytes — operators see truncation reasons.

Tests still 69 passing (no regressions; the new behavior is gated by config
defaults that the test fixtures don't exceed).
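The cap accounting from Batch 2, modeled as a small synchronous helper. This is a sketch only; the real checks live in the async _on_response handler, and the field names are borrowed from the manifest keys above.

```python
class AssetCaps:
    """Simplified model of the P1-E budget checks (the real implementation
    is spread across CaptureConfig and the async _on_response handler)."""

    def __init__(self, max_assets: int = 500, max_total_mb: int = 200):
        self.max_assets = max_assets
        self.max_total_bytes = max_total_mb * 1024 * 1024
        self.asset_count = 0
        self.total_asset_bytes = 0       # surfaced as total_asset_bytes
        self.asset_caps_dropped = 0      # surfaced as asset_caps_dropped

    def admit(self, body_len: int) -> bool:
        """True if the asset fits under both caps; otherwise count the drop."""
        if (self.asset_count >= self.max_assets
                or self.total_asset_bytes + body_len > self.max_total_bytes):
            self.asset_caps_dropped += 1
            return False
        self.asset_count += 1
        self.total_asset_bytes += body_len
        return True
```

In the real async context the check must be re-validated after any await boundary, since concurrent response tasks can otherwise all pass a stale check.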
Closes audit P1-F, P2-5. All 9 P1 findings now resolved.

- kratos_clone/post.py: rewrite_html_assets refactored to use BeautifulSoup.
  Walks URL-bearing attributes only (src/href/srcset/data-src/...) plus
  style attribute and <style> block url() refs. Script bodies, HTML
  comments, JSON data attributes are NEVER touched. Fast-path early-return
  preserves byte-stable output when captured_assets is empty (matters for
  test fixtures and downstream byte-stable diffing).
- app.py: Flask-Limiter wired up at module level via init_app(). Default
  60/min/IP on /api/client-errors via @limiter.limit decorator. Storage
  URI configurable (RATE_LIMIT_STORAGE_URI=redis://... for multi-worker).
  Tests disable rate limit via app.config["RATELIMIT_ENABLED"] = False
  in conftest.py — production gets the cap.
- tests/test_post.py: +5 cases (BS4 P1-F regressions: script body URL not
  rewritten, HTML comment URL preserved, srcset multi-URL handled, inline
  style url() rewritten, <style> block url() rewritten).
- docs/AUDIT.md: P1-E, P1-F, P1-I all marked ✅ RESOLVED with mechanism.
  All 9 P1 items now closed.
- TODO.md / CLAUDE.md: Phase 3 logged in Done; known-issues section
  collapsed (P0 zero, P1 zero remaining).

Tests 69 → 74 passing in 1.10s. ruff clean.
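The BeautifulSoup walk described for P1-F might be sketched as follows. The attribute list is abbreviated and the function shape is assumed from the commit message; the real rewrite_html_assets also handles srcset items, CSS url() contexts, and meta content.

```python
from bs4 import BeautifulSoup

# Subset of the URL-bearing attributes mentioned in the PR (illustrative).
_URL_ATTRS = ("src", "href", "srcset", "data-src", "poster")

def rewrite_html_assets(html: str, url_map: dict[str, str]) -> str:
    """Sketch of the P1-F fix: rewrite URLs only in URL-bearing attributes.

    Script bodies and HTML comments are never touched because we walk
    element attributes instead of doing raw string replacement.
    """
    if not url_map:
        return html  # fast path: byte-stable output when nothing was captured
    soup = BeautifulSoup(html, "html.parser")
    for el in soup.find_all(True):          # every tag in document order
        for attr in _URL_ATTRS:
            if el.has_attr(attr) and el[attr] in url_map:
                el[attr] = url_map[el[attr]]
    return str(soup)
```

Because only attribute values are rewritten, a URL that also appears inside a `<script>` body or an HTML comment stays byte-identical — exactly the regression the new tests lock in.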
@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Warning

Rate limit exceeded

@fbmoulin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 46 minutes and 50 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c1ae424-1d56-4d05-969f-a110f8e430b4

📥 Commits

Reviewing files that changed from the base of the PR and between 12ebcbe and 8b550a7.

📒 Files selected for processing (5)
  • .github/workflows/ci.yml
  • app.py
  • kratos_clone/capture.py
  • kratos_clone/post.py
  • templates/index.html
📝 Walkthrough

Walkthrough

This PR completes Phase 3 production hardening by implementing rate limiting on /api/client-errors, enforcing strict JSON content-type validation (returning 415 on mismatch), sanitizing client logs by stripping URLs of query strings/fragments and removing ANSI/control characters, adding configurable scroll time and asset disk/count budgets, replacing naive string-based HTML asset rewriting with BeautifulSoup-aware attribute/CSS parsing, capping browser log queues, and introducing a CI vulnerability audit step via pip-audit.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI & Dependencies<br>`.github/workflows/ci.yml`, `pyproject.toml` | Adds pip-audit job for OSV-backed vulnerability scanning; upgrades gunicorn to >=22.0.0 and introduces flask-limiter>=4.0.0 dependency. |
| Audit & Documentation<br>`CLAUDE.md`, `docs/AUDIT.md` | Updates audit status to reflect all 9 P1 findings closed (P1-E, P1-F, P1-I resolved via asset limits, BS4 rewriting, and log sanitization); redirects to open P2/P3 items. |
| Completion Tracking<br>`TODO.md` | Marks Phase 3 production hardening as completed with detailed implementation summary. |
| Core Hardening: Endpoint & Sanitization<br>`app.py` | Adds rate limiting via flask_limiter on /api/client-errors; enforces JSON content-type (415 on mismatch); implements URL query/fragment stripping and ANSI/control-character removal in log sanitization. |
| Core Hardening: Capture & Asset Management<br>`kratos_clone/capture.py` | Introduces configurable scroll wall-clock budget (max_scroll_seconds), global asset count limit (max_assets), and cumulative byte limit (max_total_asset_mb); tracks dropped assets and budget-exceeded conditions in manifest. |
| Core Hardening: HTML Rewriting<br>`kratos_clone/post.py` | Replaces naive substring replacement with BeautifulSoup-based traversal; rewrites URLs only in configured attributes (src, href, data-*, poster, xlink:href, data-srcset), srcset items, CSS url(...) contexts, and meta content; updates orphan CSS detection logic. |
| Frontend: Log Queue<br>`templates/index.html` | Adds hard cap to browser error logger queue; discards oldest entries when QUEUE_HARD_MAX is reached, preventing unbounded growth. |
| Testing: Infrastructure & Coverage<br>`tests/conftest.py`, `tests/test_client_errors.py`, `tests/test_post.py` | Disables rate limiting in test fixtures; adds 98 lines of /api/client-errors tests covering content-type rejection, URL sanitization, and ANSI handling; adds 70 lines of regression tests for BS4-aware HTML rewriting to verify script/comment preservation and CSS url() handling. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Browser as Client Browser
    participant Limiter as Rate Limiter
    participant App as Flask App
    participant Sanitize as Sanitization<br/>(strip_query, strip_ansi)
    participant Log as Structured Log

    Browser->>Limiter: POST /api/client-errors<br/>(client-side errors)
    Limiter-->>Browser: 429 if quota exceeded
    Limiter->>App: Request passed (within limit)

    App->>App: Check Content-Type<br/>== application/json?
    alt Invalid Content-Type
        App-->>Browser: 415 Unsupported Media Type
    else Valid JSON
        App->>App: Parse request JSON
        App->>Sanitize: strip_query(url)
        Sanitize-->>App: URL without params/fragments
        App->>Sanitize: strip_ansi(message)
        Sanitize-->>App: Message without control chars
        App->>Log: Emit sanitized error entry
        Log-->>App: ✓ Logged
        App-->>Browser: 200 OK
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


Poem

🐰 Phase Three hops past the finish line!
Rate limits guard the error shrine,
BeautifulSoup rewrite with flair,
Strips PII from the digital air,
Asset caps and queue bounds align—
Production hardening, now divine!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective: completing Phase 3 production hardening and closing all 9 P1 audit findings, which is the primary focus across all file changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 81.25% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12ebcbe465

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread kratos_clone/capture.py
Comment on lines +533 to 537
if len(self.captured_assets) >= self.cfg.max_assets:
    self._asset_count_dropped += 1
    return
try:
    body = await response.body()


P1: Enforce asset caps atomically across response tasks

The new disk-cap guards are checked before an await response.body() boundary, so concurrent _on_response tasks can all pass the cap checks using stale state and then write anyway. On pages that trigger many simultaneous asset responses, this can exceed both max_assets and max_total_asset_mb, which re-opens the DoS/disk exhaustion risk this change is meant to close. The cap check and counter updates need to be synchronized (or re-checked immediately before write) within a critical section.

Useful? React with 👍 / 👎.

Comment thread kratos_clone/post.py
Comment on lines +88 to +90
for item in value.split(","):
    item = item.strip()
    if not item:


P2: Parse srcset without breaking data-URL candidates

Splitting srcset on raw commas corrupts valid candidates that contain commas inside the URL itself (notably data: URLs, which always include a comma separator). Because this rewrite runs on every srcset attribute even when no replacement is needed, pages using inline placeholder images can end up with malformed srcset values and broken image loading after capture.

Useful? React with 👍 / 👎.
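One heuristic that avoids the data-URL pitfall (a sketch, not the fix applied in this PR): treat only a comma followed by whitespace as a candidate boundary, since the ';base64,' comma inside a data: URL is immediately followed by payload characters.

```python
import re

def split_srcset(value: str) -> list[str]:
    # Candidate boundaries: a comma followed by whitespace. The comma in
    # 'data:image/png;base64,iVBOR...' is followed by payload, so it survives.
    return [c.strip() for c in re.split(r",(?=\s)", value) if c.strip()]
```

This still misparses srcset values written without a space after the comma; a spec-complete parser has to scan candidates URL-first per the HTML standard.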


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements Phase 3 production hardening, focusing on security, privacy, and system robustness. Key changes include upgrading gunicorn to address CVE-2024-1135, enforcing strict JSON content-types, and adding rate limiting to the client error reporting endpoint. Privacy is improved by stripping PII from logged URLs, while robustness is enhanced through global asset disk caps, a wall-clock budget for scrolling operations, and a hard cap on the browser logger queue. Additionally, the HTML asset rewriter was refactored to be BeautifulSoup-aware to prevent corruption of script bodies. Review feedback suggests externalizing rate limit configurations, refining the placement of asset count checks to exclude failed requests, and minor optimizations for attribute lookups during HTML rewriting.

Comment thread app.py


@app.route("/api/client-errors", methods=["POST"])
@limiter.limit(os.getenv("CLIENT_ERRORS_RATE_LIMIT", "60 per minute"))


medium

The rate limit configuration is hardcoded to '60 per minute'. It is better to use a constant or a configuration variable to make it easier to manage and update in the future.

Comment thread kratos_clone/capture.py
Comment on lines +533 to +535
if len(self.captured_assets) >= self.cfg.max_assets:
    self._asset_count_dropped += 1
    return


medium

The asset count cap check should be performed after the response status check to avoid counting failed requests towards the limit.

Comment thread kratos_clone/post.py
Comment on lines +137 to +139
for attr in _URL_ATTRS:
    if el.has_attr(attr):
        el[attr] = _try_replace(el[attr], url_map)


medium

The loop over _URL_ATTRS can be optimized by pre-compiling the list or using a set for faster lookup if the list grows, though it is currently small.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
kratos_clone/capture.py (1)

575-622: ⚠️ Potential issue | 🟡 Minor

Minor: scroll_budget_exceeded not set when an entire pass is skipped at the entry guard.

The flag is only flipped inside the per-iteration for loops. If pass 1 completes fully and the budget is exhausted by the time pass 2 (or pass 3) is evaluated, not over_budget() short-circuits the pass entry but self.scroll_budget_exceeded stays False. The manifest then reports scroll_budget_exceeded: false despite passes being silently dropped — exactly the case the flag is supposed to signal to operators. Same for any pre-budget-blown initial entry.

🔧 Proposed fix — flip the flag at the entry guards too
-        # Pass 1: forward fast (warm-up)
-        if self.cfg.scroll_passes >= 1 and not over_budget():
+        # Pass 1: forward fast (warm-up)
+        if self.cfg.scroll_passes >= 1:
+            if over_budget():
+                self.scroll_budget_exceeded = True
+            else:
                 self.log("📜 Scroll pass 1/3 (forward fast)...")
                 for y in range(0, h + vh, int(vh * self.cfg.scroll_jump_ratio_fast)):
                     if over_budget():
                         self.scroll_budget_exceeded = True
                         break
                     ...

(Apply the same shape to the pass 2 and pass 3 entry guards on lines 587 and 600.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kratos_clone/capture.py` around lines 575 - 622, The loop-entry guards for
the three scroll passes check not over_budget() but don't set
self.scroll_budget_exceeded when a whole pass is skipped; update the pass guards
in the method (where you check self.cfg.scroll_passes >= 1/2/3 and not
over_budget()) to set self.scroll_budget_exceeded = True in the else/skip branch
(and optionally log) so that when over_budget() prevents entering a pass (for
passes 1, 2, and 3) the flag is flipped consistently; reference the existing
symbols self.cfg.scroll_passes, over_budget(), and self.scroll_budget_exceeded
to locate and change the checks.
docs/AUDIT.md (1)

30-32: ⚠️ Potential issue | 🟡 Minor

Stale status banner contradicts the resolution rows below.

This banner still lists P1-E / P1-F / P1-I as the remaining open items, but the table immediately below now marks all three RESOLVED Phase 3 (lines 40, 41, 44). Update the banner so the doc agrees with itself — the elsewhere-cited claim "all 9 P1 closed" (e.g., CLAUDE.md line 122 and TODO.md Phase 3 entry) hinges on this section being internally consistent.

📝 Suggested edit
-> **Status update 2026-04-27:** P1-A, P1-B, P1-C, P1-D, P1-G, P1-H all RESOLVED in
-> Phase 1 (`b54939a`) and Phase 2 (`feat/phase2-structural-fixes`). Remaining open:
-> P1-E (asset disk cap), P1-F (BeautifulSoup-aware rewriting), P1-I (PII strip).
+> **Status update 2026-04-27:** All 9 P1 findings RESOLVED across Phases 1–3
+> (P1-A/B/C/D/G/H in Phases 1–2; P1-E/F/I in Phase 3 — see rows below for commit
+> SHAs and implementation notes).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/AUDIT.md` around lines 30 - 32, The "Status update 2026-04-27" banner
still lists P1-E, P1-F, and P1-I as remaining open while the table below marks
them RESOLVED; update that banner text to match the table (e.g., state that all
P1 items are RESOLVED / "all 9 P1 closed" and remove P1-E/P1-F/P1-I from the
remaining-open list) so the banner and the Phase 3 resolution rows are
consistent; change the line that begins with "**Status update 2026-04-27:**"
accordingly and keep the referenced commit/branch annotations (`b54939a`,
`feat/phase2-structural-fixes`) intact.
TODO.md (1)

19-28: ⚠️ Potential issue | 🟡 Minor

Inconsistent state: Phase 3 items are listed as unchecked here but line 40 marks Phase 3 complete.

Lines 21-28 list every P1/P2 item in this PR as - [ ] under "🟡 Then (Phase 3 — production hardening)", while the Done log entry on line 40 claims Phase 3 is complete and all 9 P1 items are closed. Pick one source of truth — either tick them off in place or delete the section now that it's superseded.

Suggested cleanup
-## 🟡 Then (Phase 3 — production hardening)
-
-- [ ] Global asset disk cap — `KCD_MAX_TOTAL_MB=200`, `KCD_MAX_ASSETS=500` env vars + cumulative tracking in `_on_response`. (P1-E)
-- [ ] Three-pass scroll wall-clock budget — `KCD_MAX_SCROLL_S=120`. Emit `scroll_budget_exceeded` flag in manifest. (P2-2)
-- [ ] Strip query strings from `location.href` in browser logger; ANSI-strip in `app.py:_truncate`. (P1-I + P2-4, ~15m)
-- [ ] Drop `force=True` in `request.get_json()`; return 415 on wrong content-type. (P2-3, ~5m)
-- [ ] Rate-limit `/api/client-errors` via Flask-Limiter. (P2-5, ~30m)
-- [ ] Bump `gunicorn>=22.0.0` (CVE-2024-1135) + `pip-audit` job. (P2-6, ~15m)
-- [ ] Browser logger queue cap (`if (queue.length > 200) queue.shift()`). (~5m)
-- [ ] BeautifulSoup-aware HTML rewriting in `post.py` (replace raw `str.replace`). (P1-F, ~1.5h)
-
----
-
+<!-- Phase 3 items moved to Done ✅ on 2026-04-27 — see entry below. -->

Also worth fixing the section header on line 11 — both "Now" and "Then" claim to be Phase 3.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@TODO.md` around lines 19 - 28, The TODO.md has inconsistent Phase 3 state:
the "🟡 Then (Phase 3 — production hardening)" checklist (header text "🟡 Then
(Phase 3 — production hardening)") lists all items unchecked while the Done log
later (around the Done entry) claims Phase 3 is complete; resolve by choosing a
single source of truth — either mark the checklist items as completed (change "-
[ ]" to "- [x]" for the nine P1/P2 entries under "🟡 Then") or remove the entire
"🟡 Then (Phase 3 — production hardening)" section if the Done log is
authoritative; also fix the duplicated phase header at the top (the "Now" vs
"Then" Phase 3 label) so headers correctly reflect phases (e.g., rename one to
Phase 2 or adjust wording).
🧹 Nitpick comments (4)
kratos_clone/post.py (1)

67-80: Consider non-overlapping replacement to avoid mutation-chain surprises in _try_replace.

The substring fallback iterates and mutates out in place. If two captured URLs share text with each other or with a local path produced by an earlier replacement, later iterations match against the rewritten string rather than the original input, which can corrupt unrelated bytes (e.g., a short captured URL that happens to occur as a substring of an assets/<name>_<hash>.<ext> token already substituted). Hashed asset filenames make this rare in practice, but the algorithm is technically order-dependent on its own outputs.

A safer pattern: collect non-overlapping match ranges against the original value (longest-first) and rebuild the string in one pass.

♻️ Sketch — replace against the original value only
 def _try_replace(value: str, url_map: dict[str, str]) -> str:
     if not value:
         return value
     if value in url_map:
         return url_map[value]
-    out = value
-    for url in sorted(url_map.keys(), key=len, reverse=True):
-        if url in out:
-            out = out.replace(url, url_map[url])
-    return out
+    # Non-overlapping replacement against the original; longest-first wins.
+    out_parts: list[str] = []
+    i = 0
+    keys = sorted(url_map.keys(), key=len, reverse=True)
+    while i < len(value):
+        for url in keys:
+            if value.startswith(url, i):
+                out_parts.append(url_map[url])
+                i += len(url)
+                break
+        else:
+            out_parts.append(value[i])
+            i += 1
+    return "".join(out_parts)

Not blocking — current behaviour is fine for real URL sets, but the rewrite removes a class of latent bug.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kratos_clone/post.py` around lines 67 - 80, The substring-replacement loop in
_try_replace mutates out as it goes, causing later replacements to match against
already-rewritten text; change _try_replace to scan the original value for
non-overlapping matches of keys(url_map.keys()) (sorted longest-first), collect
their start/end ranges and replacement values, then rebuild the result in a
single pass using the original string and those ranges so no replacement can
affect subsequent matches; keep the exact-match early-return (if value in
url_map) and apply the substring-only logic when falling back.
.github/workflows/ci.yml (1)

152-155: Consider surfacing pip-audit findings beyond stdout while the gate is soft.

|| true keeps the build green, which is fine for a soft gate — but unless someone opens the job logs, real CVEs slip past unnoticed. Two cheap wins until Phase 6 flips this strict:

  • Pipe to $GITHUB_STEP_SUMMARY so findings show up on the run page (pip-audit --format=markdown ... >> "$GITHUB_STEP_SUMMARY" || true).
  • Or use --format=json + actions/upload-artifact so the full report is reviewable per run.

Also, the runtime-dep list now appears verbatim in three steps (lines 53, 129, 150). Worth extracting into pyproject.toml extras (pip install .[ci]) or a shared composite action when convenient — purely an optional cleanup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 152 - 155, Update the CI pip-audit
step so findings are surfaced beyond stdout while keeping the soft gate: run
pip-audit with a machine-readable or markdown format (use pip-audit
--format=markdown ... >> "$GITHUB_STEP_SUMMARY" || true or pip-audit
--format=json ... && upload via actions/upload-artifact) instead of only
printing to console; also consider centralizing the repeated runtime-dep install
lists by defining a CI extras entry in pyproject.toml and replacing duplicate
pip install commands with pip install .[ci] or a shared composite action to DRY
the workflow.
app.py (1)

468-483: Optional: collapse _strip_query to one urlsplit call.

The manual character search works and the unit test covers the cases, but urllib.parse.urlsplit is the idiomatic and slightly more robust path (it also normalizes a few edge cases like trailing whitespace). Purely cosmetic — feel free to skip.

Suggested simplification
-def _strip_query(url):
-    """P1-I: drop query string + fragment from logged URLs.
-
-    Avoids capturing tokens, session IDs, PII parameters into structured logs
-    that may be exported to 3rd-party aggregators (LGPD/GDPR concern). Keeps
-    scheme + host + pathname only.
-    """
-    if not url:
-        return None
-    s = str(url)
-    for sep in ("?", "#"):
-        i = s.find(sep)
-        if i >= 0:
-            s = s[:i]
-    return s
+def _strip_query(url):
+    """P1-I: drop query string + fragment from logged URLs.
+
+    Avoids capturing tokens, session IDs, PII parameters into structured logs
+    that may be exported to 3rd-party aggregators (LGPD/GDPR concern). Keeps
+    scheme + host + pathname only.
+    """
+    if not url:
+        return None
+    parts = urlsplit(str(url))
+    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
+from urllib.parse import urlsplit, urlunsplit
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app.py` around lines 468 - 483, Replace the manual character-search logic in
_strip_query with a single urllib.parse.urlsplit call: import urlsplit (from
urllib.parse) and call urlsplit(url) to build the result using only the scheme,
netloc (host), and path components (omitting query and fragment), handling
None/empty inputs the same and returning None when url is falsy; update
_strip_query to return the recomposed string (scheme + "://" + netloc + path) so
it preserves behavior but uses urlsplit for robustness and normalization.
tests/test_client_errors.py (1)

213-237: Recommended: assert the actual sanitized output, not just status code.

test_ansi_escape_sanitized_in_message only checks resp.status_code == 200, which means a future regression that reverts the _truncate sub but still returns 200 would slip through. The test_truncate_strips_control_chars unit test is already strict — but consider also capturing the structlog output (e.g., via caplog or a frontend_logger patch) to assert end-to-end that the rendered log line contains no \x1b.

Not a blocker — the unit test on _truncate provides the real safety net. Filing as optional polish.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_client_errors.py` around lines 213 - 237, Update the
test_ansi_escape_sanitized_in_message to assert the actual sanitized output
rather than only status code: post the same payload to "/api/client-errors" and
then capture the rendered log output (e.g., using pytest's caplog or by patching
the frontend_logger used by the endpoint) and assert that the logged message no
longer contains "\x1b" and contains the expected sanitized form (matching what
_truncate produces); reference the existing helper _truncate to derive the
expected sanitized string and ensure the test asserts that the endpoint's
end-to-end log contains that sanitized value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app.py`:
- Around line 58-65: The Limiter is being constructed with enabled=True and
init_app(app) so later changes to app.config["RATELIMIT_ENABLED"] in tests do
not disable it; update app creation so the Limiter reads the config before init:
read app.config["RATELIMIT_ENABLED"] (or environment variable) and pass that
boolean into Limiter(... enabled=...) or call limiter.init_app(app) only after
setting app.config["RATELIMIT_ENABLED"]; alternatively document and use
app.extensions['limiter'].enabled = False in conftest.py to disable the
already-initialized Limiter (reference symbols: Limiter, init_app,
app.config["RATELIMIT_ENABLED"], app.extensions['limiter'].enabled,
conftest.py).

---

Outside diff comments:
In `@docs/AUDIT.md`:
- Around line 30-32: The "Status update 2026-04-27" banner still lists P1-E,
P1-F, and P1-I as remaining open while the table below marks them RESOLVED;
update that banner text to match the table (e.g., state that all P1 items are
RESOLVED / "all 9 P1 closed" and remove P1-E/P1-F/P1-I from the remaining-open
list) so the banner and the Phase 3 resolution rows are consistent; change the
line that begins with "**Status update 2026-04-27:**" accordingly and keep the
referenced commit/branch annotations (`b54939a`, `feat/phase2-structural-fixes`)
intact.

In `@kratos_clone/capture.py`:
- Around line 575-622: The loop-entry guards for the three scroll passes check
not over_budget() but don't set self.scroll_budget_exceeded when a whole pass is
skipped; update the pass guards in the method (where you check
self.cfg.scroll_passes >= 1/2/3 and not over_budget()) to set
self.scroll_budget_exceeded = True in the else/skip branch (and optionally log)
so that when over_budget() prevents entering a pass (for passes 1, 2, and 3) the
flag is flipped consistently; reference the existing symbols
self.cfg.scroll_passes, over_budget(), and self.scroll_budget_exceeded to locate
and change the checks.
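
The guard change this prompt describes can be modeled with a toy class (`Capturer`, `run`, and the one-unit pass cost are illustrative stand-ins, not the real `capture.py` code; only the guard/else shape mirrors the suggested fix):

```python
class Capturer:
    """Toy model: run up to three scroll passes under a wall-clock budget."""

    def __init__(self, scroll_passes: int, budget: int):
        self.scroll_passes = scroll_passes  # how many passes are configured
        self._elapsed = 0                   # stand-in for wall-clock time
        self._budget = budget
        self.scroll_budget_exceeded = False
        self.passes_run: list[int] = []

    def over_budget(self) -> bool:
        return self._elapsed >= self._budget

    def _scroll_pass(self, n: int) -> None:
        self.passes_run.append(n)
        self._elapsed += 1                  # pretend each pass costs one unit

    def run(self) -> None:
        for n in (1, 2, 3):
            if self.scroll_passes >= n and not self.over_budget():
                self._scroll_pass(n)
            elif self.scroll_passes >= n:
                # Configured pass skipped by the budget: flip the flag so the
                # manifest reports the truncation (the missing piece above).
                self.scroll_budget_exceeded = True

cap = Capturer(scroll_passes=3, budget=2)
cap.run()  # passes 1-2 run; pass 3 is skipped and flips the flag
```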

In `@TODO.md`:
- Around line 19-28: The TODO.md has inconsistent Phase 3 state: the "🟡 Then
(Phase 3 — production hardening)" checklist (header text "🟡 Then (Phase 3 —
production hardening)") lists all items unchecked while the Done log later
(around the Done entry) claims Phase 3 is complete; resolve by choosing a single
source of truth — either mark the checklist items as completed (change "- [ ]"
to "- [x]" for the nine P1/P2 entries under "🟡 Then") or remove the entire "🟡
Then (Phase 3 — production hardening)" section if the Done log is authoritative;
also fix the duplicated phase header at the top (the "Now" vs "Then" Phase 3
label) so headers correctly reflect phases (e.g., rename one to Phase 2 or
adjust wording).

---

Nitpick comments:
In @.github/workflows/ci.yml:
- Around line 152-155: Update the CI pip-audit step so findings are surfaced
beyond stdout while keeping the soft gate: run pip-audit with a machine-readable
or markdown format (use pip-audit --format=markdown ... >>
"$GITHUB_STEP_SUMMARY" || true or pip-audit --format=json ... && upload via
actions/upload-artifact) instead of only printing to console; also consider
centralizing the repeated runtime-dep install lists by defining a CI extras
entry in pyproject.toml and replacing duplicate pip install commands with pip
install .[ci] or a shared composite action to DRY the workflow.

In `@app.py`:
- Around line 468-483: Replace the manual character-search logic in _strip_query
with a single urllib.parse.urlsplit call: import urlsplit (from urllib.parse)
and call urlsplit(url) to build the result using only the scheme, netloc (host),
and path components (omitting query and fragment), handling None/empty inputs
the same and returning None when url is falsy; update _strip_query to return the
recomposed string (scheme + "://" + netloc + path) so it preserves behavior but
uses urlsplit for robustness and normalization.

In `@kratos_clone/post.py`:
- Around line 67-80: The substring-replacement loop in _try_replace mutates out
as it goes, causing later replacements to match against already-rewritten text;
change _try_replace to scan the original value for non-overlapping matches of
keys(url_map.keys()) (sorted longest-first), collect their start/end ranges and
replacement values, then rebuild the result in a single pass using the original
string and those ranges so no replacement can affect subsequent matches; keep
the exact-match early-return (if value in url_map) and apply the substring-only
logic when falling back.
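
The collect-then-rebuild approach this comment asks for can be sketched as follows (`try_replace` and the map shape are assumptions based on the description; the real `_try_replace` in `post.py` may differ):

```python
def try_replace(value: str, url_map: dict[str, str]) -> str:
    """Rewrite URL substrings in one pass over the ORIGINAL string, so an
    earlier replacement can never produce text that a later key matches."""
    if value in url_map:                    # keep the exact-match fast path
        return url_map[value]
    spans: list[tuple[int, int, str]] = []  # (start, end, replacement)
    # Longest keys first so a prefix key cannot shadow a longer match.
    for key in sorted(url_map, key=len, reverse=True):
        start = 0
        while (idx := value.find(key, start)) != -1:
            end = idx + len(key)
            # Claim only ranges that do not overlap an already-claimed span.
            if all(end <= s or idx >= e for s, e, _ in spans):
                spans.append((idx, end, url_map[key]))
            start = end
    # Rebuild the result from the original string and the claimed ranges.
    out, pos = [], 0
    for idx, end, repl in sorted(spans):
        out.append(value[pos:idx])
        out.append(repl)
        pos = end
    out.append(value[pos:])
    return "".join(out)
```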

In `@tests/test_client_errors.py`:
- Around line 213-237: Update the test_ansi_escape_sanitized_in_message to
assert the actual sanitized output rather than only status code: post the same
payload to "/api/client-errors" and then capture the rendered log output (e.g.,
using pytest's caplog or by patching the frontend_logger used by the endpoint)
and assert that the logged message no longer contains "\x1b" and contains the
expected sanitized form (matching what _truncate produces); reference the
existing helper _truncate to derive the expected sanitized string and ensure the
test asserts that the endpoint's end-to-end log contains that sanitized value.
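
The essence of the stronger assertion — capture the rendered log line and check the ANSI escape is gone — in a framework-free sketch (the real test would post via Flask's test client and use pytest's `caplog`; the regex only approximates what `_truncate` strips):

```python
import logging
import re

# Approximation of _truncate's sanitization: ANSI CSI sequences plus raw
# control characters (tab/newline ranges excluded for readability).
_CTRL_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]|[\x00-\x08\x0b-\x1f\x7f]")

def sanitize(message: str, limit: int = 500) -> str:
    return _CTRL_RE.sub("", message)[:limit]

records: list[str] = []  # stands in for pytest's caplog

class _MemoryHandler(logging.Handler):
    def emit(self, record: logging.LogRecord) -> None:
        records.append(record.getMessage())

logger = logging.getLogger("frontend")
logger.addHandler(_MemoryHandler())
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info(sanitize("\x1b[31mboom\x1b[0m"))  # what the endpoint would log
```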

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f2666fe-c29e-4a9f-b90b-2f1a98292707

📥 Commits

Reviewing files that changed from the base of the PR and between ea6cf1a and 12ebcbe.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • .github/workflows/ci.yml
  • CLAUDE.md
  • TODO.md
  • app.py
  • docs/AUDIT.md
  • kratos_clone/capture.py
  • kratos_clone/post.py
  • pyproject.toml
  • templates/index.html
  • tests/conftest.py
  • tests/test_client_errors.py
  • tests/test_post.py

Comment thread app.py Outdated
Comment on lines +58 to +65
limiter = Limiter(
    key_func=get_remote_address,
    storage_uri=os.getenv("RATE_LIMIT_STORAGE_URI", "memory://"),
    # Disabled in tests via app.config["RATELIMIT_ENABLED"] = False to keep
    # parametrized tests fast.
    enabled=True,
)
limiter.init_app(app)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In Flask-Limiter 3.x and 4.x, does `app.config["RATELIMIT_ENABLED"] = False` set after `Limiter(..., enabled=True).init_app(app)` actually disable rate limiting at request time, or is `enabled` only evaluated once at init?

💡 Result:

No. Setting `app.config["RATELIMIT_ENABLED"] = False` after `Limiter(..., enabled=True).init_app(app)` does not disable rate limiting at request time in Flask-Limiter 3.x and 4.x. The `enabled` parameter is evaluated once during `init_app`: the config value `RATELIMIT_ENABLED` (or the constructor argument `enabled=True`) sets the instance attribute `self.enabled` on the Limiter at initialization time, and that attribute controls whether the `app.before_request(self._check_request_limit)` hook is registered. Once registered (with `enabled=True`), the hook runs on every request, performing rate-limit checks regardless of later config changes.

Evidence:

- Official docs confirm `RATELIMIT_ENABLED` maps to the constructor's `enabled`: an overall kill switch, defaulting to True (source 1 from first search).
- Source snippet from `_extension.py`: `if not self.enabled` skips the `init_app` sequence, including hook registration (source 11).
- Test `test_disabled_flag` in the repo sets `{ConfigVars.ENABLED: False}` (i.e., `RATELIMIT_ENABLED=False`) before routes/decorators and confirms no limiting even on a decorated route — 11+ requests succeed (source 15).
- Issue #75: users set `limiter.enabled = False` (or config) after init to disable limiting in tests successfully, implying a runtime check of `self.enabled` in `_check_request_limit` (source 2).
- Decorator wrappers always call `self.limiter._check_request_limit(...)` unconditionally (source 14).

Changing config post-init does not update `self.enabled` or unregister the hook. Decorated routes' wrappers execute checks per request, but if `self.enabled` is True (set at init), limits apply unless explicitly exempted. To disable post-init, access the Limiter instance (e.g., `app.extensions['limiter']`) and set `limiter.enabled = False`, since `_check_request_limit` early-returns on `not self.enabled`. Config changes alone fail because they do not propagate to instance state.


🏁 Script executed:

#!/bin/bash
# Make sure the test suite exercises the rate-limit path under load, not just one request.
rg -nP -C2 'client-errors|client_errors' -g 'tests/**'
echo '---'
# Confirm the conftest fixture sets RATELIMIT_ENABLED before any request fires
fd -t f 'conftest.py' -X cat

Repository: fbmoulin/kratos-clone

Length of output: 10031


Fix rate limit disabling in tests — RATELIMIT_ENABLED config flag does not work after init_app().

Setting app.config["RATELIMIT_ENABLED"] = False in conftest.py after Limiter(..., enabled=True).init_app(app) does not actually disable rate limiting. The enabled property is evaluated once at init time; post-init config changes do not propagate to the instance. Tests are not isolated from the 60-request/min/IP limiter, causing potential flaky 429 responses on rapid request sequences.

Use app.extensions['limiter'].enabled = False in conftest instead, or set RATELIMIT_ENABLED via environment/config before calling init_app(). Alternatively, ensure the limiter is disabled at app creation time if tests require true isolation.
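
The evaluated-once behavior the web query describes can be shown with a minimal stand-in (`Limiter` here is a toy, not the real `flask_limiter` class; it models only the init-time read of the config flag):

```python
class Limiter:
    """Toy model of flask_limiter: `enabled` is copied from config at init_app()."""

    def __init__(self, enabled: bool = True):
        self.enabled = enabled

    def init_app(self, config: dict) -> None:
        # Read once; later writes to config["RATELIMIT_ENABLED"] never propagate.
        self.enabled = config.get("RATELIMIT_ENABLED", self.enabled)

    def check(self) -> bool:
        return self.enabled  # True means the limit is enforced

config: dict = {}
limiter = Limiter(enabled=True)
limiter.init_app(config)

config["RATELIMIT_ENABLED"] = False  # too late: ignored at request time
still_limited = limiter.check()

limiter.enabled = False              # the working fix: flip the live instance
now_limited = limiter.check()
```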

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app.py` around lines 58 - 65, The Limiter is being constructed with
enabled=True and init_app(app) so later changes to
app.config["RATELIMIT_ENABLED"] in tests do not disable it; update app creation
so the Limiter reads the config before init: read
app.config["RATELIMIT_ENABLED"] (or environment variable) and pass that boolean
into Limiter(... enabled=...) or call limiter.init_app(app) only after setting
app.config["RATELIMIT_ENABLED"]; alternatively document and use
app.extensions['limiter'].enabled = False in conftest.py to disable the
already-initialized Limiter (reference symbols: Limiter, init_app,
app.config["RATELIMIT_ENABLED"], app.extensions['limiter'].enabled,
conftest.py).

Multi-agent review of Phase 3 surfaced 4 legitimate findings + 1 CI miss:

ADVERSARIAL (P1-I closed for real, not just server-side):
  templates/index.html: client-side strip query+fragment via _safeUrl()
  helper. The url field shipped to /api/client-errors is now
  location.origin + location.pathname only, NOT location.href. This closes
  the transport-layer leak that server-side _strip_query() couldn't (URL
  with ?token=secret already reached proxy/gunicorn access logs before
  Flask handlers fired). The audit P1-I claim now reflects reality.

ADVERSARIAL (silent disk write failures):
  kratos_clone/capture.py: _on_response now distinguishes 'capped' from
  'write failed'. New self._asset_write_failed counter; OSError on
  write_bytes increments it + appends to errors[]. Manifest gains
  'asset_write_failed: int' field. Operators no longer see N captured
  assets when only N-K exist on disk.

ADVERSARIAL (silent cap drops):
  kratos_clone/capture.py: first-drop warning log at the moment the cap
  is hit (count or bytes), separately for each. Operators see the cap
  during capture instead of post-hoc in manifest.
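
The cap bookkeeping these two commit notes describe — count/byte caps, a drop counter, and a warning emitted the first time a cap is hit — can be sketched with a toy (field and method names are illustrative, not the real `capture.py` attributes):

```python
class AssetCaps:
    """Toy model: admit assets until either cap trips; warn once, count every drop."""

    def __init__(self, max_assets: int, max_total_bytes: int):
        self.max_assets = max_assets
        self.max_total_bytes = max_total_bytes
        self.count = 0
        self.total_bytes = 0
        self.dropped = 0
        self.warnings: list[str] = []

    def admit(self, size: int) -> bool:
        over_count = self.count >= self.max_assets
        over_bytes = self.total_bytes + size > self.max_total_bytes
        if over_count or over_bytes:
            self.dropped += 1
            if self.dropped == 1:
                # First-drop warning at the moment the cap is hit, so the
                # operator sees it during capture, not post-hoc in the manifest.
                self.warnings.append("asset cap reached; further assets dropped")
            return False
        self.count += 1
        self.total_bytes += size
        return True

caps = AssetCaps(max_assets=2, max_total_bytes=100)
results = [caps.admit(40), caps.admit(40), caps.admit(40)]  # third hits the count cap
```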

SECURITY (P1: rate limit defeated by reverse proxy):
  app.py: TRUST_PROXY=1 env var enables Werkzeug ProxyFix middleware.
  Without it, deployments behind nginx/Cloudflare/Fly.io see
  request.remote_addr=127.0.0.1 for every request and rate limit becomes
  a single global bucket. Off by default (spoofable when not behind a
  trusted proxy that strips client X-Forwarded-For). Logged at startup
  when enabled.

SECURITY (P3: _URL_ATTRS coverage):
  kratos_clone/post.py: extended _URL_ATTRS with cite, data, formaction,
  action, manifest, background. Rare on marketing pages but real on
  legacy/SVG-embedded HTML; previously broken offline.

CI MISS:
  .github/workflows/ci.yml: smoke job missing flask-limiter in install
  list (pytest job had it). Fixed.

Tests still 74 passing, ruff clean. Files: app.py + templates/index.html
+ kratos_clone/{capture,post}.py + .github/workflows/ci.yml.

@code-review-doctor code-review-doctor Bot left a comment


Some things to consider.

Comment thread kratos_clone/capture.py
self._asset_count_dropped += 1
if self._asset_count_dropped == 1:
    self.log(
        f"⚠️ Asset bytes cap reached "

Suggested change
f"⚠️ Asset bytes cap reached "
"⚠️ Asset bytes cap reached "

f-string is unnecessary here. This can just be a string.

…-free

flask-limiter's MemoryStorage spawns a janitor thread when init_app() runs
with the default memory:// storage. Module-level init was tripping the CI
smoke test ('threading.active_count() before == after on import').

Move limiter.init_app() inside create_app() — production / dev / wsgi all
call it; tests skip via existing fixture. Storage URI now read from
app.config[RATELIMIT_STORAGE_URI] (set from env var RATE_LIMIT_STORAGE_URI
in create_app), keeping the Limiter() constructor side-effect-free.

  before=1 after=1 ok=True   ← bare 'import app' (no thread spawn)
  74 tests pass.
@fbmoulin fbmoulin merged commit 9f4c453 into main Apr 27, 2026
6 checks passed
@fbmoulin fbmoulin deleted the feat/phase3-prod-hardening branch April 27, 2026 16:02