feat: Phase 4 — Personalization MVP by fbmoulin · Pull Request #7 · fbmoulin/kratos-clone

fbmoulin · 2026-04-27T17:34:18Z

Closes Phase 4 of ROADMAP.md. Implements docs/PERSONALIZATION.md as code.

Summary

New personalize/ package: slots, sanitize, openai_client, patcher, pipeline, cli (~750 LoC).
3 Flask routes: GET /personalize + POST /api/personalize/{structure,run} (rate-limited to 5/min and 2/min/IP, respectively).
Hybrid intake template (templates/personalize.html) with browser observability logger reused.
65 new tests (74 → 139 passing in 2.7s) + 2 live OpenAI smoke tests gated by RUN_OPENAI_LIVE=1.
Closes audit P2-11 (LLM input/output hardening — text + image + HTML).

Architecture (per docs/PERSONALIZATION.md)

Patch-based, not full-regen. LLM emits a JSON patch list with closed-enum slot_id; deterministic Python applies patches via BeautifulSoup. Zero selector hallucination.
Responses API with text.format json_schema strict=true for both structure_brief (Step 2) and personalize (Step 5).
gpt-image-1 medium for hero/feature illustrations; first call sets style, rest run via asyncio.gather with first as input_images. Avatars are CSS gradient + initials (no synthetic faces — EU AI Act + OpenAI policy).
Hard budget cap (default $1.00) on OpenAIBrandClient. Projected cost checked BEFORE every call.
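The style-first image flow described above can be sketched as follows. This is a minimal illustration, not the PR's code: `fake_generate` is a hypothetical stand-in for the real image call, and the `reference` parameter is an assumed name for what the PR calls `input_images`.

```python
import asyncio
from typing import Optional


async def fake_generate(slot_id: str, reference: Optional[bytes] = None) -> bytes:
    """Hypothetical stand-in for an image API call (no network involved)."""
    await asyncio.sleep(0)  # yield control the way a real request would
    ref = reference.decode() if reference else "none"
    return f"{slot_id}|ref={ref}".encode()


async def generate_images_style_first(slot_ids: list[str]) -> dict[str, bytes]:
    if not slot_ids:
        return {}
    first_id, *rest = slot_ids
    # The first call runs alone and establishes the visual style.
    first = await fake_generate(first_id)
    # The remaining calls run concurrently, each given the first image as reference.
    others = await asyncio.gather(*(fake_generate(s, reference=first) for s in rest))
    return {first_id: first, **dict(zip(rest, others))}


images = asyncio.run(generate_images_style_first(["hero", "feature_1", "feature_2"]))
```

The trade-off is one serialized call up front in exchange for a shared style reference on every later call.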

Live validation

```
RUN_OPENAI_LIVE=1 uv run pytest tests/integration -v -s
test_structure_brief_live PASSED ($0.005, 10s)
test_personalize_live_with_vision_no_image_gen PASSED ($0.10, 70s)
spent: $0.1050; remaining: $0.3950
```

Real gpt-5-mini accepts the schemas. Closed-enum slot_id honored — every patch came back with a slot id from the list we passed.
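A closed-enum schema of this kind might be built per request from the extracted slot list. The sketch below is an assumption about the shape: only `patches` and `slot_id` come from this PR, while the `text` field and both function names are hypothetical.

```python
import json


def build_patch_schema(slot_ids: list[str]) -> dict:
    """JSON Schema for a patch list whose slot_id is limited to the offered slots."""
    return {
        "type": "object",
        "properties": {
            "patches": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        # Closed enum: the model can only name slots we passed in.
                        "slot_id": {"type": "string", "enum": sorted(slot_ids)},
                        "text": {"type": "string"},  # hypothetical payload field
                    },
                    "required": ["slot_id", "text"],
                    "additionalProperties": False,
                },
            }
        },
        "required": ["patches"],
        "additionalProperties": False,
    }


def unknown_slot_ids(raw_response: str, slot_ids: list[str]) -> list[str]:
    """Defensive re-check: return any slot_id in the response that was not offered."""
    allowed = set(slot_ids)
    patches = json.loads(raw_response)["patches"]
    return [p["slot_id"] for p in patches if p["slot_id"] not in allowed]
```

Even with strict schemas, re-checking the returned IDs on the Python side keeps the patcher deterministic if the response is ever malformed.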

Security (P2-11 closure)

| Threat | Mitigation |
| --- | --- |
| Prompt injection via brief text | sanitize_brief_text strips C0 controls + bounds length; brief fields go in via json.dumps, never f-string |
| SVG XSS via logo upload | verify_image_bytes allow-lists PNG/JPEG by magic; rejects SVG/GIF/anything else |
| Embedded EXIF in user logos | strip_exif re-encodes via Image.copy() round-trip |
| Future LLM-emitted HTML | strip_dangerous_html removes script/style/iframe/object/embed, drops on*=, neutralizes javascript: URLs |
| Path traversal via html_dir | realpath().startswith(downloads/) confinement in /api/personalize/run |
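The first mitigation can be sketched roughly as below. The function name matches the one in this PR, but the body, the 4000-character default, and the `build_user_payload` helper are assumptions made for illustration.

```python
import json
import re

# C0 control characters except \t (0x09), \n (0x0A) and \r (0x0D).
_C0_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def sanitize_brief_text(text: str, max_len: int = 4000) -> str:
    """Drop C0 controls and bound the length (max_len is an assumed default)."""
    return _C0_CONTROLS.sub("", text)[:max_len]


def build_user_payload(brief: str) -> str:
    """Hypothetical helper: serialize the brief with json.dumps, never an f-string,
    so quotes and braces in user text cannot break out of the payload structure."""
    return json.dumps({"brief": sanitize_brief_text(brief)})
```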

Test plan

uv run pytest -q → 139 passed, 2 skipped (integration gated)
uv run ruff check && ruff format --check → all green
python -m personalize --dry-run → extracts 12 slots from sample inventory
import app thread-count invariant preserved (factory pattern, P2-7)
Live OpenAI E2E smoke run on gpt-5-mini → both tests pass
Manual UX walk-through on /personalize after merge

Out of scope

Streaming UI / SSE for run progress (long-poll OK for MVP)
A/B harness for the +70% claim (P2-9 stays open)
Multi-language brief input (English-only first cut)
Style reference upload beyond logo

…dit P2-11

- sanitize_brief_text: strip C0 controls (preserve \t \n \r), bound length
- verify_image_bytes: allow-list PNG/JPEG by magic; reject SVG/GIF/other
- strip_exif: copy() round-trip drops exif/xmp/icc, preserves pixels
- strip_dangerous_html: remove script/style/iframe/object/embed, drop on*= handlers, neutralize javascript: URLs in href/src/action

Uses html.parser (stdlib) per repo convention, not lxml. 21 new tests; full suite 105 passing in 0.92s.
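Allow-listing by magic bytes could look like the sketch below. The function name comes from this commit; returning the detected format as a string and raising `ValueError` are assumptions, since the thread does not show the real signature.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker byte


def verify_image_bytes(data: bytes) -> str:
    """Return "png" or "jpeg" for allow-listed content; reject everything else.

    The decision uses the leading bytes only, so a file extension or a
    client-supplied Content-Type cannot smuggle in SVG or GIF.
    """
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    raise ValueError("unsupported image type: only PNG and JPEG are accepted")
```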

- Text patches resolve via slot.selector; missing selectors logged + skipped
- word-wrappers structure-aware split for staggered hero h1
- Palette regex covers from/to/via/bg/text/border/shadow/ring × 400/500/600
- Images written as PNG to <out>/assets/gen_<slot_id>.png + src updated
- All LLM-derived values run through strip_dangerous_html (defense-in-depth)
- Uses html.parser (stdlib) per repo convention

7 tests; full suite 112 passing.

- structure_brief: gpt-5-mini Responses API + brand_brief strict schema
- personalize: multimodal (logo + slots) + dynamic closed-enum schema for patches[].slot_id and images[].slot_id (deterministic resolution; no selector hallucinations)
- generate_images_parallel: AsyncOpenAI gpt-image-1 medium; first call sets style, rest run via asyncio.gather with first as input_images
- Hard budget cap (default $1.00) enforced BEFORE every call; raises BudgetExceededError without burning the API. Total spent tracked.
- All user inputs sanitized; logo verified + EXIF-stripped before upload
- pytest-asyncio added (strict mode); 12 mocked tests cover all 3 paths

Full suite 124 passing.
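A pre-call budget guard in this spirit might look like the sketch below. `BudgetExceededError` is named in the commit; the `BudgetGuard` class and its `check`/`record` split are hypothetical, and the real logic lives inside OpenAIBrandClient.

```python
class BudgetExceededError(RuntimeError):
    """Raised when a projected call would push spending past the cap."""


class BudgetGuard:
    def __init__(self, budget_usd: float = 1.00) -> None:
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def check(self, projected_usd: float) -> None:
        """Call BEFORE the API request; raises without spending anything."""
        if self.spent_usd + projected_usd > self.budget_usd:
            raise BudgetExceededError(
                f"projected ${projected_usd:.4f} would exceed the "
                f"${self.budget_usd:.2f} cap (spent ${self.spent_usd:.4f})"
            )

    def record(self, actual_usd: float) -> None:
        """Call AFTER a successful request to track total spend."""
        self.spent_usd += actual_usd
```

Separating the projected check from the recorded spend means a rejected call leaves the running total untouched.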

- run_pipeline() chains slot extraction → structure_brief → personalize → async image gen → apply_personalization
- Fail-fast on bad logo bytes BEFORE any API call (SVG/empty rejected)
- Inventory + index.html missing → FileNotFoundError before LLM
- structured_brief_override skips Step 2 (UI lets user edit fields)
- dry_run=True validates inputs and logs slot summary, no API calls
- Each step's failure logged with step= key before re-raising

7 tests; full suite 131 passing.

Routes:

- GET /personalize: render intake form (template added in Task 10)
- POST /api/personalize/structure: JSON brief → structured fields (5/min)
- POST /api/personalize/run: multipart brief+logo+html_dir → personalized.html (2/min)

Hardening:

- Global MAX_CONTENT_LENGTH bumped 1 MiB → 8 MiB to fit 2 MiB logo upload; per-route caps stay strict (4 KiB structure, 5 MiB run, 2 MiB logo)
- html_dir confined to downloads/ via realpath check (no traversal)
- Lazy import of personalize.* inside handlers keeps import app side-effect-free
- Smoke verified: threading.active_count() unchanged on import

8 route tests; full suite 139 passing.
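The html_dir confinement could be sketched as below. This is not the PR's implementation: `confine_to_root` is a hypothetical helper, and it uses `os.path.commonpath` instead of the `startswith` comparison mentioned above, which sidesteps any question about trailing separators.

```python
import os


def confine_to_root(user_path: str, root: str) -> str:
    """Resolve user_path under root; raise ValueError if it escapes root."""
    root_real = os.path.realpath(root)
    # realpath collapses ".." segments and follows symlinks before the comparison.
    candidate = os.path.realpath(os.path.join(root_real, user_path))
    if os.path.commonpath([root_real, candidate]) != root_real:
        raise ValueError("path escapes the allowed root")
    return candidate
```

An absolute `user_path` makes `os.path.join` discard the root, so it fails the same check as a `..` traversal.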

- templates/personalize.html: 3-step UI (brief textarea + logo + html_dir → edit-extracted-fields → run + result panel), no separate JS file
- Browser observability logger copied from index.html (same _rawFetch capture-before-wrap pattern; PR #1 lesson)
- Smoke verified: GET /personalize returns 200 with all selectors present

- argparse: html_dir positional + --brief --logo --budget --dry-run
- python-dotenv loads OPENAI_API_KEY from .env if present
- --dry-run validates inputs and emits slot summary, zero API calls
- Smoke-tested: dry-run on sample fixture extracts 12 slots in <1s
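The CLI surface listed above maps onto argparse roughly as follows. The flag names come from the commit message; the help strings, the argument types, and the `--budget` default of 1.00 (borrowed from the client's default cap) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="personalize")
    parser.add_argument("html_dir", help="directory containing the captured site")
    parser.add_argument("--brief", help="brand brief (text or path; assumed)")
    parser.add_argument("--logo", help="path to a PNG or JPEG logo")
    parser.add_argument("--budget", type=float, default=1.00, help="budget cap in USD")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="validate inputs and print a slot summary without API calls",
    )
    return parser


# Example invocation; the path is illustrative only.
args = build_parser().parse_args(["downloads/site", "--dry-run"])
```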

- test_structure_brief_live: ~$0.005, 10s; validates brand_brief schema
- test_personalize_live_with_vision_no_image_gen: ~$0.10, 70s; validates multimodal personalize call + site_personalization closed-enum schema
- Validated locally: gpt-5-mini available, ~$0.105 spent total
- Image generation not exercised live (gpt-image-1 $0.07/call); covered by mocked tests; live path is identical SDK pattern

Default pytest run skips both: 139 passed, 2 skipped.

- docs/AUDIT.md: P2-11 marked RESOLVED with module pointer + test count
- ROADMAP.md: Phase 4 SHIPPED block with landed-in module map
- TODO.md: Phase 4 done entry; Later section now Phase 5-6 only
- CLAUDE.md: new 'Personalization module' section with conventions (always inject OpenAI client in tests, budget guard fires before call, closed-enum slot IDs, no synthetic faces, Responses not Assistants)
- docs/superpowers/plans/: 13-task TDD plan that drove the implementation

Branch state: 11 commits, 139 tests + 2 live (gated), all green.

coderabbitai · 2026-04-27T17:34:25Z

Warning

Rate limit exceeded

@fbmoulin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 52 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a33da78-4398-4e0b-888b-57f4fb075000

📥 Commits

Reviewing files that changed from the base of the PR and between 2b48d3a and f20d7d2.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (28)

.github/workflows/ci.yml
CLAUDE.md
ROADMAP.md
TODO.md
app.py
docs/AUDIT.md
docs/superpowers/plans/2026-04-27-personalization-mvp.md
personalize/__init__.py
personalize/__main__.py
personalize/cli.py
personalize/openai_client.py
personalize/patcher.py
personalize/pipeline.py
personalize/sanitize.py
personalize/slots.py
pyproject.toml
templates/personalize.html
tests/fixtures/sample_captured.html
tests/fixtures/sample_inventory.json
tests/integration/__init__.py
tests/integration/test_personalize_live.py
tests/test_personalize_app.py
tests/test_personalize_openai_client.py
tests/test_personalize_patcher.py
tests/test_personalize_pipeline.py
tests/test_personalize_sanitize.py
tests/test_personalize_slots.py
tests/test_personalize_smoke.py

Phase 4 introduced these deps but the CI install lines are explicit (legacy pattern, not pip install -e .) so each new dep needs to be added. pytest job also needs pytest-asyncio for the new async tests. Affects: pytest job (collection-time PIL ImportError), pip-audit job (installed-package set), smoke job (defensive — kratos_clone doesn't import PIL today but the env should match).

gemini-code-assist

Code Review

This pull request implements Phase 4 of the project, adding a personalization module that leverages OpenAI's gpt-5-mini and gpt-image-1 to customize captured websites. The changes include a new "personalize" package for slot extraction, security hardening, and HTML patching, alongside new Flask routes and a frontend intake form. The review highlights a potential issue with global regex replacements in the palette application logic, suggesting direct DOM manipulation for better correctness and performance. It also recommends making the pipeline orchestrator asynchronous to avoid event loop conflicts when integrated into async environments.

gemini-code-assist · 2026-04-27T17:36:07Z

+    html = str(soup)
+    html = re.sub(
+        rf"\b({_PREFIX_GROUP})-orange-500\b",
+        rf"\1-[#{primary}]",
+        html,
+    )
+    html = re.sub(
+        rf"\b({_PREFIX_GROUP})-orange-600\b",
+        rf"\1-[#{primary_pressed}]",
+        html,
+    )
+    html = re.sub(
+        rf"\b({_PREFIX_GROUP})-orange-400\b",
+        rf"\1-[#{primary_hover}]",
+        html,
+    )
+    return BeautifulSoup(html, "html.parser")


The current implementation of _apply_palette performs a global string replacement on the entire HTML document using regex and then re-parses the result. This approach has two significant drawbacks:

Correctness: It may inadvertently modify text content, IDs, or other attributes that happen to match the Tailwind class pattern (e.g., a paragraph explaining Tailwind classes or an element with a matching ID).

Efficiency: Converting the entire BeautifulSoup object to a string and re-parsing it multiple times is computationally expensive, especially for large documents.

Consider iterating over elements that have a class attribute and updating the classes directly in the DOM. For example:

    for tag in soup.find_all(class_=True):
        classes = tag.get("class", [])
        new_classes = []
        for cls in classes:
            # Apply replacement logic to individual class names
            new_classes.append(cls.replace("orange-500", f"[#{primary}]"))
        tag["class"] = new_classes

gemini-code-assist · 2026-04-27T17:36:07Z

+
+    # Step 6 — async image generation.
+    try:
+        images = asyncio.run(client.generate_images_parallel(plan, slots))


Using asyncio.run() inside a synchronous function like run_pipeline can lead to a RuntimeError if the code is executed in an environment where an event loop is already running (e.g., within an async web framework or a task runner).

To improve robustness and future-proof the pipeline, consider making run_pipeline an async function and allowing the caller to manage the event loop execution, or providing an asynchronous version of the orchestrator.
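One way to act on this suggestion is an async entry point plus a sync wrapper that refuses to run inside an active loop. The sketch below uses stand-in bodies: `arun_pipeline` here only simulates the awaited image step, and the signatures are assumptions.

```python
import asyncio


async def arun_pipeline(slot_ids: list[str]) -> list[str]:
    """Async entry point; the body is a stand-in for the real pipeline steps."""
    await asyncio.sleep(0)  # placeholder for awaiting image generation
    return [f"image:{s}" for s in slot_ids]


def run_pipeline(slot_ids: list[str]) -> list[str]:
    """Sync wrapper for CLI and plain Flask handlers."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(arun_pipeline(slot_ids))
    # A loop is already running: fail with a pointer to the async variant
    # instead of the less obvious error raised by asyncio.run itself.
    raise RuntimeError(
        "run_pipeline() called from a running event loop; "
        "await arun_pipeline() instead"
    )
```

Raising before the coroutine object is created also avoids a "coroutine was never awaited" warning on the refusal path.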

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c17e5afa16

chatgpt-codex-connector · 2026-04-27T17:38:44Z

+                        {"type": "input_text", "text": user_payload},
+                        {
+                            "type": "input_image",
+                            "image_url": f"data:image/png;base64,{logo_b64}",


Set logo data URL MIME from validated image type

OpenAIBrandClient.personalize accepts both PNG and JPEG logos (verify_image_bytes), but this payload is always sent as data:image/png to Responses. For valid JPEG uploads, that MIME/content mismatch can make the vision input fail to decode, so /api/personalize/run may return a 502 for inputs the API explicitly claims to support. Build the data URL from the detected format (image/png vs image/jpeg) to keep accepted JPEG uploads functional.
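A fix along these lines might derive the MIME type from the format detected during validation. In this sketch `logo_data_url` and its `fmt` parameter are hypothetical; it assumes the validation step reports "png" or "jpeg".

```python
import base64

# Only the two allow-listed formats are mapped; anything else is a KeyError.
_MIME_BY_FORMAT = {"png": "image/png", "jpeg": "image/jpeg"}


def logo_data_url(image_bytes: bytes, fmt: str) -> str:
    """Build a data URL whose MIME type matches the validated image format."""
    mime = _MIME_BY_FORMAT[fmt]
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"
```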

chatgpt-codex-connector · 2026-04-27T17:38:45Z

+@limiter.limit(os.getenv("PERSONALIZE_RUN_RATE_LIMIT", "2 per minute"))
+def personalize_run():
+    """Steps 4–8 — apply personalization to a captured site."""
+    if (request.content_length or 0) > _PERSONALIZE_RUN_MAX_BYTES:


Enforce run payload cap using actual body length

This size gate only checks request.content_length, which is unset for chunked uploads; in that case it evaluates as 0 and skips the 5 MiB guard. A client can therefore bypass this per-route limit and still force multipart parsing up to the global 8 MiB cap, increasing memory/CPU exposure on /api/personalize/run. Mirror the /api/client-errors pattern by checking the buffered body length (or equivalent) instead of relying solely on Content-Length.
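Counting bytes as they are read, instead of trusting Content-Length, can be sketched with a capped reader. `read_capped` and `PayloadTooLarge` are hypothetical names; in a Flask handler the stream argument would presumably be `request.stream`, which is an assumption about how the route would be wired.

```python
import io
from typing import BinaryIO


class PayloadTooLarge(Exception):
    """Raised as soon as the body exceeds the per-route cap."""


def read_capped(stream: BinaryIO, max_bytes: int, chunk_size: int = 64 * 1024) -> bytes:
    """Read the whole body, aborting once more than max_bytes have arrived.

    Works for chunked uploads because it measures the bytes actually
    received, not the declared Content-Length header.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf += chunk
        if len(buf) > max_bytes:
            raise PayloadTooLarge(f"body exceeds {max_bytes} bytes")
```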

(c1) DOM-walk palette swap (was global regex on serialized HTML)

- _apply_palette now iterates every element with a 'class' attribute and rewrites individual class tokens via match-or-keep. Previously a regex against str(soup) could match 'from-orange-500' inside text content, IDs, aria-labels, etc. Test case test_palette_swap_does_not_touch_text_content asserts <code>from-orange-500</code>, id=, aria-label= are preserved.
- Class normalization handles both bs4 multi-valued (list) and html.parser (str) shapes; explicit type annotations satisfy mypy.

(c2) async pipeline variant (was bare asyncio.run inside sync function)

- arun_pipeline: async public API; awaits client.generate_images_parallel directly. Composes inside any caller-managed event loop (FastAPI etc.).
- run_pipeline: sync wrapper. Detects a running loop via asyncio.get_running_loop and raises a clear RuntimeError pointing at arun_pipeline instead of dying inside asyncio.run with the confusing 'cannot be called from a running event loop' trace.
- 2 new tests: arun works inside event loop; run refuses inside event loop.

Tests: 183 → 186 passing. mypy clean on personalize/. bandit HIGH=0.
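The match-or-keep rewrite in (c1) can be illustrated on a plain list of class tokens. This is a sketch under assumptions: `swap_class_tokens` and the shade-keyed `palette` dict are hypothetical, and support for variant prefixes such as `hover:` is a guess at what the real pattern accepts.

```python
import re

# Whole-token match: optional variant prefixes, a known utility prefix,
# then the orange shade being replaced.
_TOKEN = re.compile(
    r"^((?:[^\s:]+:)*(?:from|to|via|bg|text|border|shadow|ring))-orange-(400|500|600)$"
)


def swap_class_tokens(classes: list[str], palette: dict[str, str]) -> list[str]:
    """Rewrite matching class tokens; keep every other token unchanged.

    palette maps a shade ("400", "500", "600") to a hex colour without '#'.
    """
    out = []
    for token in classes:
        m = _TOKEN.match(token)
        if m:
            out.append(f"{m.group(1)}-[#{palette[m.group(2)]}]")
        else:
            out.append(token)
    return out
```

Because only class tokens are ever passed in, text content, IDs, and aria-labels are out of reach by construction.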

fbmoulin added 11 commits April 27, 2026 14:03

feat(personalize): scaffold package + deps (openai/Pillow/python-dotenv)

c9ac5eb

feat(personalize): slot extractor with structured logging (Step 4)

023c0a9

gemini-code-assist Bot reviewed Apr 27, 2026

fbmoulin merged commit 5e4863a into main Apr 27, 2026
6 checks passed

fbmoulin deleted the feat/personalize-mvp branch April 27, 2026 17:37

chatgpt-codex-connector Bot reviewed Apr 27, 2026

fbmoulin mentioned this pull request Apr 27, 2026

fix(personalize): close 2 Gemini MEDIUM findings from PR #7 #14

Merged

4 tasks

fbmoulin merged 12 commits into main from feat/personalize-mvp
