docs: add translation image stabilization specification for issue #137#140
docs: add translation image stabilization specification for issue #137#140
Conversation
This specification document addresses the problem of expiring Notion/S3 image URLs in translated documentation. It provides: - Technical analysis of the current translation pipeline - Proposed solution integrating existing image processing infrastructure - Detailed 5-phase implementation plan with time estimates - Testing requirements and acceptance criteria - Risk analysis and rollback strategy The solution reuses the existing processAndReplaceImages() and validateAndFixRemainingImages() functions from the notion-fetch pipeline to stabilize image URLs before and after translation. Closes #137 Co-authored-by: openhands <openhands@all-hands.dev>
Updates to TRANSLATION_IMAGE_STABILIZATION_SPEC.md: Per-language image overrides: - Add section explaining default vs override behavior - Document how translators can use localized screenshots - Define precedence rules (language page wins, EN fallback) - Add how-to snippet with before/after example Pipeline improvements: - Add language-aware source selection step in pipeline flow - Update acceptance criteria for localized images Technical clarifications: - Add function contracts section with exact input/output types - Fix line number typo (858-518 → 858-890) - Add invariants section with grep verification commands - Make failure policy explicit (fail page, no placeholders) - Clarify validation catches mutated canonical paths Testing: - Add per-language override test case - Require mocked network/filesystem in tests Cleanup: - Remove time estimates from phases - Reuse existing slug logic (no new implementation) - Expand out-of-scope section (no migration, no new storage) Co-authored-by: openhands <openhands@all-hands.dev>
- Add detailed review of original spec with 17 issues identified - Add simplified spec that's easier to understand (ELI5 style) - Key changes: removed redundant items, fixed fallback behavior, clarified edge cases, added implementation details 🤖 Generated with [Qoder][https://qoder.com]
…ons (#141) * docs: fix spec accuracy issues found during code review Address all issues identified in the spec review by verifying against actual codebase: - Fix function name to processSinglePageTranslation (not processPageTranslation) - Fix slug helper to include page ID suffix for deterministic filenames - Replace validateAndFixRemainingImages (warns only) with getImageDiagnostics + throw - Add totalFailures > 0 check to fail on broken placeholder images - Defer per-language image overrides to future enhancement (EN-only initially) - Downgrade prompt enhancement to P2 (already 3 existing instructions) - Fix import list (remove unused validateAndFixRemainingImages) - Remove incorrect metrics.skippedSmallSize reference in logging - Add Prerequisites section for EN fetch recommendation - Document frontmatter images as out of scope - Update review document with resolution status https://claude.ai/code/session_01RMuRBsneMpCL1cBVLuVcmx * docs: add implementation plan for translation image stabilization (#137) Machine-readable implementation plan with atomic tasks, exact code locations, dependency graph, and complete test specifications. Ready for autonomous execution. https://claude.ai/code/session_01RMuRBsneMpCL1cBVLuVcmx * fix: address review feedback inconsistencies - Fix __tests__/ path to flat path in simplified spec - Fix title -> originalTitle variable name consistency - Add note about invoking processSinglePageTranslation in tests 🤖 Generated with [Qoder][https://qoder.com] * fix(spec): align SPEC with PLAN for logging and TASK-3.6 - Make diagnostic logging conditional (only log when successfulImages > 0) - Fix TASK-3.6 label to match PLAN (Test title page bypass) --------- Co-authored-by: Claude <noreply@anthropic.com>
Code Review SummaryStatus: No Issues Found | Recommendation: Merge This PR adds documentation files only (no code changes):
Total: 4 new files, 2074 lines of documentation The documentation appears well-structured and comprehensive, covering:
Since this PR contains only markdown documentation files with no executable code, there are no security vulnerabilities, runtime errors, or logic bugs to review. Files Reviewed (4 files)
|
| - [ ] After running `bun scripts/notion-translate`, translated markdown contains **zero** URLs matching: | ||
| - `secure.notion-static.com` | ||
| - `notion-static.com` | ||
| - `amazonaws.com` | ||
| - `X-Amz-` params | ||
| - `www.notion.so/image/` | ||
| - [ ] Images in translated pages reference `/images/<filename>` and resolve at build time | ||
| - [ ] Images are not duplicated per language (translations reuse shared `/images` assets) | ||
| - [ ] Works for both Markdown image syntax (``) and inline HTML (`<img src="...">`) | ||
| - [ ] Idempotent: re-running translation does not cause churn in image links | ||
| - [ ] Localized Notion pages may intentionally use different images; these must also be rewritten to stable `/images/...` paths (no Notion/S3 URLs) — **deferred to future enhancement; initial implementation uses EN images only** | ||
| - [ ] If any image fails to download, the page translation fails with a diagnostic error (no broken placeholder images) |
There was a problem hiding this comment.
PR should not close #137
The PR description says "Closes #137" but this PR only adds specification documents — no code changes. When this PR merges, GitHub will auto-close issue #137 even though the actual implementation hasn't been done yet. The Closes #137 keyword should be removed from the PR description and reserved for the implementation PR that actually delivers the code changes described in this spec.
Context Used: Context from dashboard - CLAUDE.md (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: TRANSLATION_IMAGE_STABILIZATION_SPEC.md
Line: 441:452
Comment:
**PR should not close #137**
The PR description says "Closes #137" but this PR only adds specification documents — no code changes. When this PR merges, GitHub will auto-close issue #137 even though the actual implementation hasn't been done yet. The `Closes #137` keyword should be removed from the PR description and reserved for the implementation PR that actually delivers the code changes described in this spec.
**Context Used:** Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=56ef855b-65f1-4964-8b62-134035692d00))
How can I resolve this? If you propose a fix, please make it concise.| # Review: TRANSLATION_IMAGE_STABILIZATION_SPEC.md | ||
|
|
||
| **Reviewed:** 2026-02-13 | ||
| **Spec File:** `TRANSLATION_IMAGE_STABILIZATION_SPEC.md` | ||
| **Issue:** #137 - Notion translation pipeline: replace expiring Notion/S3 image URLs with canonical `/images/...` paths | ||
|
|
||
| --- | ||
|
|
||
| ## Overall Assessment | ||
|
|
There was a problem hiding this comment.
Review document adds root-level clutter
This self-review document is useful as an internal working artifact, but committing it to the repository root alongside the spec adds clutter. The review findings have already been incorporated into the updated spec (as noted in the "Resolution Status" section at the bottom). Consider either:
- Moving this to a
context/subdirectory (e.g.,context/development/) consistent with the project structure - Removing it from the PR and linking to it as a PR comment instead
- Keeping only the spec and plan files, since the review's findings are already reflected in them
Context Used: Context from dashboard - CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: TRANSLATION_IMAGE_STABILIZATION_SPEC_REVIEW.md
Line: 1:10
Comment:
**Review document adds root-level clutter**
This self-review document is useful as an internal working artifact, but committing it to the repository root alongside the spec adds clutter. The review findings have already been incorporated into the updated spec (as noted in the "Resolution Status" section at the bottom). Consider either:
- Moving this to a `context/` subdirectory (e.g., `context/development/`) consistent with the project structure
- Removing it from the PR and linking to it as a PR comment instead
- Keeping only the spec and plan files, since the review's findings are already reflected in them
**Context Used:** Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=56ef855b-65f1-4964-8b62-134035692d00))
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.| # Translation Image Stabilization - Simplified Specification | ||
|
|
||
| **Issue:** #137 - Replace expiring image URLs with stable paths in translated docs | ||
|
|
||
| **Goal:** When we translate English docs to other languages, make sure the images don't break over time. | ||
|
|
||
| --- | ||
|
|
||
| ## The Problem (Simple Explanation) | ||
|
|
There was a problem hiding this comment.
Consider consolidating with the main spec
This simplified version largely duplicates information from TRANSLATION_IMAGE_STABILIZATION_SPEC.md. Adding 4 new root-level markdown files for one feature spec is notable — the repo already has IMAGE_URL_EXPIRATION_SPEC.md and NOTION_FETCH_ARCHITECTURE.md as precedent, but neither has companion review/simplified/plan documents alongside them. Consider consolidating the key "simplified" context as a section within the main spec, or moving this to context/development/ to keep the root directory clean.
Context Used: Context from dashboard - CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: TRANSLATION_IMAGE_STABILIZATION_SPEC_SIMPLIFIED.md
Line: 1:10
Comment:
**Consider consolidating with the main spec**
This simplified version largely duplicates information from `TRANSLATION_IMAGE_STABILIZATION_SPEC.md`. Adding 4 new root-level markdown files for one feature spec is notable — the repo already has `IMAGE_URL_EXPIRATION_SPEC.md` and `NOTION_FETCH_ARCHITECTURE.md` as precedent, but neither has companion review/simplified/plan documents alongside them. Consider consolidating the key "simplified" context as a section within the main spec, or moving this to `context/development/` to keep the root directory clean.
**Context Used:** Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=56ef855b-65f1-4964-8b62-134035692d00))
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.| - Verify both markdown and HTML image syntax work | ||
|
|
||
| 4. **Idempotency** | ||
| - Test re-running doesn't cause churn | ||
| - Already-processed content should remain unchanged | ||
|
|
||
| 5. **Per-Language Image Overrides** | ||
| - Test: EN markdown references image A, localized markdown references image B (Notion URL) | ||
| - Result: uses image B rewritten to `/images/...` | ||
| - Confirm EN fallback remains unchanged when no override exists |
There was a problem hiding this comment.
Per-language override tests contradict scope
The spec explicitly defers per-language image overrides to a "future enhancement" (lines 113-116, 451, 689), yet the testing requirements include tests for per-language image override behavior (tests #5 here). These tests would have no implementation to test against in the initial scope. Either remove these test cases from the initial testing requirements, or add a note that they should be written as skipped/todo tests to document the expected future behavior.
Prompt To Fix With AI
This is a comment left during a code review.
Path: TRANSLATION_IMAGE_STABILIZATION_SPEC.md
Line: 388:397
Comment:
**Per-language override tests contradict scope**
The spec explicitly defers per-language image overrides to a "future enhancement" (lines 113-116, 451, 689), yet the testing requirements include tests for per-language image override behavior (tests #5 here). These tests would have no implementation to test against in the initial scope. Either remove these test cases from the initial testing requirements, or add a note that they should be written as skipped/todo tests to document the expected future behavior.
How can I resolve this? If you propose a fix, please make it concise.
Summary
This PR adds a comprehensive technical specification document for addressing issue #137 - replacing expiring Notion/S3 image URLs in translated documentation with stable canonical
/images/...paths.Problem
The Notion translation workflow (
scripts/notion-translate/index.ts) produces translated Markdown with expiring Notion/S3 image URLs that break over time (~1 hour expiration). Meanwhile, the English fetch pipeline already handles this correctly by downloading images tostatic/images/and rewriting URLs.Solution Overview
The specification proposes integrating the existing image processing pipeline into the translation flow:
processAndReplaceImages()to stabilize image URLs before sending to OpenAIvalidateAndFixRemainingImages()as a safety net to catch any remaining S3 URLsThis approach reuses existing, tested code rather than creating new image handling logic.
What's Included
📄
TRANSLATION_IMAGE_STABILIZATION_SPEC.mdcontaining:/images/...paths (reuse EN images) #137)Implementation Plan Summary
/images/pathsAcceptance Criteria (from issue #137)
After implementation, translated markdown will contain:
secure.notion-static.com,notion-static.com,amazonaws.com,X-Amz-*, orwww.notion.so/image//images/<filename>references that resolve at build timeand<img src="...">syntaxTesting
This PR only adds documentation (specification). Testing will be part of the implementation PR.
Closes #137
@luandro can click here to continue refining the PR
Greptile Summary
This PR adds four specification/planning documents for issue #137 — stabilizing expiring Notion/S3 image URLs in translated documentation. No code changes are included; this is purely documentation.
TRANSLATION_IMAGE_STABILIZATION_SPEC.md: Core technical specification proposing integration of the existingprocessAndReplaceImages()pipeline into the translation flow (pre-translation image download + post-translation S3 URL validation). Code references verified as accurate against the current codebase.TRANSLATION_IMAGE_STABILIZATION_PLAN.md: Detailed implementation plan with task-level breakdown, dependency graph, concrete code diffs, and test scaffolding.TRANSLATION_IMAGE_STABILIZATION_SPEC_REVIEW.md: Self-review of the spec identifying issues (all resolved in the updated spec). Adds repo root clutter — consider removing or relocating.TRANSLATION_IMAGE_STABILIZATION_SPEC_SIMPLIFIED.md: Non-technical summary that largely duplicates the main spec. Consider consolidating./images/...paths (reuse EN images) #137" but this is a docs-only PR — merging will auto-close the issue before the actual implementation is delivered. TheCloseskeyword should be removed.Confidence Score: 4/5
/images/...paths (reuse EN images) #137" keyword which would prematurely close the issue, and (2) four new root-level files where two would suffice, adding organizational clutter./images/...paths (reuse EN images) #137" to avoid prematurely closing the issue before implementation.Important Files Changed
Sequence Diagram
sequenceDiagram participant N as Notion API participant T as Translation Pipeline participant IR as imageReplacer participant FS as static/images/ participant AI as OpenAI API participant D as Disk (i18n/) Note over T: Current Flow (broken) N->>T: Markdown with S3 URLs T->>AI: Translate (S3 URLs intact) AI->>T: Translated markdown (S3 URLs) T->>D: Save (URLs expire in ~1hr) Note over T: Proposed Flow (stable) N->>T: Markdown with S3 URLs T->>IR: processAndReplaceImages() IR->>FS: Download images (cached) IR->>T: Markdown with /images/... paths alt Image download failed T->>T: Throw error (fail page) end T->>AI: Translate (stable paths) AI->>T: Translated markdown T->>IR: getImageDiagnostics() alt S3 URLs found T->>T: Throw error (fail page) end T->>D: Save with stable /images/... pathsLast reviewed commit: 9f73b74
Context used:
dashboard- CLAUDE.md (source)