Surface raw HTML content through translation pair pipeline for allowH…#740

Open
dadukhankevin wants to merge 5 commits into main from
fix-html-preserving-predictions

@dadukhankevin
Contributor

…tmlPredictions

The allowHtmlPredictions toggle was incomplete: HTML was stripped at SQLite index time, but the raw content (already stored in the s_raw_content/t_raw_content columns) was never surfaced to the prompt builder. rawContent now flows through MinimalCellResult, so buildFewShotExamplesText can use HTML-preserving examples when the toggle is on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
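
A rough sketch of the data flow described above, for readers who have not opened the diff. Only the names MinimalCellResult, rawContent, and allowHtmlPredictions come from the PR text; the field layout, the TranslationPair shape, and the helper sketchFewShotExamplesText are assumptions made for illustration and are not the actual signature of buildFewShotExamplesText.

```typescript
// Hypothetical shapes: the real interfaces in the repository may differ.
interface MinimalCellResult {
    content?: string;    // sanitized text (HTML stripped at index time)
    rawContent?: string; // original markup, e.g. from s_raw_content / t_raw_content
}

interface TranslationPair {
    sourceCell: MinimalCellResult;
    targetCell: MinimalCellResult;
}

// Prefer the raw HTML only when the toggle is on and raw content exists;
// otherwise fall back to the sanitized text.
function pickContent(cell: MinimalCellResult, allowHtml: boolean): string {
    if (allowHtml && cell.rawContent) {
        return cell.rawContent;
    }
    return cell.content ?? "";
}

// Illustrative stand-in for the prompt builder's few-shot formatting step.
function sketchFewShotExamplesText(
    pairs: TranslationPair[],
    allowHtmlPredictions: boolean
): string {
    return pairs
        .map(
            (pair) =>
                `Source: ${pickContent(pair.sourceCell, allowHtmlPredictions)}\n` +
                `Target: ${pickContent(pair.targetCell, allowHtmlPredictions)}`
        )
        .join("\n\n");
}
```

The fallback to sanitized text is a design guess: it keeps older index rows without raw content usable when the toggle is on.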
dadukhankevin requested a review from Fikitti on March 12, 2026 04:57
dadukhankevin and others added 4 commits March 12, 2026 17:38
- Restructure system message: consolidate 12 appended instructions into
  focused paragraphs, reduce noise and redundancy
- Simplify HTML preservation: always instruct model to preserve HTML from
  source, toggle only controls whether HTML is present in examples/context
- Fix temperature passthrough: always send configured temperature instead
  of silently dropping it for the default model
- Remove redundant "Instructions" header from user message
- Add diagnostic logging for system message and few-shot example HTML

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
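
The temperature fix in the list above can be illustrated with a minimal before/after sketch. Everything here is assumed for illustration: the LlmSettings and CompletionRequest shapes, the DEFAULT_MODEL sentinel, and both function names are hypothetical and do not come from the repository.

```typescript
// Hypothetical request and settings shapes, for illustration only.
interface ChatMessage {
    role: "system" | "user" | "assistant";
    content: string;
}

interface CompletionRequest {
    model: string;
    messages: ChatMessage[];
    temperature?: number;
}

interface LlmSettings {
    model: string;
    temperature: number;
}

const DEFAULT_MODEL = "default"; // assumed sentinel for "no model override"

// Behaviour described as the bug: the configured temperature is
// silently dropped when the default model is in use.
function buildRequestBefore(settings: LlmSettings, messages: ChatMessage[]): CompletionRequest {
    const request: CompletionRequest = { model: settings.model, messages };
    if (settings.model !== DEFAULT_MODEL) {
        request.temperature = settings.temperature;
    }
    return request;
}

// Behaviour described as the fix: the configured temperature is always sent.
function buildRequestAfter(settings: LlmSettings, messages: ChatMessage[]): CompletionRequest {
    return { model: settings.model, messages, temperature: settings.temperature };
}
```

With the default model and a configured temperature of 0.3, the first function leaves temperature undefined while the second sends 0.3.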
Keep raw source HTML in the current-task prompt when HTML predictions are enabled while still using sanitized text for example search. Add regression coverage to ensure the prompt preserves source markup and spacing boundaries.

Made-with: Cursor
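
One way to picture the split between search text and prompt text described in this commit is the sketch below. The function names and the regex-based tag stripping are assumptions; the project's actual sanitizer is not shown in this conversation and is likely more careful than this.

```typescript
// Naive tag stripper for illustration only. Tags are replaced with a space
// so that adjacent blocks such as "<p>One</p><p>Two</p>" do not merge into
// "OneTwo"; the side effect is that inline tags inside a word also split it.
function stripHtmlForSearch(html: string): string {
    return html
        .replace(/<[^>]*>/g, " ")
        .replace(/\s+/g, " ")
        .trim();
}

interface CurrentTaskText {
    searchQuery: string;  // sanitized text used to look up similar examples
    promptSource: string; // text placed in the current-task part of the prompt
}

// Hypothetical helper: always search with sanitized text, but keep the raw
// markup in the prompt when HTML predictions are enabled.
function prepareCurrentTask(rawSource: string, allowHtmlPredictions: boolean): CurrentTaskText {
    const searchQuery = stripHtmlForSearch(rawSource);
    const promptSource = allowHtmlPredictions ? rawSource : searchQuery;
    return { searchQuery, promptSource };
}
```

Searching with sanitized text keeps example retrieval independent of the toggle, so enabling HTML predictions changes what the model sees without changing which examples are found.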
…ions

- Adjusted assertions to verify that the system message includes the target language "fr" or "French".
- Updated format instructions check to ensure it mentions HTML/formatting handling instead of just plain text when HTML is disabled.
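
The adjusted assertions could look roughly like the following predicates. The function names and the regular expressions are hypothetical; the actual tests and the wording of the system message are not shown here.

```typescript
// Hypothetical check: accept either the language code "fr" as a whole word
// or the language name "French" in the system message.
function mentionsTargetLanguage(systemMessage: string): boolean {
    return /\bfr\b|French/i.test(systemMessage);
}

// Hypothetical check: the format instructions should talk about HTML or
// formatting handling rather than only saying "plain text".
function mentionsFormattingHandling(systemMessage: string): boolean {
    return /html|format/i.test(systemMessage);
}
```

Matching on either form keeps the test stable whether the prompt builder emits the language code or its display name.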

2 participants