Fix fenced code block escaping in extract_relevant_lines_str #2312
evandrocoan wants to merge 5 commits into The-PR-Agent:main
When a file's extracted lines contain their own fenced code blocks (e.g. a README with a ```bash section), wrapping them in the standard triple-backtick fence caused the Markdown parser to misinterpret the inner ``` as a closing fence. This broke rendering inside GitHub <details> collapsible blocks, producing a stray ``` before </details>.

Introduce _get_fence(content) in pr_agent/algo/utils.py. The helper scans the content for the longest consecutive run of backticks and returns a fence that is one backtick longer (minimum three). This guarantees the outer fence can never be prematurely closed by any backtick sequence inside the content.

Update extract_relevant_lines_str to call _get_fence instead of hard-coding ```. The guard "if relevant_lines_str" that now wraps the fence assignment also prevents building a fence string around an empty result, which was a latent bug.

Add tests/unittest/test_extract_relevant_lines_str.py covering:
- Empty/missing file list returns an empty string.
- Single-line and multi-line extraction with correct fences.
- Language identifier propagated into the opening fence.
- Optional dedent of common leading whitespace.
- Fallback path that mines lines from the patch when head_file is None.
- Filename with leading/trailing spaces is matched correctly.
- Content containing ``` triggers a 4-backtick outer fence, and the inner block is preserved verbatim.
- End-to-end check via convert_to_markdown_v2 with a real README containing a ```bash block: the <details> body uses ````markdown and the inner ```bash is kept intact.
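The helper described above can be sketched roughly as follows. This is a minimal illustration based only on the description, not the code from the PR: the names `get_fence_sketch` and `wrap_in_fence` are hypothetical, and the real `_get_fence` may differ in details.

```python
import re


def get_fence_sketch(content: str) -> str:
    """Return a backtick fence one longer than the longest backtick run (minimum 3).

    Hypothetical stand-in for the `_get_fence` helper described in the PR.
    """
    # Collect every consecutive run of backticks in the content.
    runs = re.findall(r"`+", content or "")
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(3, longest + 1)


def wrap_in_fence(content: str, language: str = "") -> str:
    """Wrap content in a fence that no backtick run inside it can close."""
    fence = get_fence_sketch(content)
    return f"{fence}{language}\n{content}\n{fence}"


# Build a README-like snippet containing its own triple-backtick block.
inner = "`" * 3
readme = f"Install:\n{inner}bash\npip install pr-agent\n{inner}"
wrapped = wrap_in_fence(readme, "markdown")
```

Here `wrapped` opens and closes with four backticks, so the inner three-backtick lines are too short to terminate the outer block.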
Review Summary by Qodo
Fix fenced code block escaping in extract_relevant_lines_str
Walkthroughs
Description
• Add _get_fence() helper to dynamically determine fence length
• Prevents inner backtick sequences from prematurely closing outer fence
• Fixes markdown rendering in GitHub <details> collapsible blocks
• Comprehensive test suite covering edge cases and end-to-end scenarios
Diagram
flowchart LR
A["Content with inner backticks"] --> B["_get_fence scans for longest run"]
B --> C["Returns fence one backtick longer"]
C --> D["extract_relevant_lines_str wraps with dynamic fence"]
D --> E["Markdown renders correctly in details blocks"]
File Changes
1. pr_agent/algo/utils.py
Code Review by Qodo
Updated the content construction in the _test_make_file method for better readability by using parentheses for multiline strings. Also improved the assertion message in the test case to enhance clarity when the expected output does not match.
Persistent review updated to latest commit ea88c41
The `_get_fence` utility previously only considered backtick sequences when computing a safe outer fence for Markdown code blocks. When content (such as a README) contained triple-backtick fenced blocks, the function would produce a four-backtick fence. For content with even longer backtick runs, the fence could grow arbitrarily long, risking truncation by LLM providers or Markdown renderers with line-length limits.

The updated algorithm evaluates both backtick and tilde fence candidates independently, then selects whichever yields the shorter safe fence. Since most real-world Markdown content uses backtick fences and rarely contains tilde sequences, a triple-backtick block in the content now triggers a three-tilde fence rather than a four-backtick fence. This keeps the outer fence at the minimum safe length in the common case.

The tie-breaking rule preserves the previous default: when both candidates produce the same length, backticks are returned, so plain content without either character still produces the familiar ``` fence.

All affected unit tests are updated to assert tilde fences where the old logic would have produced four-backtick fences, and a new dedicated test suite for `_get_fence` covers empty input, single-character runs, long runs of each character, mixed content, tie-breaking, and the invariant that the returned fence never appears verbatim inside the content.

Reduces the risk of provider-side truncation caused by unexpectedly long fence lines when wrapping Markdown content that itself contains fenced code blocks.
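The two-candidate selection and its tie-breaking rule might look something like the sketch below. It is written from the commit message alone; `safe_length` and `get_fence_v2_sketch` are hypothetical names, and the actual implementation in pr_agent/algo/utils.py is not reproduced here.

```python
import re


def safe_length(content: str, char: str) -> int:
    """Shortest fence length (>= 3) that exceeds every run of `char` in content."""
    runs = re.findall(re.escape(char) + "+", content)
    longest = max((len(run) for run in runs), default=0)
    return max(3, longest + 1)


def get_fence_v2_sketch(content: str) -> str:
    """Pick the shorter of the backtick and tilde candidates; ties go to backticks."""
    content = content or ""
    backtick_len = safe_length(content, "`")
    tilde_len = safe_length(content, "~")
    if backtick_len <= tilde_len:
        return "`" * backtick_len
    return "~" * tilde_len


# Sample inputs, built without writing literal fences in this example.
ticks3, tildes3 = "`" * 3, "~" * 3
samples = [
    "",                                  # no fence characters at all
    f"{ticks3}bash\nls\n{ticks3}",       # inner backtick block
    f"{tildes3}\ntext\n{tildes3}",       # inner tilde block
    f"{ticks3}\n{tildes3}",              # both present, equal runs
]
fences = [get_fence_v2_sketch(sample) for sample in samples]
```

Under these assumptions, plain content and tilde-only content keep the three-backtick fence, a triple-backtick block yields a three-tilde fence, and content with equal runs of both falls back to four backticks through the tie-break. In every case the returned fence does not occur inside the content, which is the invariant the new test suite is said to check.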
Persistent review updated to latest commit 6ca6bf5
… double quotes
Updated string literals in the test file to use double quotes for consistency and improved readability. This change affects various sections of the code, including the definition of Python code blocks and the construction of synthetic files for testing. The adjustments ensure uniformity in string formatting across the test cases.
Persistent review updated to latest commit 8aaea37
…unction
The change updates string literals in the `_get_fence` function from single quotes to double quotes. This is a purely cosmetic, style-consistency change with no behavioral impact. Python projects often enforce a single quoting style across the codebase to maintain readability and reduce cognitive overhead when reading code. Tools like `black`, the widely adopted Python formatter, default to double quotes. Aligning with this convention ensures the codebase remains consistent and passes automated style checks without requiring exceptions or suppressions. Standardizing quote style reduces noise in future diffs, making it easier to spot meaningful changes rather than stylistic ones.
Persistent review updated to latest commit 2a98e81