fix(patch): guard against None match in hunk header extraction by gvago · Pull Request #2330 · The-PR-Agent/pr-agent

gvago · 2026-04-16T10:59:51Z

Summary

Guard extract_hunk_headers(match) calls against None match results in two functions within pr_agent/algo/git_patch_processing.py
In decouple_and_convert_to_hunks_with_lines_numbers (line ~379): moved extract_hunk_headers inside the existing if match: block and added continue for non-matching @@ lines
In extract_hunk_lines_from_patch (line ~434): added an explicit if not match: guard that sets skip_hunk = True and continues, preventing the crash
NEW: Clear line buffers unconditionally on every @@ line (valid or malformed) to prevent orphan lines between a malformed @@ and the next valid @@ from leaking into the next hunk
Added test test_orphan_lines_after_malformed_not_joined_to_next_hunk to verify orphan lines are discarded

Bug

If a line starts with @@ but doesn't fully match the RE_HUNK_HEADER regex pattern, re.match() returns None. The code then calls extract_hunk_headers(None), which invokes None.groups() and raises AttributeError: 'NoneType' object has no attribute 'groups'.
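
A minimal sketch of the crash and the guard, for illustration only. The regex below is an assumption modeled on unified-diff hunk headers, `extract_hunk_headers` is a simplified stand-in for the real function, and `parse_header` is a hypothetical helper that is not part of the PR:

```python
import re

# Assumed pattern for unified-diff hunk headers; the project's RE_HUNK_HEADER may differ.
RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def extract_hunk_headers(match):
    """Simplified stand-in: unpack the regex groups, defaulting missing sizes to 0."""
    start1, size1, start2, size2, section_header = match.groups()
    return section_header, int(size1 or 0), int(size2 or 0), int(start1), int(start2)


def parse_header(line):
    """Hypothetical helper: return header fields, or None for a malformed '@@' line."""
    match = RE_HUNK_HEADER.match(line)
    if not match:  # guard: never hand None to extract_hunk_headers
        return None
    return extract_hunk_headers(match)


valid = parse_header("@@ -1,3 +1,4 @@ def foo():")
malformed = parse_header("@@ this is not a hunk header")

# Without the guard, the same malformed line triggers the reported crash.
try:
    extract_hunk_headers(RE_HUNK_HEADER.match("@@ this is not a hunk header"))
    crashed = False
except AttributeError:
    crashed = True
```

Here `malformed` is `None` rather than an exception, which lets the caller decide whether to skip the line or mark the hunk as skipped.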

Additionally, orphan lines between a malformed @@ (where match=None) and the next valid @@ were leaking into the next hunk because the buffer reset was inside the if prev_match: flush block — when prev_match was None, buffers were never cleared.
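
The buffer leak can be reproduced with a toy model of the loop. This is a sketch under assumptions, not the project's code: `collect_hunks` and its `reset_unconditionally` flag are hypothetical, only `+` lines are buffered, and the regex is an assumed approximation of `RE_HUNK_HEADER`:

```python
import re

# Assumed pattern for unified-diff hunk headers; the project's RE_HUNK_HEADER may differ.
RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def collect_hunks(patch, reset_unconditionally=True):
    """Toy model: group '+' lines under the hunk header that precedes them."""
    hunks = []  # list of (header_line, added_lines)
    buffer = []
    match = None
    header_line = None
    for line in patch.splitlines():
        if line.startswith("@@"):
            prev_match, prev_header = match, header_line
            match = RE_HUNK_HEADER.match(line)
            header_line = line
            if prev_match:
                if buffer:
                    hunks.append((prev_header, buffer))
                buffer = []  # old behavior: reset only inside the flush block
            if reset_unconditionally:
                buffer = []  # new behavior: reset on every '@@' line
            continue
        if line.startswith("+"):
            buffer.append(line)
    if match and buffer:
        hunks.append((header_line, buffer))
    return hunks


PATCH = "\n".join([
    "@@ -1,1 +1,2 @@",
    "+a",
    "@@ malformed",
    "+orphan",
    "@@ -10,1 +10,2 @@",
    "+b",
])

fixed = collect_hunks(PATCH, reset_unconditionally=True)
leaky = collect_hunks(PATCH, reset_unconditionally=False)
```

With the conditional reset only, `+orphan` survives in the buffer and is emitted under the `-10,1 +10,2` header; with the unconditional reset it is discarded.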

Test plan

All 8 tests in test_malformed_hunk_header.py pass
New test verifies orphan lines after malformed @@ are not joined to the next hunk
Verify normal patch processing still works correctly with valid hunk headers

Replaces #2322 (lost push access to org branch).

In `decouple_and_convert_to_hunks_with_lines_numbers` and `extract_hunk_lines_from_patch`, the call to `extract_hunk_headers(match)` was outside proper `if match:` guards. If a line starts with `@@` but doesn't match `RE_HUNK_HEADER`, `match` is None, causing an `AttributeError` crash on `match.groups()`. Move the `extract_hunk_headers` call inside the match guard in both functions, and skip malformed hunk header lines gracefully.

… @@ lines

The decouple_and_convert_to_hunks_with_lines_numbers() function overwrote `match` with the new RE_HUNK_HEADER result before checking whether the previous hunk needed to be finalized. When a malformed @@ line produced match=None, the flush condition `if match and (new/old_content_lines)` was False, silently dropping the previous hunk's content.

Fix: save `prev_match` before overwriting and use it for the flush decision. Also use `prev_header_line` instead of `match`/`header_line` in the post-loop finalization, so a trailing malformed @@ cannot suppress the last valid hunk.

Adds 7 unit tests covering malformed @@ scenarios: crash safety, content preservation, trailing malformed headers, line-number accuracy, all-malformed patches, and deletion-only hunks.
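
The flush decision described in this commit can be sketched as follows. The function `emitted_headers` and its `use_prev_match` switch are hypothetical, the regex is an assumed approximation of `RE_HUNK_HEADER`, and the real function also numbers and formats the lines, which is omitted here:

```python
import re

# Assumed pattern for unified-diff hunk headers; the project's RE_HUNK_HEADER may differ.
RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def emitted_headers(patch, use_prev_match):
    """Toy model: return the headers of the hunks that actually get flushed."""
    emitted = []
    match = None
    header_line = None  # last valid header seen
    content = []
    for line in patch.splitlines():
        if line.startswith("@@"):
            prev_match, prev_header_line = match, header_line
            match = RE_HUNK_HEADER.match(line)
            # The buggy variant consults the freshly overwritten match.
            decider = prev_match if use_prev_match else match
            if decider and content:
                emitted.append(prev_header_line)
            content = []
            if match:
                header_line = line
        else:
            content.append(line)
    if header_line and content:
        emitted.append(header_line)
    return emitted


PATCH = "\n".join([
    "@@ -1,1 +1,2 @@",
    "+a",
    "@@ malformed",
    "@@ -10,1 +10,2 @@",
    "+b",
])

with_prev_match = emitted_headers(PATCH, use_prev_match=True)
with_new_match = emitted_headers(PATCH, use_prev_match=False)
```

When the decision uses the overwritten `match`, the first hunk is dropped as soon as the malformed line is reached; deciding on `prev_match` keeps both hunks.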

Orphan lines between a malformed @@ (match=None) and the next valid @@ were leaking into the next hunk. The buffer reset was inside the `if prev_match:` flush block, so when prev_match was None (set by the malformed @@), the buffers were never cleared. Move the buffer reset outside the conditional so it runs on every @@ encounter, and add a test that verifies orphan lines are discarded.

qodo-free-for-open-source-projects · 2026-04-16T11:00:07Z

Review Summary by Qodo

Fix crash and data loss from malformed hunk header parsing

🐞 Bug fix

Walkthroughs

Description

• Guard against None match results from malformed @@ hunk headers
• Prevent orphan lines between malformed and valid hunks leaking
• Preserve content from valid hunks before/after malformed headers
• Add comprehensive test coverage for malformed hunk scenarios

Diagram

flowchart LR
  A["Malformed @@ line<br/>match=None"] -->|Before| B["Crash on<br/>match.groups()"]
  A -->|After| C["Guard check<br/>skip gracefully"]
  D["Orphan lines<br/>in buffer"] -->|Before| E["Leak into<br/>next hunk"]
  D -->|After| F["Clear buffers<br/>unconditionally"]
  G["Trailing malformed @@<br/>overwrites match"] -->|Before| H["Last valid hunk<br/>not finalized"]
  G -->|After| I["Use prev_header_line<br/>for finalization"]

File Changes

1. pr_agent/algo/git_patch_processing.py 🐞 Bug fix +18/-8

Guard against None match in hunk header extraction

• Save prev_match before overwriting to properly flush previous hunk before processing new @@ line
• Move buffer reset outside conditional to clear on every @@ line, preventing orphan lines from leaking
• Guard extract_hunk_headers() call with if match: check and skip malformed headers with continue
• Use prev_header_line in final hunk finalization to handle trailing malformed @@ lines correctly
• Add explicit if not match: guard in extract_hunk_lines_from_patch() to prevent crash

pr_agent/algo/git_patch_processing.py

2. tests/unittest/test_malformed_hunk_header.py 🧪 Tests +151/-0

Comprehensive tests for malformed hunk header handling

• Add 8 unit tests covering malformed @@ hunk header scenarios
• Test crash safety, content preservation, and line number accuracy
• Verify orphan lines between malformed and valid hunks are discarded
• Cover edge cases: trailing malformed headers, deletion-only hunks, all-malformed patches

tests/unittest/test_malformed_hunk_header.py

qodo-free-for-open-source-projects · 2026-04-16T11:00:08Z

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

🐞 ≡ Correctness (1)

1. EOF orphan lines leaked 🐞 ≡

Description

decouple_and_convert_to_hunks_with_lines_numbers can append lines that occur after a trailing
malformed "@@" header to the previous valid hunk at EOF, duplicating that hunk header and
mis-numbering the appended lines. This happens because malformed headers are skipped but subsequent
"+"/"-" lines are still buffered, and the EOF flush is triggered by prev_header_line rather than by
an “active valid hunk” flag.

Code

pr_agent/algo/git_patch_processing.py[R399-406]

+    # finishing last hunk — use prev_header_line (not match/header_line) because
+    # match may have been set to None by a trailing malformed @@ line, and
+    # header_line may point to that malformed line instead of the last valid hunk
+    if prev_header_line and new_content_lines:
+        patch_with_lines_str += f'\n{prev_header_line}\n'
        is_plus_lines = is_minus_lines = False
        if new_content_lines:
            is_plus_lines = any([line.startswith('+') for line in new_content_lines])

Evidence

On a malformed hunk header, the code clears the line buffers but then continues without clearing prev_header_line, so the function is no longer “in a valid hunk” but still remembers the last valid header. Subsequent patch lines (e.g., "+orphan") are appended to new_content_lines regardless of whether a valid hunk header was parsed. At EOF, the function finalizes based on `if prev_header_line and new_content_lines:`, so those orphan lines get emitted under the previous valid header with stale start2, duplicating the header and corrupting the output.

pr_agent/algo/git_patch_processing.py[355-384]
pr_agent/algo/git_patch_processing.py[386-389]
pr_agent/algo/git_patch_processing.py[399-416]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`decouple_and_convert_to_hunks_with_lines_numbers()` can still emit orphan lines if a malformed `@@` header occurs near the end of the patch and is followed by content lines (e.g., `+...`, `-...`, or context). Those lines get buffered and then flushed at EOF under the last valid `prev_header_line`, duplicating the previous hunk header and producing incorrect line-numbered output.

## Issue Context
You already skip malformed `@@` lines via `continue`, and you reset `new_content_lines`/`old_content_lines` on every `@@`. However, after a malformed `@@`, the function currently continues buffering subsequent lines even though it is no longer in a valid hunk, and the EOF flush uses `prev_header_line` + `new_content_lines` as the condition.

## Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[355-416]
- tests/unittest/test_malformed_hunk_header.py[61-76]

## Suggested fix approach
- Introduce an explicit state like `in_valid_hunk` (or reuse `match is not None` carefully) that is set to `True` only after a valid hunk header parse, and set to `False` when encountering a malformed `@@`.
- Only append `+`/`-`/context lines to buffers when `in_valid_hunk` is `True`.
- Alternatively (minimum change): when `match` is falsy in the `@@` branch, also clear `prev_header_line` (and optionally reset `start1/start2/...`) so the EOF finalization cannot attach any later buffered lines to a previous hunk.
- Add a regression test: valid hunk -> malformed `@@` -> orphan lines -> EOF, asserting those orphan lines are not present and the previous hunk header isn’t duplicated.
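
One possible shape for the `in_valid_hunk` approach suggested above, as a sketch only. `group_hunk_lines` is a hypothetical function, the regex is an assumed approximation of `RE_HUNK_HEADER`, and line numbering is left out:

```python
import re

# Assumed pattern for unified-diff hunk headers; the project's RE_HUNK_HEADER may differ.
RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def group_hunk_lines(patch):
    """Toy model: buffer content lines only while inside a valid hunk."""
    hunks = []
    current = None  # (header_line, lines) for the active valid hunk
    in_valid_hunk = False
    for line in patch.splitlines():
        if line.startswith("@@"):
            # Flush whatever valid hunk was active before this header.
            if current and current[1]:
                hunks.append(current)
            current = None
            in_valid_hunk = RE_HUNK_HEADER.match(line) is not None
            if in_valid_hunk:
                current = (line, [])
        elif in_valid_hunk:
            current[1].append(line)
        # Lines seen after a malformed '@@' are dropped, not buffered.
    if current and current[1]:
        hunks.append(current)
    return hunks


# Regression scenario: valid hunk -> malformed '@@' -> orphan lines -> EOF.
PATCH = "\n".join([
    "@@ -1,1 +1,2 @@",
    "+a",
    "@@ malformed",
    "+orphan",
])

result = group_hunk_lines(PATCH)
```

Because the EOF flush depends on an active valid hunk rather than on a remembered header, the orphan line is not emitted and the previous header appears once.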


qodo-free-for-open-source-projects · 2026-04-16T11:05:05Z

+    # finishing last hunk — use prev_header_line (not match/header_line) because
+    # match may have been set to None by a trailing malformed @@ line, and
+    # header_line may point to that malformed line instead of the last valid hunk
+    if prev_header_line and new_content_lines:
+        patch_with_lines_str += f'\n{prev_header_line}\n'
        is_plus_lines = is_minus_lines = False
        if new_content_lines:
            is_plus_lines = any([line.startswith('+') for line in new_content_lines])


1. EOF orphan lines leaked 🐞 Bug ≡ Correctness

decouple_and_convert_to_hunks_with_lines_numbers can append lines that occur after a trailing malformed "@@" header to the previous valid hunk at EOF, duplicating that hunk header and mis-numbering the appended lines. This happens because malformed headers are skipped but subsequent "+"/"-" lines are still buffered, and the EOF flush is triggered by prev_header_line rather than by an “active valid hunk” flag.


gvago added 3 commits April 14, 2026 18:37

qodo-free-for-open-source-projects bot reviewed Apr 16, 2026

gvago wants to merge 3 commits into The-PR-Agent:main from gvago:fix/hunk-header-parse-crash
