fix(patch): guard against None match in hunk header extraction by gvago · Pull Request #2322 · The-PR-Agent/pr-agent

gvago · 2026-04-14T15:37:41Z

Summary

Guard extract_hunk_headers(match) calls against None match results in two functions within pr_agent/algo/git_patch_processing.py
In decouple_and_convert_to_hunks_with_lines_numbers (line ~379): moved extract_hunk_headers inside the existing if match: block and added continue for non-matching @@ lines
In extract_hunk_lines_from_patch (line ~434): added an explicit if not match: guard that sets skip_hunk = True and continues, preventing the crash

Bug

If a line starts with @@ but doesn't fully match the RE_HUNK_HEADER regex pattern, re.match() returns None. The code then calls extract_hunk_headers(None), which invokes None.groups() and raises AttributeError: 'NoneType' object has no attribute 'groups'.

This can occur with malformed or unusual diff output (e.g. conflict markers, truncated patches, or non-standard @@ lines).

Test plan

Verify existing tests pass (pytest tests/ -v)
Manually test with a patch containing a malformed @@ line (e.g. @@ invalid header @@) to confirm it no longer crashes
Verify normal patch processing still works correctly with valid hunk headers

🤖 Generated with Claude Code

In `decouple_and_convert_to_hunks_with_lines_numbers` and `extract_hunk_lines_from_patch`, the call to `extract_hunk_headers(match)` was outside proper `if match:` guards. If a line starts with `@@` but doesn't match `RE_HUNK_HEADER`, `match` is None, causing an `AttributeError` crash on `match.groups()`. Move the `extract_hunk_headers` call inside the match guard in both functions, and skip malformed hunk header lines gracefully.

qodo-free-for-open-source-projects · 2026-04-14T15:39:14Z

Review Summary by Qodo

Guard against None match in hunk header extraction

🐞 Bug fix

Walkthroughs

Description

• Guard extract_hunk_headers() calls against None match results
• Move extraction logic inside match validation blocks
• Skip malformed hunk header lines gracefully instead of crashing
• Prevent AttributeError when @@ lines don't match regex pattern

Diagram

flowchart LR
  A["Line starts with @@"] --> B{"RE_HUNK_HEADER.match()"}
  B -->|Valid match| C["Extract hunk headers"]
  B -->|None/Invalid| D["Skip line gracefully"]
  C --> E["Process hunk content"]
  D --> E

File Changes

1. pr_agent/algo/git_patch_processing.py 🐞 Bug fix +6/-2

Add None match guards in hunk header extraction

• Moved extract_hunk_headers(match) call inside if match: block in
 decouple_and_convert_to_hunks_with_lines_numbers() function
• Added else: continue clause to skip malformed hunk header lines
• Added explicit if not match: guard in extract_hunk_lines_from_patch() function that sets
 skip_hunk = True and continues
• Prevents AttributeError crash when @@ lines don't fully match the RE_HUNK_HEADER regex
 pattern

pr_agent/algo/git_patch_processing.py

qodo-free-for-open-source-projects · 2026-04-14T15:39:15Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (1) 📎 Requirement gaps (0)

🐞\ ≡ Correctness (2) ⭐ New (2)

📘\ ☼ Reliability (1)

1. Orphan lines join next hunk 🐞 ≡ ⭐

Description

In decouple_and_convert_to_hunks_with_lines_numbers(), a malformed @@ line sets match=None and
continues without clearing new_content_lines/old_content_lines, so any subsequent patch lines
are buffered and then incorrectly emitted under the next valid hunk header (wrong hunk association
and wrong line numbers). This can corrupt the diff context passed downstream to PR processing and
suggestions generation.

Code

pr_agent/algo/git_patch_processing.py[R355-362]

        if line.startswith('@@'):
            header_line = line
+            prev_match = match  # save previous match before overwriting
            match = RE_HUNK_HEADER.match(line)
-            if match and (new_content_lines or old_content_lines):  # found a new hunk, split the previous lines
+            if prev_match and (new_content_lines or old_content_lines):  # flush the previous hunk
                if prev_header_line:
                    patch_with_lines_str += f'\n{prev_header_line}\n'
                is_plus_lines = is_minus_lines = False

Evidence

When a malformed header is encountered, the code takes the else: continue path, leaving
match=None and not resetting the content buffers; meanwhile, non-header lines always append into
new_content_lines/old_content_lines. When the next valid header arrives, prev_match is None
(because the malformed header overwrote match), so the flush block doesn’t run and the orphan
buffered lines remain and will be emitted as part of the next hunk with that hunk’s start2 line
numbering. This output is used to build the patch context given to PR processing.

pr_agent/algo/git_patch_processing.py[344-395]
pr_agent/algo/pr_processing.py[183-196]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`decouple_and_convert_to_hunks_with_lines_numbers()` can mis-attribute diff lines that appear after a malformed `@@` header to the next valid hunk because it `continue`s without clearing `new_content_lines` / `old_content_lines`.

### Issue Context
A malformed `@@` header sets `match=None`. Any subsequent `+`/`-`/context lines are still appended to the buffers, and the next valid header won’t flush them because `prev_match` is `None`.

### Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[355-395]

### Implementation notes
- When encountering any `@@` line, after flushing the previous valid hunk (if any), reset `new_content_lines` and `old_content_lines` regardless of whether the previous header matched.
- Additionally, when `match` is `None` (malformed header), ensure buffers are cleared (and optionally reset `start1/start2`) so subsequent non-header lines are treated as orphan content and not attached to the next valid hunk.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Malformed hunk handling untested 📘 ☼

Description

This PR changes patch parsing behavior to skip malformed @@ hunk headers, but no corresponding
pytest coverage is added/updated in this change set. Without tests, regressions could reintroduce
crashes or silently drop hunks when encountering unusual diff output.

Code

pr_agent/algo/git_patch_processing.py[R379-380]

+            else:
+                continue  # skip lines that start with @@ but don't match the hunk header pattern

Evidence

PR Compliance ID 14 requires adding/updating pytest coverage when behavior changes. The diff shows
new guards that alter control flow for malformed hunk headers (skipping/marking hunks as skipped),
but the PR diff contains no test additions exercising these new branches.

AGENTS.md
pr_agent/algo/git_patch_processing.py[376-380]
pr_agent/algo/git_patch_processing.py[433-436]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new guard logic for malformed `@@` hunk headers changes runtime behavior (skip/ignore malformed hunks) but is not covered by tests.
## Issue Context
Previously, malformed `@@` lines could cause `extract_hunk_headers(None)` crashes. The fix now skips those headers; add unit tests to ensure (1) no exception is raised and (2) behavior is stable for both valid and malformed headers.
## Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[376-380]
- pr_agent/algo/git_patch_processing.py[433-436]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. ~~Last hunk can drop~~ ☑ 🐞 ≡

Description

decouple_and_convert_to_hunks_with_lines_numbers() assigns match = RE_HUNK_HEADER.match(line)
before the new else: continue, so a malformed trailing @@ line leaves match=None while
new_content_lines/old_content_lines still contain the last valid hunk. The final flush condition
if match and new_content_lines: then fails, causing the function to return without emitting the
buffered hunk (or to merge lines between malformed and next valid header into the prior hunk).

Code

pr_agent/algo/git_patch_processing.py[R375-380]

              old_content_lines = []
          if match:
              prev_header_line = header_line
-
-            section_header, size1, size2, start1, start2 = extract_hunk_headers(match)
+                section_header, size1, size2, start1, start2 = extract_hunk_headers(match)
+            else:
+                continue  # skip lines that start with @@ but don't match the hunk header pattern

Evidence

Within the loop, match is overwritten on every @@ line (match = RE_HUNK_HEADER.match(line)),
and on a non-matching header the new else: continue skips flushing/resetting buffers. After the
loop, the function only emits the final buffered hunk when match is truthy (`if match and
new_content_lines:), so a trailing malformed @@` header (or a malformed header after the last
flush) can prevent any final emission despite non-empty buffers.

pr_agent/algo/git_patch_processing.py[351-381]
pr_agent/algo/git_patch_processing.py[395-411]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`decouple_and_convert_to_hunks_with_lines_numbers()` can drop the final buffered hunk when a malformed `@@` line occurs after a valid hunk. This happens because `match`/`header_line` are assigned before validating the hunk header, and the end-of-function flush is gated by `if match and new_content_lines:`.
### Issue Context
The PR adds `else: continue` for non-matching `@@` lines, which prevents the crash, but the function’s end-of-loop flush uses the *last assigned* `match` value. A malformed trailing `@@` sets `match=None`, leaving buffered lines unflushed.
### Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[351-412]
### Suggested fix
- Don’t clobber the outer `match` (and `header_line`) unless the hunk header regex matched.
- e.g., use a local `header_match = RE_HUNK_HEADER.match(line)` and `if not header_match: continue` *before* assigning `match = header_match` and `header_line = line`.
- Ensure the “finishing last hunk” logic flushes based on the last **valid** header seen (or on `prev_header_line`/`start2 != -1`) rather than on a potentially overwritten `match` from a malformed header line.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. Trailing deletion hunk skipped 🐞 ≡ ⭐

Description

The last-hunk finalization in decouple_and_convert_to_hunks_with_lines_numbers() only runs when
new_content_lines is non-empty, so a patch ending with a deletion-only hunk (only - lines) will
be dropped from the output. This loses diff context for deletions at EOF.

Code

pr_agent/algo/git_patch_processing.py[R396-403]

+    # finishing last hunk — use prev_header_line (not match/header_line) because
+    # match may have been set to None by a trailing malformed @@ line, and
+    # header_line may point to that malformed line instead of the last valid hunk
+    if prev_header_line and new_content_lines:
+        patch_with_lines_str += f'\n{prev_header_line}\n'
        is_plus_lines = is_minus_lines = False
        if new_content_lines:
            is_plus_lines = any([line.startswith('+') for line in new_content_lines])

Evidence

Deletions are collected into old_content_lines, but last-hunk finalization is gated by `if
prev_header_line and new_content_lines:`, so a deletion-only hunk at the end of the patch never gets
emitted even though old_content_lines contains changes.

pr_agent/algo/git_patch_processing.py[383-415]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The last-hunk finalization block skips trailing deletion-only hunks because it requires `new_content_lines` to be non-empty.

### Issue Context
Deletion-only hunks populate `old_content_lines` but may leave `new_content_lines` empty; the current gate prevents emitting `__old hunk__` for the final hunk.

### Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[396-414]

### Implementation notes
- Change the finalization condition to include either side:
 - `if prev_header_line and (new_content_lines or old_content_lines):`
- Keep the existing `is_minus_lines` logic so deletion hunks still produce `__old hunk__` output.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

ⓘ The new review experience is currently in Beta. Learn more

Previous review results

Review updated until commit 89fbb48

Results up to commit N/A

🐞 Bugs (0)
📘 Rule violations (1)
📎 Requirement gaps (0)

1. Malformed hunk handling untested 📘 ☼

Description

This PR changes patch parsing behavior to skip malformed @@ hunk headers, but no corresponding
pytest coverage is added/updated in this change set. Without tests, regressions could reintroduce
crashes or silently drop hunks when encountering unusual diff output.

Code

pr_agent/algo/git_patch_processing.py[R379-380]

+            else:
+                continue  # skip lines that start with @@ but don't match the hunk header pattern

Evidence

PR Compliance ID 14 requires adding/updating pytest coverage when behavior changes. The diff shows
new guards that alter control flow for malformed hunk headers (skipping/marking hunks as skipped),
but the PR diff contains no test additions exercising these new branches.

AGENTS.md
pr_agent/algo/git_patch_processing.py[376-380]
pr_agent/algo/git_patch_processing.py[433-436]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new guard logic for malformed `@@` hunk headers changes runtime behavior (skip/ignore malformed hunks) but is not covered by tests.
## Issue Context
Previously, malformed `@@` lines could cause `extract_hunk_headers(None)` crashes. The fix now skips those headers; add unit tests to ensure (1) no exception is raised and (2) behavior is stable for both valid and malformed headers.
## Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[376-380]
- pr_agent/algo/git_patch_processing.py[433-436]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. ~~Last hunk can drop~~ ☑ 🐞 ≡

Description

decouple_and_convert_to_hunks_with_lines_numbers() assigns match = RE_HUNK_HEADER.match(line)
before the new else: continue, so a malformed trailing @@ line leaves match=None while
new_content_lines/old_content_lines still contain the last valid hunk. The final flush condition
if match and new_content_lines: then fails, causing the function to return without emitting the
buffered hunk (or to merge lines between malformed and next valid header into the prior hunk).

Code

pr_agent/algo/git_patch_processing.py[R375-380]

               old_content_lines = []
           if match:
               prev_header_line = header_line
-
-            section_header, size1, size2, start1, start2 = extract_hunk_headers(match)
+                section_header, size1, size2, start1, start2 = extract_hunk_headers(match)
+            else:
+                continue  # skip lines that start with @@ but don't match the hunk header pattern

Evidence

Within the loop, match is overwritten on every @@ line (match = RE_HUNK_HEADER.match(line)),
and on a non-matching header the new else: continue skips flushing/resetting buffers. After the
loop, the function only emits the final buffered hunk when match is truthy (`if match and
new_content_lines:), so a trailing malformed @@` header (or a malformed header after the last
flush) can prevent any final emission despite non-empty buffers.

pr_agent/algo/git_patch_processing.py[351-381]
pr_agent/algo/git_patch_processing.py[395-411]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`decouple_and_convert_to_hunks_with_lines_numbers()` can drop the final buffered hunk when a malformed `@@` line occurs after a valid hunk. This happens because `match`/`header_line` are assigned before validating the hunk header, and the end-of-function flush is gated by `if match and new_content_lines:`.
### Issue Context
The PR adds `else: continue` for non-matching `@@` lines, which prevents the crash, but the function’s end-of-loop flush uses the *last assigned* `match` value. A malformed trailing `@@` sets `match=None`, leaving buffered lines unflushed.
### Fix Focus Areas
- pr_agent/algo/git_patch_processing.py[351-412]
### Suggested fix
- Don’t clobber the outer `match` (and `header_line`) unless the hunk header regex matched.
- e.g., use a local `header_match = RE_HUNK_HEADER.match(line)` and `if not header_match: continue` *before* assigning `match = header_match` and `header_line = line`.
- Ensure the “finishing last hunk” logic flushes based on the last **valid** header seen (or on `prev_header_line`/`start2 != -1`) rather than on a potentially overwritten `match` from a malformed header line.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-free-for-open-source-projects · 2026-04-14T15:43:02Z

+            else:
+                continue  # skip lines that start with @@ but don't match the hunk header pattern


1. Malformed hunk handling untested 📘 Rule violation ☼ Reliability

This PR changes patch parsing behavior to skip malformed @@ hunk headers, but no corresponding pytest coverage is added/updated in this change set. Without tests, regressions could reintroduce crashes or silently drop hunks when encountering unusual diff output.

Agent Prompt

## Issue description The new guard logic for malformed `@@` hunk headers changes runtime behavior (skip/ignore malformed hunks) but is not covered by tests. ## Issue Context Previously, malformed `@@` lines could cause `extract_hunk_headers(None)` crashes. The fix now skips those headers; add unit tests to ensure (1) no exception is raised and (2) behavior is stable for both valid and malformed headers. ## Fix Focus Areas - pr_agent/algo/git_patch_processing.py[376-380] - pr_agent/algo/git_patch_processing.py[433-436]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

… @@ lines The decouple_and_convert_to_hunks_with_lines_numbers() function overwrote `match` with the new RE_HUNK_HEADER result before checking whether the previous hunk needed to be finalized. When a malformed @@ line produced match=None, the flush condition `if match and (new/old_content_lines)` was False, silently dropping the previous hunk's content. Fix: save `prev_match` before overwriting and use it for the flush decision. Also use `prev_header_line` instead of `match`/`header_line` in the post-loop finalization, so a trailing malformed @@ cannot suppress the last valid hunk. Adds 7 unit tests covering malformed @@ scenarios: crash safety, content preservation, trailing malformed headers, line-number accuracy, all-malformed patches, and deletion-only hunks.

qodo-free-for-open-source-projects · 2026-04-15T08:36:43Z

Persistent review updated to latest commit 89fbb48

qodo-free-for-open-source-projects · 2026-04-15T08:40:42Z

        if line.startswith('@@'):
            header_line = line
+            prev_match = match  # save previous match before overwriting
            match = RE_HUNK_HEADER.match(line)
-            if match and (new_content_lines or old_content_lines):  # found a new hunk, split the previous lines
+            if prev_match and (new_content_lines or old_content_lines):  # flush the previous hunk
                if prev_header_line:
                    patch_with_lines_str += f'\n{prev_header_line}\n'
                is_plus_lines = is_minus_lines = False


1. Orphan lines join next hunk 🐞 Bug ≡ Correctness

In decouple_and_convert_to_hunks_with_lines_numbers(), a malformed @@ line sets match=None and continues without clearing new_content_lines/old_content_lines, so any subsequent patch lines are buffered and then incorrectly emitted under the next valid hunk header (wrong hunk association and wrong line numbers). This can corrupt the diff context passed downstream to PR processing and suggestions generation.

Agent Prompt

### Issue description `decouple_and_convert_to_hunks_with_lines_numbers()` can mis-attribute diff lines that appear after a malformed `@@` header to the next valid hunk because it `continue`s without clearing `new_content_lines` / `old_content_lines`. ### Issue Context A malformed `@@` header sets `match=None`. Any subsequent `+`/`-`/context lines are still appended to the buffers, and the next valid header won’t flush them because `prev_match` is `None`. ### Fix Focus Areas - pr_agent/algo/git_patch_processing.py[355-395] ### Implementation notes - When encountering any `@@` line, after flushing the previous valid hunk (if any), reset `new_content_lines` and `old_content_lines` regardless of whether the previous header matched. - Additionally, when `match` is `None` (malformed header), ensure buffers are cleared (and optionally reset `start1/start2`) so subsequent non-header lines are treated as orphan content and not attached to the next valid hunk.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

gvago · 2026-04-16T08:46:28Z

Closing to reopen from fork (lost push access to org branch).

qodo-free-for-open-source-projects bot reviewed Apr 14, 2026

View reviewed changes

qodo-free-for-open-source-projects bot reviewed Apr 15, 2026

View reviewed changes

gvago closed this Apr 16, 2026

gvago mentioned this pull request Apr 16, 2026

fix(patch): guard against None match in hunk header extraction #2330

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(patch): guard against None match in hunk header extraction#2322

fix(patch): guard against None match in hunk header extraction#2322
gvago wants to merge 2 commits intomainfrom
fix/hunk-header-parse-crash

gvago commented Apr 14, 2026

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 14, 2026

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 14, 2026 •

edited

Loading

Uh oh!

qodo-free-for-open-source-projects bot Apr 14, 2026

Uh oh!

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 15, 2026 •

edited

Loading

Uh oh!

qodo-free-for-open-source-projects bot Apr 15, 2026

Uh oh!

gvago commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		else:
		continue # skip lines that start with @@ but don't match the hunk header pattern

Conversation

gvago commented Apr 14, 2026

Summary

Bug

Test plan

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 14, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Previous review results

Uh oh!

qodo-free-for-open-source-projects bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

qodo-free-for-open-source-projects bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qodo-free-for-open-source-projects bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

gvago commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qodo-free-for-open-source-projects bot commented Apr 14, 2026 •

edited

Loading

qodo-free-for-open-source-projects bot commented Apr 15, 2026 •

edited

Loading