Skip to content

fix(gerrit): use action parameter for dispatch and clean up temp repos#2331

Open
gvago wants to merge 3 commits intoThe-PR-Agent:mainfrom
gvago:fix/gerrit-action-param-and-temp-cleanup
Open

fix(gerrit): use action parameter for dispatch and clean up temp repos#2331
gvago wants to merge 3 commits intoThe-PR-Agent:mainfrom
gvago:fix/gerrit-action-param-and-temp-cleanup

Conversation

@gvago
Copy link
Copy Markdown

@gvago gvago commented Apr 16, 2026

Summary

  • Action parameter ignored: The {action} path parameter in POST /api/v1/gerrit/{action} was captured but never used for command dispatch. Instead, item.msg was always passed as the command. Now the action enum value determines the command (/review, /describe, etc.), with item.msg only appended for the ask action. Also made msg optional with a default empty string since non-ask actions don't require it.
  • HTTPException returned instead of raised: return HTTPException(...) silently returns a 200 with the exception object serialized as JSON. Changed to raise HTTPException(...) so the 400 status code is actually sent.
  • Temp repo never cleaned up: prepare_repo() creates a temp directory via mkdtemp() but remove_initial_comment() had shutil.rmtree commented out. Added a cleanup() method to GerritProvider that safely removes the temp directory, called from the server's finally block after each request. A __del__ method serves as a safety net.
  • NEW: remove_initial_comment() no longer calls cleanup() — it's invoked during the request lifecycle while the repo is still needed. Cleanup only happens in the server's finally block and __del__.

Test plan

  • Verify POST /api/v1/gerrit/review with empty msg dispatches /review command (not / )
  • Verify POST /api/v1/gerrit/ask with msg: "what does this do?" dispatches /ask what does this do?
  • Verify POST /api/v1/gerrit/ask with empty msg returns 400
  • Verify temp directories under /tmp/ are cleaned up after request completes (via finally block), not during the request
  • Verify temp directories are cleaned up even when handle_request raises an exception

Replaces #2324 (lost push access to org branch).

gvago and others added 3 commits April 14, 2026 19:16
The Gerrit server had two bugs:

1. The {action} path parameter was captured but ignored - the server
   always dispatched based on item.msg content. Now the action parameter
   determines the command, with item.msg only used for the "ask" action
   to provide the question text. Also made item.msg optional (defaults
   to empty string) since non-ask actions don't need it, and fixed
   `return HTTPException` to `raise HTTPException`.

2. prepare_repo() created a temp directory via mkdtemp() that was never
   cleaned up - remove_initial_comment() had shutil.rmtree commented
   out. Added a cleanup() method to GerritProvider that removes the
   temp repo, called from the server's finally block after each request.
   Also added __del__ as a safety net, and wired remove_initial_comment()
   to delegate to cleanup().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- cleanup(): catch (OSError, PermissionError) instead of broad Exception
- gerrit_server finally block: catch (LookupError, RuntimeError) for
  starlette context access failures instead of bare Exception
- Add isinstance guard for git_providers dict before iterating
- Document that GerritProvider.__del__ acts as safety net when the
  provider is never stored in context (e.g. early failure)
…lly block

remove_initial_comment() is called during the request lifecycle while the
cloned repo is still needed by subsequent commands. Calling cleanup() here
deletes the repo mid-execution, breaking all downstream operations.

Revert to a no-op — actual cleanup happens in the server's finally block
and in __del__ as a safety net.
@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Fix Gerrit action parameter dispatch and temp repo cleanup

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Use action parameter to dispatch correct command instead of ignoring it
• Fix HTTPException to be raised instead of returned for proper error handling
• Clean up temporary repositories after request completes via finally block
• Add __del__ safety net for cleanup when provider not stored in context
Diagram
flowchart LR
  A["Gerrit Request<br/>with action param"] -->|"action determines<br/>command"| B["PRAgent<br/>handle_request"]
  B -->|"try block"| C["Process Request"]
  C -->|"finally block"| D["Cleanup temp repo"]
  D -->|"via cleanup()"| E["Remove temp directory"]
  F["GerritProvider<br/>__del__"] -->|"safety net"| E
Loading

Grey Divider

File Changes

1. pr_agent/git_providers/gerrit_provider.py 🐞 Bug fix +23/-2

Add cleanup methods and update comment removal

• Added cleanup() method to safely remove temporary cloned repository with error handling
• Added __del__() method as safety net for cleanup during garbage collection
• Updated remove_initial_comment() to be no-op with documentation explaining cleanup happens in
 server finally block

pr_agent/git_providers/gerrit_provider.py


2. pr_agent/servers/gerrit_server.py 🐞 Bug fix +37/-6

Use action parameter and add cleanup in finally block

• Changed msg field to optional with empty string default since non-ask actions don't require it
• Fixed return HTTPException to raise HTTPException for proper 400 status code
• Implemented action parameter dispatch logic: use action enum value for command, append item.msg
 only for ask action
• Added try/finally block to call cleanup() on GerritProvider after request completes
• Added defensive error handling for starlette context access with fallback to __del__ cleanup

pr_agent/servers/gerrit_server.py


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Apr 16, 2026

Code Review by Qodo

🐞 Bugs (2)   📘 Rule violations (3)   📎 Requirement gaps (0)
🐞\ ☼ Reliability (2)
📘\ ≡ Correctness (1) ☼ Reliability (2)

Grey Divider


Action required

1. ask allows blank msg 📘
Description
The ask flow only checks if not item.msg, so a whitespace-only msg passes validation and
produces a command like /ask , which can cause unexpected behavior downstream. External inputs
should be normalized/validated (e.g., strip() then empty-check) before use.
Code

pr_agent/servers/gerrit_server.py[R41-51]

+    # For the "ask" action, the question must come from item.msg.
+    # For all other actions, use the action path parameter as the command.
    if action == Action.ask:
        if not item.msg:
-            return HTTPException(
+            raise HTTPException(
                status_code=400,
                detail="msg is required for ask command"
            )
-    await PRAgent().handle_request(
-        f"{item.project}:{item.refspec}",
-        f"/{item.msg.strip()}"
-    )
+        command = f"/{action.value} {item.msg.strip()}"
+    else:
+        command = f"/{action.value}"
Evidence
PR Compliance ID 19 requires defensive validation/normalization of external inputs before calling
methods/using values in logic. The new code validates item.msg without stripping, then uses
item.msg.strip() to build the command, allowing whitespace-only input to slip through.

pr_agent/servers/gerrit_server.py[41-51]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `ask` action currently accepts whitespace-only `msg` values because it checks `if not item.msg` before using `item.msg.strip()`.

## Issue Context
`item.msg` is external request input; whitespace-only content should be treated as empty and return HTTP 400.

## Fix Focus Areas
- pr_agent/servers/gerrit_server.py[41-51]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Ask temp repo leak 🐞
Description
The new finally cleanup only cleans GerritProvider instances stored in starlette_context, but
the ask command path constructs its provider via get_git_provider() (not
get_git_provider_with_context()), so its mkdtemp clone is not cleaned by the finally block.
Code

pr_agent/servers/gerrit_server.py[R70-75]

+        try:
+            git_providers = context.get("git_provider", {})
+            if isinstance(git_providers, dict):
+                for provider in git_providers.values():
+                    if isinstance(provider, GerritProvider):
+                        provider.cleanup()
Evidence
The cleanup loop only inspects context["git_provider"] values. The /ask command is routed to
PRQuestions, which instantiates its provider directly via get_git_provider()(pr_url), and
GerritProvider clones into a mkdtemp() directory in prepare_repo() during initialization; that
instance is never stored in context["git_provider"], so it will not be reached by the added
cleanup logic.

pr_agent/servers/gerrit_server.py[53-80]
pr_agent/agent/pr_agent.py[24-44]
pr_agent/tools/pr_questions.py[18-25]
pr_agent/git_providers/gerrit_provider.py[102-110]
pr_agent/git_providers/gerrit_provider.py[175-195]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Temp repo cleanup in `gerrit_server.handle_gerrit_request` only cleans providers stored in `starlette_context`. However, the `ask` action uses PRQuestions, which creates its GerritProvider via `get_git_provider()` and therefore bypasses the context cache, leaving the mkdtemp clone untracked and not cleaned in the request `finally`.

### Issue Context
- `handle_gerrit_request` cleans only `context.get("git_provider", {})` values.
- PRQuestions (used by `/ask`) constructs providers via `get_git_provider()(pr_url)`.

### Fix Focus Areas
- pr_agent/servers/gerrit_server.py[53-80]
- pr_agent/tools/pr_questions.py[18-25]
- pr_agent/tools/pr_line_questions.py[21-26]
- pr_agent/tools/pr_add_docs.py[20-25]
- pr_agent/tools/pr_config.py[17-22]
- pr_agent/tools/pr_generate_labels.py[27-32]
- pr_agent/tools/pr_update_changelog.py[22-27]

### Suggested change
In server/webhook execution paths, ensure tools use `get_git_provider_with_context(pr_url)` (or a helper that uses it when starlette_context is available) so the created GerritProvider instances are stored in context and reliably cleaned in the server `finally` block.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. __del__ catches broad Exception 📘
Description
__del__ uses except Exception: pass, which can mask unexpected errors during cleanup and makes
failures hard to detect/debug. The checklist requires targeted exception handling and preserving
context rather than broad exception suppression.
Code

pr_agent/git_providers/gerrit_provider.py[R402-407]

+    def __del__(self):
+        """Safety net: clean up temp repo if cleanup() was not called."""
+        try:
+            self.cleanup()
+        except Exception:
+            pass
Evidence
PR Compliance ID 20 prohibits broad except Exception patterns that mask bugs and reduce
debuggability. The new destructor suppresses all exceptions without logging or chaining.

pr_agent/git_providers/gerrit_provider.py[402-407]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GerritProvider.__del__` swallows all exceptions via `except Exception: pass`, which can hide real failures.

## Issue Context
The compliance checklist requires targeted exception handling and preserving context; destructors should avoid masking unexpected issues.

## Fix Focus Areas
- pr_agent/git_providers/gerrit_provider.py[402-407]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Gerrit changes lack tests 📘
Description
This PR changes request dispatch behavior and adds temp-repo cleanup logic, but no pytest tests were
added/updated to cover these new behaviors (e.g., action-based dispatch, ask validation, cleanup
in finally). This increases regression risk for the Gerrit server endpoint.
Code

pr_agent/servers/gerrit_server.py[R36-80]

@router.post("/api/v1/gerrit/{action}")
async def handle_gerrit_request(action: Action, item: Item):
    get_logger().debug("Received a Gerrit request")
    context["settings"] = copy.deepcopy(global_settings)

+    # For the "ask" action, the question must come from item.msg.
+    # For all other actions, use the action path parameter as the command.
    if action == Action.ask:
        if not item.msg:
-            return HTTPException(
+            raise HTTPException(
                status_code=400,
                detail="msg is required for ask command"
            )
-    await PRAgent().handle_request(
-        f"{item.project}:{item.refspec}",
-        f"/{item.msg.strip()}"
-    )
+        command = f"/{action.value} {item.msg.strip()}"
+    else:
+        command = f"/{action.value}"
+
+    try:
+        await PRAgent().handle_request(
+            f"{item.project}:{item.refspec}",
+            command
+        )
+    finally:
+        # Clean up the cloned temp repo created by GerritProvider.
+        # The provider is cached in the starlette context during
+        # get_git_provider_with_context().
+        #
+        # We guard against two failure modes:
+        #   1. The starlette context is inaccessible (e.g. middleware not
+        #      active) — caught by the outer try/except.
+        #   2. The provider was never stored in the context (e.g. an error
+        #      occurred before get_git_provider_with_context ran) — the
+        #      dict will simply be empty, and the GerritProvider.__del__
+        #      safety net handles cleanup on garbage collection.
+        try:
+            git_providers = context.get("git_provider", {})
+            if isinstance(git_providers, dict):
+                for provider in git_providers.values():
+                    if isinstance(provider, GerritProvider):
+                        provider.cleanup()
+        except (LookupError, RuntimeError):
+            get_logger().debug(
+                "Could not retrieve GerritProvider for cleanup; "
+                "temp directory will be cleaned up by __del__"
+            )
Evidence
PR Compliance ID 15 requires adding/updating pytest tests when behavior changes are introduced. The
diff shows behavioral changes in the Gerrit request handler and cleanup logic, without corresponding
test changes in the PR diff.

AGENTS.md
pr_agent/servers/gerrit_server.py[36-80]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Behavior changes were introduced in Gerrit request dispatch and cleanup, but there are no accompanying pytest tests.

## Issue Context
Add/update tests under `tests/unittest/`, `tests/e2e_tests/`, or `tests/health_test/` to cover:
- `/api/v1/gerrit/{action}` dispatch behavior
- `ask` requiring non-empty (post-strip) `msg`
- cleanup invoked from `finally`

## Fix Focus Areas
- pr_agent/servers/gerrit_server.py[36-80]
- pr_agent/git_providers/gerrit_provider.py[390-414]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Context cache key bug 🐞
Description
get_git_provider_with_context checks for a literal key "pr_url" and overwrites
context["git_provider"] instead of caching by the actual pr_url, causing repeated provider
creation and making the server cleanup able to see at most the last stored provider.
Code

pr_agent/servers/gerrit_server.py[R70-75]

+        try:
+            git_providers = context.get("git_provider", {})
+            if isinstance(git_providers, dict):
+                for provider in git_providers.values():
+                    if isinstance(provider, GerritProvider):
+                        provider.cleanup()
Evidence
The cache lookup uses context.get("git_provider", {}).get("pr_url", {}) and reads
context["git_provider"]["pr_url"], which will not match the insertion path that writes
context["git_provider"] = {pr_url: git_provider}. This means the function will keep creating new
providers and overwriting the dict, so the cleanup loop that iterates
context["git_provider"].values() cannot reliably clean all providers created during the request.

pr_agent/git_providers/init.py[40-65]
pr_agent/git_providers/utils.py[14-18]
pr_agent/servers/gerrit_server.py[58-76]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`get_git_provider_with_context` never hits its cache because it looks up the literal key `"pr_url"` and overwrites `context["git_provider"]` with a one-entry dict. This causes repeated provider construction (extra clones) and reduces the effectiveness of the request-level cleanup.

### Issue Context
The function stores providers as `{pr_url: git_provider}` but checks/reads `context["git_provider"]["pr_url"]`.

### Fix Focus Areas
- pr_agent/git_providers/__init__.py[40-65]

### Suggested change
- Replace the cache check with something like:
 - `providers = context.get("git_provider", {})`
 - `if is_context_env and isinstance(providers, dict) and pr_url in providers: return providers[pr_url]`
- When storing:
 - `providers = context.setdefault("git_provider", {})`
 - `providers[pr_url] = git_provider`
This preserves multiple providers in a request and makes cleanup via iterating `.values()` complete.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Comment on lines +41 to +51
# For the "ask" action, the question must come from item.msg.
# For all other actions, use the action path parameter as the command.
if action == Action.ask:
if not item.msg:
return HTTPException(
raise HTTPException(
status_code=400,
detail="msg is required for ask command"
)
await PRAgent().handle_request(
f"{item.project}:{item.refspec}",
f"/{item.msg.strip()}"
)
command = f"/{action.value} {item.msg.strip()}"
else:
command = f"/{action.value}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. ask allows blank msg 📘 Rule violation ≡ Correctness

The ask flow only checks if not item.msg, so a whitespace-only msg passes validation and
produces a command like /ask , which can cause unexpected behavior downstream. External inputs
should be normalized/validated (e.g., strip() then empty-check) before use.
Agent Prompt
## Issue description
The `ask` action currently accepts whitespace-only `msg` values because it checks `if not item.msg` before using `item.msg.strip()`.

## Issue Context
`item.msg` is external request input; whitespace-only content should be treated as empty and return HTTP 400.

## Fix Focus Areas
- pr_agent/servers/gerrit_server.py[41-51]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +70 to +75
try:
git_providers = context.get("git_provider", {})
if isinstance(git_providers, dict):
for provider in git_providers.values():
if isinstance(provider, GerritProvider):
provider.cleanup()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Ask temp repo leak 🐞 Bug ☼ Reliability

The new finally cleanup only cleans GerritProvider instances stored in starlette_context, but
the ask command path constructs its provider via get_git_provider() (not
get_git_provider_with_context()), so its mkdtemp clone is not cleaned by the finally block.
Agent Prompt
### Issue description
Temp repo cleanup in `gerrit_server.handle_gerrit_request` only cleans providers stored in `starlette_context`. However, the `ask` action uses PRQuestions, which creates its GerritProvider via `get_git_provider()` and therefore bypasses the context cache, leaving the mkdtemp clone untracked and not cleaned in the request `finally`.

### Issue Context
- `handle_gerrit_request` cleans only `context.get("git_provider", {})` values.
- PRQuestions (used by `/ask`) constructs providers via `get_git_provider()(pr_url)`.

### Fix Focus Areas
- pr_agent/servers/gerrit_server.py[53-80]
- pr_agent/tools/pr_questions.py[18-25]
- pr_agent/tools/pr_line_questions.py[21-26]
- pr_agent/tools/pr_add_docs.py[20-25]
- pr_agent/tools/pr_config.py[17-22]
- pr_agent/tools/pr_generate_labels.py[27-32]
- pr_agent/tools/pr_update_changelog.py[22-27]

### Suggested change
In server/webhook execution paths, ensure tools use `get_git_provider_with_context(pr_url)` (or a helper that uses it when starlette_context is available) so the created GerritProvider instances are stored in context and reliably cleaned in the server `finally` block.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant