fix(gitea): fetch PR-specific commits instead of all repo commits#2309
fix(gitea): fetch PR-specific commits instead of all repo commits#2309lawrence3699 wants to merge 1 commit intoThe-PR-Agent:mainfrom
Conversation
Review Summary by QodoFix Gitea provider to fetch PR-specific commits instead of all repo commits
WalkthroughsDescription• Changed PR commits endpoint from repo-level to PR-specific endpoint • Replaced list_all_commits() with get_pr_commits() API call • Added proper handling for empty commits list with SimpleNamespace wrapper • Added regression test verifying PR-specific commits are fetched Diagramflowchart LR
A["GiteaProvider.__init__"] -->|"Previously called"| B["repo_api.list_all_commits<br/>all repo commits"]
A -->|"Now calls"| C["repo_api.get_pr_commits<br/>PR-specific commits"]
C --> D["SimpleNamespace wrapper<br/>for last_commit"]
D --> E["Proper null handling<br/>if no commits"]
File Changes1. pr_agent/git_providers/gitea_provider.py
|
Code Review by Qodo
|
| self.pr_commits = pr_commits_data if pr_commits_data else [] | ||
| if self.pr_commits: | ||
| self.last_commit = SimpleNamespace(**self.pr_commits[-1]) | ||
| else: | ||
| self.last_commit = None |
There was a problem hiding this comment.
1. last_commit uses [-1] 📎 Requirement gap ≡ Correctness
get_pr_commits() returns commits newest-first per the Gitea PR commits endpoint behavior, but __init__ selects self.pr_commits[-1], which points to the oldest commit. This can cause commit-dependent behaviors to reference a non-head SHA and fail to detect new PR commits correctly.
Agent Prompt
## Issue description
`GiteaProvider.__init__` sets `last_commit` from `self.pr_commits[-1]`, but the PR commits endpoint returns commits newest-first; this makes `last_commit` the oldest commit instead of the PR head.
## Issue Context
Downstream features use `last_commit`/`last_commit_id` as the PR head SHA for URLs, change detection, inline comments, and file content retrieval.
## Fix Focus Areas
- pr_agent/git_providers/gitea_provider.py[95-99]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| self.pr_commits = pr_commits_data if pr_commits_data else [] | ||
| if self.pr_commits: | ||
| self.last_commit = SimpleNamespace(**self.pr_commits[-1]) | ||
| else: | ||
| self.last_commit = None | ||
| self.last_commit_id = self.last_commit | ||
| self.base_sha = self.pr.base.sha if self.pr.base.sha else "" |
There was a problem hiding this comment.
2. Null last_commit crash 🐞 Bug ☼ Reliability
GiteaProvider.__init__ can set self.last_commit/self.last_commit_id to None when get_pr_commits() returns an empty list, but methods like get_latest_commit_url() and _get_file_content_from_latest_commit() dereference .html_url/.sha unconditionally, causing AttributeError at runtime. This is reachable because RepoApi.get_pr_commits() returns [] on API errors/exceptions, so a transient Gitea/API issue will crash later tool flows (e.g., PR description header generation).
Agent Prompt
### Issue description
`GiteaProvider.__init__` can leave `self.last_commit` as `None` when PR commits cannot be fetched (empty list), but later methods assume `last_commit`/`last_commit_id` always exist and unconditionally access `.sha`/`.html_url`, leading to `AttributeError` crashes.
### Issue Context
`RepoApi.get_pr_commits()` returns `[]` on exceptions, making the empty-commits path reachable during transient API/network failures.
### Fix Focus Areas
- pr_agent/git_providers/gitea_provider.py[87-103]
- pr_agent/git_providers/gitea_provider.py[232-234]
- pr_agent/git_providers/gitea_provider.py[442-448]
- pr_agent/git_providers/gitea_provider.py[1019-1050]
### Suggested fix
- In `__init__`, if `get_pr_commits()` returns empty:
- Prefer a safe fallback: build `last_commit` from `self.pr.head.sha` (already available as `self.sha`) and (if needed) synthesize an `html_url`, or
- Fail fast with a clear exception/log and stop initialization (instead of leaving `last_commit=None`).
- Add guards in `get_latest_commit_url()` and `_get_file_content_from_latest_commit()` to avoid dereferencing `None` (e.g., return "" / use `self.sha` fallback) so downstream tools don’t crash.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Pull request overview
Fixes GiteaProvider initialization to fetch PR-specific commits (instead of repo default-branch commits) so last_commit/last_commit_id reflects the PR head and downstream behaviors (inline comments, persistent review updates, file-content reads) anchor to the correct commit.
Changes:
- Switch
GiteaProvider.__init__from repo-level commits toget_pr_commits(owner, repo, pr_number). - Wrap the selected commit dict in
SimpleNamespaceto preserve.sha/.html_urlattribute access. - Add a regression unit test asserting the PR commits endpoint is called and
last_commit.shamatches expectations.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pr_agent/git_providers/gitea_provider.py | Uses PR commits endpoint during provider init and adapts JSON commit shape for existing attribute-based code. |
| tests/unittest/test_gitea_provider.py | Adds regression coverage ensuring init calls the PR commits endpoint and sets last_commit appropriately. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pr_commits_data = self.repo_api.get_pr_commits( | ||
| owner=self.owner, | ||
| repo=self.repo | ||
| repo=self.repo, | ||
| pr_number=self.pr_number | ||
| ) | ||
| self.last_commit = self.pr_commits[-1] | ||
| self.pr_commits = pr_commits_data if pr_commits_data else [] | ||
| if self.pr_commits: | ||
| self.last_commit = SimpleNamespace(**self.pr_commits[-1]) | ||
| else: | ||
| self.last_commit = None | ||
| self.last_commit_id = self.last_commit |
There was a problem hiding this comment.
last_commit is derived from self.pr_commits[-1], which assumes the PR commits endpoint returns commits in oldest→newest order. If Gitea returns PR commits newest-first (as reported in #2206), this will still select the wrong commit. Prefer selecting the commit whose sha matches self.pr.head.sha (already stored in self.sha) rather than relying on list ordering (with a sensible fallback if not found).
| pr_commits_json = json.dumps([ | ||
| {"sha": "older_commit_sha", "html_url": "https://gitea.example.com/owner/repo/commit/older_commit_sha"}, | ||
| {"sha": "latest_commit_sha", "html_url": "https://gitea.example.com/owner/repo/commit/latest_commit_sha"}, | ||
| ]) | ||
|
|
||
| mock_pr = MagicMock() | ||
| mock_pr.head.sha = "latest_commit_sha" | ||
| mock_pr.base.sha = "base_sha" | ||
| mock_pr.base.ref = "main" | ||
|
|
||
| def call_api_side_effect(path, method, **kwargs): | ||
| called_paths.append(path) | ||
| mock_resp = MagicMock() | ||
| if 'pulls' in path and 'files' in path: | ||
| mock_resp.data = BytesIO(b'[]') | ||
| elif 'pulls' in path and 'commits' in path: | ||
| mock_resp.data = BytesIO(pr_commits_json.encode()) | ||
| elif path.endswith('.diff'): | ||
| mock_resp.data = BytesIO(b'') | ||
| else: | ||
| mock_resp.data = BytesIO(b'{}') | ||
| return mock_resp | ||
|
|
||
| mock_api_client.call_api.side_effect = call_api_side_effect | ||
| mock_filter.side_effect = lambda files, **kw: files | ||
|
|
||
| # Mock repo_get_pull_request to return our PR object | ||
| with patch('pr_agent.git_providers.gitea_provider.giteapy.RepositoryApi.repo_get_pull_request', return_value=mock_pr): | ||
| from pr_agent.git_providers.gitea_provider import GiteaProvider | ||
| provider = GiteaProvider("https://gitea.example.com/owner/repo/pulls/42") | ||
|
|
||
| # Verify PR-specific commits endpoint was called (not repo-level commits) | ||
| pr_commits_calls = [p for p in called_paths if 'commits' in p] | ||
| assert any('/pulls/' in p and '/commits' in p for p in pr_commits_calls), \ | ||
| f"Expected PR-specific commits endpoint, got: {pr_commits_calls}" | ||
| assert not any(p.endswith('/commits') and '/pulls/' not in p for p in pr_commits_calls), \ | ||
| f"Should not call repo-level commits endpoint, got: {pr_commits_calls}" | ||
|
|
||
| # Verify last_commit has the latest PR commit's SHA | ||
| assert provider.last_commit is not None | ||
| assert provider.last_commit.sha == "latest_commit_sha" | ||
| assert provider.last_commit.html_url == "https://gitea.example.com/owner/repo/commit/latest_commit_sha" |
There was a problem hiding this comment.
This regression test encodes an ordering assumption by returning commits as [older, latest] and asserting last_commit.sha == "latest_commit_sha". If the real Gitea endpoint returns commits newest-first, this test will pass while production behavior is wrong (or vice versa). Make the test order-independent by asserting provider.last_commit.sha == mock_pr.head.sha and/or by varying the mocked commit order.
PR Type
Bug fix
Description
GiteaProvider.__init__callslist_all_commits(owner, repo)which hitsGET /repos/{owner}/{repo}/commits— the repo-level commits endpoint. This returns commits from the default branch, not from the PR branch. As a result,self.last_commitpoints at the wrong commit.The same file already has
get_pr_commits(owner, repo, pr_number)which correctly callsGET /repos/{owner}/{repo}/pulls/{pr_number}/commits, and it's already used byget_commit_messages()(line 402). This fix switches__init__to use that method too.Before:
last_commitis the oldest commit on the default branch → inline comments posted against wrong commit, persistent reviews never detect new changes,_get_file_content_from_latest_commit()fetches wrong content.After:
last_commitis the latest commit on the PR branch, consistent with the GitHub provider.Since
get_pr_commitsreturns raw JSON dicts (unlike the giteapy model objects fromlist_all_commits), the fix wraps the commit dict in aSimpleNamespaceso existing.sha/.html_urlattribute access continues to work.Added a regression test that verifies the PR-specific endpoint is called and
last_commit.shamatches the expected value.Fixes #2206