Mock OpenAIRE API in test to avoid test failure due to stale fixtures#39
Mock OpenAIRE API in test to avoid test failure due to stale fixtures#39ssheikholeslami merged 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the test_openaire_v2 test to use mocked API responses instead of calling the live OpenAIRE API. This prevents test failures caused by changes in the API response format over time.
Changes:
- Added
make_dummy_get_success()factory function to create mock HTTP responses from fixture data - Updated
test_openaire_v2to usemonkeypatchto mocksession.get()with fixture data - Followed the existing pattern established by
test_api_403_response
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert str(e.value) == expected | ||
|
|
||
| def test_openaire_v2(self, session): | ||
| def test_openaire_v2(self, session, monkeypatch): |
There was a problem hiding this comment.
The test class is named TestMetadataFetcher403, which suggests it's specifically for testing 403 responses. However, test_openaire_v2 is testing a successful API response (200 status code), not a 403 error. Consider either:
- Renaming the class to something more general like
TestMetadataFetcherorTestOpenAIREMetadataFetcher, or - Moving
test_openaire_v2to a separate class with an appropriate name likeTestMetadataFetcherSuccess
This would make the test organization clearer and more consistent.
| fixture_path = os.path.join("tests", "fixtures", "openaire_v2.json") | ||
| monkeypatch.setattr(session, "get", make_dummy_get_success(fixture_path)) | ||
|
|
||
| fetcher = MetadataFetcher(session=session) | ||
| actual = fetcher.get_metadata_from_openaire("10.5281/zenodo.4650794") | ||
| with open( | ||
| os.path.join("tests", "fixtures", "openaire_v2.json"), "r" | ||
| ) as f: | ||
|
|
||
| with open(fixture_path, "r") as f: | ||
| expected = load(f) |
There was a problem hiding this comment.
The fixture file is being read twice: once when creating the mock function with make_dummy_get_success(fixture_path) (line 74), and again when loading the expected data (lines 79-80). This is inefficient and could be simplified.
Consider refactoring the test to reuse fixture_data from the factory or restructure the test to read the file only once. For example, you could read the fixture once, use it to create the mock, and also use it as the expected value without re-reading the file.
| with open(fixture_path, "r") as f: | ||
| fixture_data = load(f) | ||
|
|
||
| def dummy_get_success(url, headers=None): |
There was a problem hiding this comment.
The parameter signature dummy_get_success(url, headers=None) is inconsistent with dummy_get_403(url, headers) which requires the headers parameter. While this works because headers=None makes it optional, for consistency with the existing mock function pattern (dummy_get_403 at line 29), consider making the signature match: dummy_get_success(url, headers).
This ensures both mock functions have the same signature, making them more interchangeable and reducing potential confusion.
| def dummy_get_success(url, headers=None): | |
| def dummy_get_success(url, headers): |
|
@ssheikholeslami - the test is failing, due to the expired Github Action REFRESH_TOKEN environment variable. I've just updated it - so lets see if the CI passes... |
There was a problem hiding this comment.
I'm not sure your proposed solutions solves the problem, which is more of a design issue! In short, the tests rely on a valid REFRESH_TOKEN being available.
If no REFRESH_TOKEN is defined in the environment where you run the tests, then one test (which actually queries the OpenAire API) fails with the following error:
hatch test
====================================================== test session starts =======================================================
platform linux -- Python 3.12.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/will/repositories/research_index_backend
configfile: pyproject.toml
plugins: rerunfailures-14.0, xdist-3.8.0, mock-3.15.1
collected 33 items
tests/test_dois.py ......... [ 27%]
tests/test_metadata.py .F.......... [ 63%]
tests/test_parser.py ......... [ 90%]
tests/test_utilities.py ... [100%]
============================================================ FAILURES ============================================================
____________________________________________ TestMetadataFetcher403.test_openaire_v2 _____________________________________________
self = <research_index_backend.config.Config object at 0x701eedba0860>
def _get_personal_token(self) -> str:
"""Get personal token by providing a refresh token"""
if refresh_token := os.getenv("REFRESH_TOKEN"):
logger.info("Found refresh token. Obtaining personal token.")
query = f"?refreshToken={refresh_token}"
try:
response = requests.get(self.openaire_token_endpoint + query)
logger.info(f"Status code: {response.status_code}")
> response.raise_for_status()
src/research_index_backend/config.py:76:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Response [400]>
def raise_for_status(self):
"""Raises :class:`HTTPError`, if one occurred."""
http_error_msg = ""
if isinstance(self.reason, bytes):
# We attempt to decode utf-8 first because some servers
# choose to localize their reason strings. If the string
# isn't utf-8, we fall back to iso-8859-1 for all other
# encodings. (See PR #3538)
try:
reason = self.reason.decode("utf-8")
except UnicodeDecodeError:
reason = self.reason.decode("iso-8859-1")
else:
reason = self.reason
if 400 <= self.status_code < 500:
http_error_msg = (
f"{self.status_code} Client Error: {reason} for url: {self.url}"
)
elif 500 <= self.status_code < 600:
http_error_msg = (
f"{self.status_code} Server Error: {reason} for url: {self.url}"
)
if http_error_msg:
> raise HTTPError(http_error_msg, response=self)
E requests.exceptions.HTTPError: 400 Client Error: for url: https://services.openaire.eu/uoa-user-management/api/users/getAccessToken?refreshToken=dljadsfkljadsf
../../.local/share/hatch/env/virtual/research-index-backend/4xkJ2Z4n/hatch-test.py3.12/lib/python3.12/site-packages/requests/models.py:1026: HTTPError
During handling of the above exception, another exception occurred:
self = <test_metadata.TestMetadataFetcher403 object at 0x701eed220c20>
session = <CachedSession(cache=<SQLiteCache(name=http_cache)>, settings=CacheSettings(expire_after=-1))>
monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x701eed2bc9e0>
def test_openaire_v2(self, session, monkeypatch):
fixture_path = os.path.join("tests", "fixtures", "openaire_v2.json")
monkeypatch.setattr(session, "get", make_dummy_get_success(fixture_path))
fetcher = MetadataFetcher(session=session)
> actual = fetcher.get_metadata_from_openaire("10.5281/zenodo.4650794")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_metadata.py:77:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/research_index_backend/get_metadata.py:43: in get_metadata_from_openaire
headers = {"Authorization": f"Bearer {config.token}"}
^^^^^^^^^^^^
src/research_index_backend/config.py:55: in token
self._token = self._get_personal_token()
^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <research_index_backend.config.Config object at 0x701eedba0860>
def _get_personal_token(self) -> str:
"""Get personal token by providing a refresh token"""
if refresh_token := os.getenv("REFRESH_TOKEN"):
logger.info("Found refresh token. Obtaining personal token.")
query = f"?refreshToken={refresh_token}"
try:
response = requests.get(self.openaire_token_endpoint + query)
logger.info(f"Status code: {response.status_code}")
response.raise_for_status()
except requests.exceptions.HTTPError:
if 400 <= response.status_code < 500:
> raise ValueError(
"OpenAire refresh token is invalid or expired. Please update token and try again."
E ValueError: OpenAire refresh token is invalid or expired. Please update token and try again.
src/research_index_backend/config.py:79: ValueError
==================================================== short test summary info =====================================================
FAILED tests/test_metadata.py::TestMetadataFetcher403::test_openaire_v2 - ValueError: OpenAire refresh token is invalid or expired. Please update token and try again.
================================================== 1 failed, 32 passed in 8.36s ==================================================I think the reason is because the call on line 77 of tests/test_metadata.py to get_metadata.MetaDataFetcher imports the config module. The config module checks for the availability of the REFRESH_TOKEN and if it is not available, raises an error.
I'm not sure how to get around this - a redesign is probably in order. Using a valid REFRESH_TOKEN is the short-cut!
|
Yeah I think that's a separate issue and we can later redesign and refactor the tests, but the current test is failing even with a valid REFRESH_TOKEN, due to an AssertionError: I think it's because of some changes to the API response since the time you created the fixture |
willu47
left a comment
There was a problem hiding this comment.
Initially, I was unable to recreate the issue, which is why I got confused and went down the API key rabbit hole...
However, I realised that the problem was the use of a CachedSession, which caches API calls. I still had my old cache, so the fixture matched the cached version of the response. When I removed the cache, then this issue popped up.
I'll make new issues for both the API key and the CachedSession...
Otherwise, this is ready for merging.
|
Thanks! once I get more comfortable with the codebase I can start by refactoring and redesigning the tests. |
Summary
test_openaire_v2to prevent test failures due to changes in the response format of OpenAIRE API over timeProblem
The
test_openaire_v2test was calling the live OpenAIRE API and comparing the response to a static fixture file from a few months ago. This made the test fail since the it expected an exact match against the outdated fixture.Solution
make_dummy_get_success()factory function that creates a mock HTTP response from fixture datatest_openaire_v2to usemonkeypatchto mocksession.get(), following the same pattern as the existingtest_api_403_responseThe test now validates that
MetadataFetcher.get_metadata_from_openaire()correctly handles and returns API responses, without depending on external services.Test plan
hatch test- all tests passtest_openaire_v2passes with mocked response