Bugfix/18 fix kartverket api error on cloud #38
Pull Request Overview
This pull request refactors county polygon retrieval to add blob storage caching, improving performance by reducing redundant API calls to the Kartverket service. The main changes include renaming the retrieval method for clarity, implementing a caching layer using Parquet files in blob storage, and updating dependency injection to support the new architecture.
- Renamed `get_county_wkb_by_id` to `get_county_polygons_by_id` for better clarity
- Implemented blob storage caching with Parquet files to store county polygon data
- Updated dependency injection to provide blob storage and bytes services to CountyService
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/infra/infrastructure/services/county_service.py | Added caching logic with helper methods to read from and write county polygons to blob storage |
| src/infra/infrastructure/containers.py | Updated CountyService dependency injection to include blob_storage_service and bytes_service |
| src/config.py | Added COUNTY_FILE_NAME constant and type annotation for RELEASE_FILE_NAME |
| src/application/contracts/county_service_interface.py | Renamed method from get_county_wkb_by_id to get_county_polygons_by_id |
| src/presentation/entrypoints/release_pipeline.py | Updated method call to use renamed get_county_polygons_by_id |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
```python
def get_county_polygons_by_id(self, county_id: str, epsg_code: EPSGCode) -> tuple[bytes, dict[str, Any]]:
    geometries = self.__get_county_polygons_from_blob_storage(region=county_id)

    if geometries is not None:
        return geometries

    wkb, geo_json = self.__fetch_county_polygons_from_api(region=county_id, epsg_code=epsg_code)
    self.__write_county_to_blob_storage(region=county_id, wkb=wkb, geo_json=geo_json)

    return wkb, geo_json
```
Copilot AI commented on Nov 17, 2025:
The caching implementation doesn't account for the epsg_code parameter. When cached data is returned from blob storage (line 40-41), it may be in a different coordinate system than requested. The cache should either:
- Store the EPSG code along with the polygon data and check it matches before returning cached data, or
- Document that only WGS84 is supported and validate the parameter
This could lead to incorrect spatial operations if the method is ever called with a different EPSG code.
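The first option the reviewer raises can be sketched as follows. This is an illustrative standalone function, not the project's actual code: the `epsg` column and the `read_cached_polygons` helper are assumptions about how an EPSG-aware cache schema might look.

```python
import json

import pandas as pd


def read_cached_polygons(county_df: pd.DataFrame, region: str, epsg_code: str):
    """Return (wkb, geo_json) only when a cached row matches both region AND EPSG code.

    Assumes the cache frame carries an extra 'epsg' column alongside
    'region', 'wkb', and 'json' -- a hypothetical schema, not the PR's.
    """
    match = county_df[(county_df["region"] == region) & (county_df["epsg"] == epsg_code)]
    if match.empty:
        return None  # cache miss (or EPSG mismatch) -> caller falls back to the API
    row = match.iloc[0]
    return row["wkb"], json.loads(row["json"])


# A cache populated for EPSG:4326 does not satisfy an EPSG:25833 request.
df = pd.DataFrame({
    "region": ["03"],
    "epsg": ["EPSG:4326"],
    "wkb": [b"\x01\x02"],
    "json": [json.dumps({"type": "FeatureCollection"})],
})
print(read_cached_polygons(df, "03", "EPSG:25833"))  # None -> refetch from API
print(read_cached_polygons(df, "03", "EPSG:4326"))
```

Storing the EPSG code per row keeps the cache correct if the method is ever called with a different coordinate system, at the cost of one extra column in the Parquet file.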
```python
else pd.DataFrame(columns=["region", "wkb", "json"])

json_string = json.dumps(geo_json)
if region in county_df["region"].values:
    mask = county_df["region"] == region
    county_df.loc[mask, "wkb"] = wkb
    county_df.loc[mask, "json"] = json_string
else:
    new_row = pd.DataFrame({"region": [region], "wkb": [wkb], "json": [json_string]})
```
Copilot AI commented on Nov 17, 2025:
Assigning bytes directly to DataFrame cells using .loc may not work correctly with all pandas versions. The bytes object might be stored as a reference rather than properly serialized. Consider using .at for scalar assignment or explicitly setting the dtype to object when creating the DataFrame to ensure bytes are stored correctly.
Suggested change:

```diff
-else pd.DataFrame(columns=["region", "wkb", "json"])
+else pd.DataFrame(columns=["region", "wkb", "json"]).astype({"wkb": object})
+
+# Ensure 'wkb' column is dtype object for bytes storage
+if "wkb" in county_df.columns:
+    county_df["wkb"] = county_df["wkb"].astype(object)
 json_string = json.dumps(geo_json)
 if region in county_df["region"].values:
     mask = county_df["region"] == region
     county_df.loc[mask, "wkb"] = wkb
     county_df.loc[mask, "json"] = json_string
 else:
-    new_row = pd.DataFrame({"region": [region], "wkb": [wkb], "json": [json_string]})
+    new_row = pd.DataFrame({"region": [region], "wkb": [wkb], "json": [json_string]}).astype({"wkb": object})
```
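The reviewer's point can be demonstrated in isolation. This is a minimal sketch, not the project's code: it forces the `wkb` column to object dtype and uses `.at` for scalar assignment so the bytes value is stored intact rather than coerced.

```python
import pandas as pd

# Empty cache frame; 'wkb' is forced to object dtype so raw bytes are stored as-is.
county_df = pd.DataFrame(columns=["region", "wkb", "json"]).astype({"wkb": object})

# Appending a row: build it with the same object dtype before concatenating.
new_row = pd.DataFrame({"region": ["03"], "wkb": [b"\x01"], "json": ["{}"]}).astype({"wkb": object})
county_df = pd.concat([county_df, new_row], ignore_index=True)

# Updating a single cell: .at does scalar assignment without broadcasting.
county_df.at[0, "wkb"] = b"\x01\x02\x03"

assert isinstance(county_df.at[0, "wkb"], bytes)
assert county_df["wkb"].dtype == object
```

With an object-dtype column, a Parquet writer such as pyarrow maps the bytes values to a binary column, so they round-trip through the blob-stored file without being mangled.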
```python
county_df = self.__bytes_service.convert_parquet_bytes_to_df(county_bytes)
if region not in county_df["region"].values:
    return None

row = county_df.loc[county_df["region"] == region].iloc[0]
return row["wkb"], json.loads(row["json"])
```
Copilot AI commented on Nov 17, 2025:
This method lacks error handling for potential failures when parsing the parquet file (line 76) or deserializing JSON (line 81). If the cached file is corrupted or in an unexpected format, the method will raise an unhandled exception instead of falling back to the API. Consider wrapping these operations in a try-except block and returning None on error to allow fallback to the API fetch.
Suggested change:

```diff
-county_df = self.__bytes_service.convert_parquet_bytes_to_df(county_bytes)
-if region not in county_df["region"].values:
-    return None
-row = county_df.loc[county_df["region"] == region].iloc[0]
-return row["wkb"], json.loads(row["json"])
+try:
+    county_df = self.__bytes_service.convert_parquet_bytes_to_df(county_bytes)
+    if region not in county_df["region"].values:
+        return None
+    row = county_df.loc[county_df["region"] == region].iloc[0]
+    return row["wkb"], json.loads(row["json"])
+except Exception:
+    return None
```
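The fallback behaviour the reviewer asks for can be shown with a self-contained sketch. The `parse_parquet` parameter stands in for the project's `convert_parquet_bytes_to_df` dependency; the function name and signature here are illustrative, not the real API.

```python
import json


def read_county_from_cache(county_bytes: bytes, region: str, parse_parquet):
    """Defensive cache read: any parse or decode failure counts as a cache miss.

    `parse_parquet` is a stand-in for a bytes-to-DataFrame service; returning
    None lets the caller fall back to fetching from the API.
    """
    try:
        county_df = parse_parquet(county_bytes)
        if region not in county_df["region"].values:
            return None
        row = county_df.loc[county_df["region"] == region].iloc[0]
        return row["wkb"], json.loads(row["json"])
    except Exception:
        return None  # corrupted blob or bad JSON -> fall back to the API


def broken_parser(_: bytes):
    # Simulates a corrupted cached file.
    raise ValueError("not a parquet file")


print(read_county_from_cache(b"garbage", "03", broken_parser))  # None
```

Swallowing the exception is acceptable here precisely because a fresh API fetch follows every cache miss; logging the error before returning None would preserve observability.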
This pull request refactors the county polygon retrieval logic to add blob storage caching and improve maintainability. The main changes include renaming the county polygon retrieval method, introducing caching of county polygons in blob storage, and updating dependency injection to support the new services.
County polygon retrieval and caching improvements:

- `get_county_wkb_by_id` in both the interface and implementation has been renamed to `get_county_polygons_by_id` for clarity and consistency. [1] [2] [3]
- County polygon data is now cached in blob storage as a Parquet file (`counties.parquet`). If not present, it fetches from the API, stores the result in blob storage, and returns the data.

Dependency injection and configuration updates:

- The DI container (`Containers`) has been updated to provide the required blob storage and bytes services to `CountyService`. [1] [2]
- A new configuration constant `COUNTY_FILE_NAME` was added for the counties Parquet file.

These changes improve performance by reducing redundant API calls and make the county polygon retrieval logic more robust and maintainable.
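The read-through caching flow described above can be sketched end to end. The class and parameter names below (`CountyCache`, `fetch_from_api`, the dict standing in for blob storage) are illustrative stand-ins, not the project's actual services.

```python
class CountyCache:
    """Read-through cache sketch: check storage first, fetch and write back on miss."""

    def __init__(self, fetch_from_api, blob: dict):
        self._fetch = fetch_from_api
        self._blob = blob  # simple dict standing in for blob storage

    def get_county_polygons_by_id(self, county_id: str):
        if county_id in self._blob:           # 1. try the cache first
            return self._blob[county_id]
        result = self._fetch(county_id)       # 2. miss -> call the Kartverket API
        self._blob[county_id] = result        # 3. write back for next time
        return result


# Count API calls to show the second lookup is served from the cache.
calls = []

def fake_api(county_id):
    calls.append(county_id)
    return (b"wkb-bytes", {"type": "FeatureCollection"})

cache = CountyCache(fake_api, {})
cache.get_county_polygons_by_id("03")
cache.get_county_polygons_by_id("03")
print(len(calls))  # 1 -> the second call never hit the API
```

This is the essence of the PR: repeated pipeline runs for the same county pay the API cost once, then read from blob storage.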