Skip to content

Feat/entity linkage (STIT-342)#17

Draft
AlexAxthelm wants to merge 11 commits intomainfrom
feat/entity-linkage
Draft

Feat/entity linkage (STIT-342)#17
AlexAxthelm wants to merge 11 commits intomainfrom
feat/entity-linkage

Conversation

@AlexAxthelm
Copy link
Collaborator

No description provided.

@AlexAxthelm AlexAxthelm requested review from Copilot and mbarlow12 and removed request for mbarlow12 February 25, 2026 12:22
@AlexAxthelm AlexAxthelm marked this pull request as ready for review February 25, 2026 12:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an “entity-linkage” deployment container that periodically scans resources via the Stitch API and calls a new stub merge endpoint to repoint duplicate resources.

Changes:

  • Add entity-linkage docker service + config/env defaults.
  • Introduce POST /resources/merge and a stub merge_resources() DB action that repoints resources.
  • Extend dev seed data with duplicate “Merge Demo” resources and supporting source records for demonstration.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
env.example Adds API_URL and entity-linkage tuning variables.
docker-compose.yml Adds the entity-linkage service definition.
deployments/entity-linkage/entity_linkage.py Implements the polling client that GETs resources and POSTs merge requests.
deployments/entity-linkage/README.md Documents the entity-linkage container and env vars.
deployments/entity-linkage/Dockerfile Builds a minimal Python runtime image for entity-linkage.
deployments/api/src/stitch/api/routers/resources.py Adds /resources/merge endpoint wiring to DB action.
deployments/api/src/stitch/api/db/resource_actions.py Adds stub merge logic to repoint resources.
deployments/api/src/stitch/api/db/init_job.py Adds seed records/resources to demonstrate merging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

## Configuration

- `API_URL` (required)
- Example: `http://api:8000`
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README’s API_URL example (http://api:8000) doesn’t match the actual API base path, which is mounted under /api/v1 in stitch.api.main. Using the README example will cause this service to call /resources/ and fail. Update the example to include /api/v1 (or clarify what API_URL should point at).

Suggested change
- Example: `http://api:8000`
- Example: `http://api:8000/api/v1`

Copilot uses AI. Check for mistakes.
Comment on lines +149 to +168
log.info("GET %s", get_url)
r_get = client.get(get_url)

log.info(
"GET response status=%s",
r_get.status_code,
)

if r_get.status_code >= 500:
raise RuntimeError(f"GET failed with status {r_get.status_code}")

try:
data = r_get.json()
except Exception:
log.error("GET did not return valid JSON. Body=%s", r_get.text[:1000])
raise

if not isinstance(data, list):
raise RuntimeError("Expected GET to return a JSON array")

Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do_get_then_post() treats any non-5xx GET response as success and then attempts to parse JSON. If the API returns 401/404/etc, this will raise during .json() (or later when checking isinstance(data, list)) and the loop will just keep erroring. Consider explicitly requiring a 2xx response (or using raise_for_status() and handling 4xx with a clear log/skip) before parsing JSON.

Copilot uses AI. Check for mistakes.
r_post.text[:1000],
)

if r_post.status_code >= 500:
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POST responses are only treated as failures for 5xx. For 4xx (e.g., validation errors), the code logs and continues to the next group, which can hide real problems and cause repeated retries/spam. Consider treating any non-2xx as a failure (or at least handling expected 409/400 cases explicitly).

Suggested change
if r_post.status_code >= 500:
if not (200 <= r_post.status_code < 300):

Copilot uses AI. Check for mistakes.
COPY deployments/entity-linkage/entity_linkage.py /app/entity_linkage.py

# Defaults (override via compose/env)
ENV API_URL="http://api:8000" \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image default API_URL is http://api:8000, but the FastAPI app is mounted under /api/v1 (see stitch.api.main where APIRouter(prefix="/api/v1") is used). With the current default, this container will GET /resources/ and fail. Consider changing the default to include /api/v1 (or making the script tolerant by probing both base paths).

Suggested change
ENV API_URL="http://api:8000" \
ENV API_URL="http://api:8000/api/v1" \

Copilot uses AI. Check for mistakes.
@AlexAxthelm AlexAxthelm self-assigned this Feb 25, 2026
@AlexAxthelm AlexAxthelm changed the title Feat/entity linkage Feat/entity linkage (STIT-342) Feb 25, 2026
@AlexAxthelm AlexAxthelm marked this pull request as draft March 2, 2026 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants