Skip to content

Feat/merge (resources)#16

Draft
AlexAxthelm wants to merge 7 commits intomainfrom
feat/merge
Draft

Feat/merge (resources)#16
AlexAxthelm wants to merge 7 commits intomainfrom
feat/merge

Conversation

@AlexAxthelm
Copy link
Collaborator

Adds a /resources/merge endpoint to API.

Note: This has not added any automated tests, and exists primarily as a springboard for adding an entity-linkage service.

@AlexAxthelm
Copy link
Collaborator Author

@mbarlow12 I'm assuming you'll want to make a bunch of changes on this. Feel free to take over the branch, but I'd prefer if we keep the signature on the new POST (i.e. takes an array of resource IDs, returns a Resource object).

@AlexAxthelm AlexAxthelm changed the title Feat/merge Feat/merge (resources) Feb 25, 2026
@AlexAxthelm AlexAxthelm marked this pull request as ready for review February 25, 2026 12:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds a new /resources/merge endpoint to the API that implements stub functionality for merging multiple resources. The endpoint accepts a list of resource IDs and repoints resources[1:] to resources[0] as the canonical target. This is explicitly noted as a springboard for future entity-linkage service development and intentionally lacks automated tests.

Changes:

  • Added POST /resources/merge endpoint with MergeRequest model for accepting resource IDs
  • Implemented merge_resources function in resource_actions that updates repointed_id field on resources
  • Added seed data (2 resources, 2 sources, 2 memberships) to support merge testing scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

File Description
deployments/api/src/stitch/api/routers/resources.py Adds merge endpoint with validation, logging, and error handling wrapper around resource_actions.merge_resources
deployments/api/src/stitch/api/db/resource_actions.py Implements merge_resources function that validates resource existence and updates repointed_id fields
deployments/api/src/stitch/api/db/init_job.py Adds test/seed data for merge scenarios: "Merge Target" and "Merge Consumed" resources with associated sources

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +160 to +162
session: AsyncSession,
user: CurrentUser,
ids: Sequence[int],
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user parameter is accepted but never used in the merge_resources function. If this is intended for future authorization checks or audit logging (e.g., updating last_updated_by_id on modified resources), consider adding a TODO comment. Otherwise, the parameter can be removed to reduce confusion.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbarlow12 my local IDE kept warning me about this too (User isn't consumed in the resource_actions). I'm guessing that when we start actually using permissions to filter the reesources, that's when we'll start using it?

Comment on lines +171 to +178
if not ids:
raise HTTPException(status_code=400, detail="No resource IDs provided.")
# preserve order but drop duplicates
unique_ids = list(dict.fromkeys(ids))
if len(unique_ids) < 2:
raise HTTPException(
status_code=400, detail="Provide at least 2 unique resource IDs."
)
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation logic in lines 171-178 duplicates the validation already performed in the router endpoint (lines 58-61 of resources.py). Since the endpoint already validates that there are at least 2 unique resource IDs before calling this function, this redundant validation can be removed. The check for empty IDs list at line 171-172 is also unnecessary since the endpoint guarantees the list is non-empty after deduplication.

Suggested change
if not ids:
raise HTTPException(status_code=400, detail="No resource IDs provided.")
# preserve order but drop duplicates
unique_ids = list(dict.fromkeys(ids))
if len(unique_ids) < 2:
raise HTTPException(
status_code=400, detail="Provide at least 2 unique resource IDs."
)
# preserve order but drop duplicates
unique_ids = list(dict.fromkeys(ids))

Copilot uses AI. Check for mistakes.
except HTTPException:
raise
except Exception as exc:
logger.exception("Error while merging resources %s: %s", unique_ids, exc)
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger.exception call already includes the full exception traceback and the exception message automatically. Including the exception in the log message string as well (with %s, exc) results in duplicate exception information in the logs. Consider removing exc from the message format and just using: logger.exception("Error while merging resources %s", unique_ids)

Suggested change
logger.exception("Error while merging resources %s: %s", unique_ids, exc)
logger.exception("Error while merging resources %s", unique_ids)

Copilot uses AI. Check for mistakes.

logger.info(
"Merge requested by user=%s for resource_ids=%s",
getattr(user, "sub", "<anon>"),
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of getattr(user, "sub", "") is unnecessary. According to the User entity definition (entities.py line 167), the 'sub' field is a required string field (not optional). The user parameter is typed as CurrentUser which is a User entity with sub always present. This can be simplified to just user.sub.

Suggested change
getattr(user, "sub", "<anon>"),
user.sub,

Copilot uses AI. Check for mistakes.
Comment on lines +192 to +199
for rid in other_ids:
model = await session.get(ResourceModel, rid)
if model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{rid}` found.",
)
model.repointed_id = target_id
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge_resources function modifies multiple resource records (lines 192-199) but doesn't wrap them in a database transaction beyond the implicit session transaction. If one of the resource IDs doesn't exist, the function will raise an HTTPException after potentially having already modified some resources. Consider checking all resource existence upfront before making any modifications, or ensure the transaction semantics are clear. The current implementation fetches and validates each resource sequentially, which could lead to partial modifications if a later resource is not found.

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +79
@router.post("/merge", response_model=Resource)
async def merge_resources_endpoint(
*, uow: UnitOfWorkDep, user: CurrentUser, payload: MergeRequest
) -> Resource:
"""
Merge multiple resources into one (STUB):
repoint resource_ids[1:] -> resource_ids[0]
"""
ids = payload.resource_ids
# preserve order but drop duplicates
unique_ids = list(dict.fromkeys(ids))
if len(unique_ids) < 2:
raise HTTPException(
status_code=400, detail="Provide at least 2 unique resource IDs"
)

logger.info(
"Merge requested by user=%s for resource_ids=%s",
getattr(user, "sub", "<anon>"),
unique_ids,
)

try:
return await resource_actions.merge_resources(
session=uow.session,
user=user,
ids=unique_ids,
)
except HTTPException:
raise
except Exception as exc:
logger.exception("Error while merging resources %s: %s", unique_ids, exc)
raise HTTPException(status_code=500, detail="Internal error during merge")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new merge endpoint lacks test coverage. The repository has comprehensive test infrastructure for resources endpoints (both unit and integration tests in tests/routers/), and all other endpoints have corresponding tests. Consider adding at least:

  1. Unit tests covering the validation logic (empty list, single ID, duplicate IDs)
  2. Unit tests for error handling (non-existent resource IDs, HTTPException propagation)
  3. Integration tests verifying the actual merge behavior (repointing, target resource returned)

Copilot uses AI. Check for mistakes.
Comment on lines +183 to +198
# Ensure target exists
target_model = await session.get(ResourceModel, target_id)
if target_model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{target_id}` found.",
)

# Ensure all others exist, then repoint them
for rid in other_ids:
model = await session.get(ResourceModel, rid)
if model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{rid}` found.",
)
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merge logic doesn't check if any of the resources being merged (ids[1:]) already have a repointed_id set. If a resource is already repointed to another resource, this creates a chain of repointing rather than updating the entire chain. Consider either:

  1. Preventing merges of already-repointed resources (validate that all resources in ids[1:] have repointed_id=None)
  2. Resolving the full repointing chain before merging
  3. Updating all resources in the chain to point to the new target

Additionally, there's no check to prevent a resource from being merged with itself (if the same ID appears in different positions after deduplication at the router level).

Suggested change
# Ensure target exists
target_model = await session.get(ResourceModel, target_id)
if target_model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{target_id}` found.",
)
# Ensure all others exist, then repoint them
for rid in other_ids:
model = await session.get(ResourceModel, rid)
if model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{rid}` found.",
)
# Ensure target exists and is not already repointed
target_model = await session.get(ResourceModel, target_id)
if target_model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{target_id}` found.",
)
if getattr(target_model, "repointed_id", None) is not None:
raise HTTPException(
status_code=400,
detail=f"Cannot merge into Resource `{target_id}` because it has already been merged/repointed.",
)
# Ensure all others exist and are not already repointed, then repoint them
for rid in other_ids:
# Explicit guard against self-merge, even if IDs were de-duplicated earlier
if rid == target_id:
raise HTTPException(
status_code=400,
detail="Cannot merge a resource into itself.",
)
model = await session.get(ResourceModel, rid)
if model is None:
raise HTTPException(
status_code=HTTP_404_NOT_FOUND,
detail=f"No Resource with id `{rid}` found.",
)
if getattr(model, "repointed_id", None) is not None:
raise HTTPException(
status_code=400,
detail=f"Resource with id `{rid}` has already been merged/repointed.",
)

Copilot uses AI. Check for mistakes.
Comment on lines +43 to +44
class MergeRequest(BaseModel):
resource_ids: list[int]
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MergeRequest model lacks validation constraints. Consider adding:

  1. A minimum length validator to ensure at least 2 IDs are provided (currently only validated in the endpoint logic)
  2. Documentation about the merge behavior (that ids[0] becomes the target)
    This would make the API contract clearer and move validation into the request model where it belongs.

Copilot uses AI. Check for mistakes.
@AlexAxthelm
Copy link
Collaborator Author

@mbarlow12 Copilot brought up a bunch of issues with my implementation (which all seem fair), but I'm probably not the best person to figure out what the "right" answer is. You mentioned in meeting yesterday wanting to use what's already available in stitch-core.

I'm fine if you want to either take over this PR, or close it. From a design perspective, I'd like to have a merge endpoint at /resources/merge, that takes the array payload that we have now, but everything else is totally open from my side.

@AlexAxthelm AlexAxthelm marked this pull request as draft March 2, 2026 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants