Feat/merge (resources) by AlexAxthelm · Pull Request #16 · RMI/stitch

AlexAxthelm · 2026-02-24T13:56:37Z

Adds a /resources/merge endpoint to API.

Note: This has not added any automated tests, and exists primarily as a springboard for adding an entity-linkage service.

AlexAxthelm · 2026-02-24T15:00:30Z

@mbarlow12 I'm assuming you'll want to make a bunch of changes on this. Feel free to take over the branch, but I'd prefer if we keep the signature on the new POST (i.e. takes an array of resource IDs, returns a Resource object).

Copilot

Pull request overview

This pull request adds a new /resources/merge endpoint to the API that implements stub functionality for merging multiple resources. The endpoint accepts a list of resource IDs and repoints resources[1:] to resources[0] as the canonical target. This is explicitly noted as a springboard for future entity-linkage service development and intentionally lacks automated tests.

Changes:

Added POST /resources/merge endpoint with MergeRequest model for accepting resource IDs
Implemented merge_resources function in resource_actions that updates repointed_id field on resources
Added seed data (2 resources, 2 sources, 2 memberships) to support merge testing scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

File	Description
deployments/api/src/stitch/api/routers/resources.py	Adds merge endpoint with validation, logging, and error handling wrapper around resource_actions.merge_resources
deployments/api/src/stitch/api/db/resource_actions.py	Implements merge_resources function that validates resource existence and updates repointed_id fields
deployments/api/src/stitch/api/db/init_job.py	Adds test/seed data for merge scenarios: "Merge Target" and "Merge Consumed" resources with associated sources

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

deployments/api/src/stitch/api/routers/resources.py

Copilot · 2026-02-25T12:28:42Z

deployments/api/src/stitch/api/db/resource_actions.py

+    session: AsyncSession,
+    user: CurrentUser,
+    ids: Sequence[int],


The user parameter is accepted but never used in the merge_resources function. If this is intended for future authorization checks or audit logging (e.g., updating last_updated_by_id on modified resources), consider adding a TODO comment. Otherwise, the parameter can be removed to reduce confusion.

@mbarlow12 my local IDE kept warning me about this too (User isn't consumed in the resource_actions). I'm guessing that when we start actually using permissions to filter the reesources, that's when we'll start using it?

Copilot · 2026-02-25T12:28:42Z

deployments/api/src/stitch/api/db/resource_actions.py

+    if not ids:
+        raise HTTPException(status_code=400, detail="No resource IDs provided.")
+    # preserve order but drop duplicates
+    unique_ids = list(dict.fromkeys(ids))
+    if len(unique_ids) < 2:
+        raise HTTPException(
+            status_code=400, detail="Provide at least 2 unique resource IDs."
+        )


The validation logic in lines 171-178 duplicates the validation already performed in the router endpoint (lines 58-61 of resources.py). Since the endpoint already validates that there are at least 2 unique resource IDs before calling this function, this redundant validation can be removed. The check for empty IDs list at line 171-172 is also unnecessary since the endpoint guarantees the list is non-empty after deduplication.

Suggested change

if not ids:

raise HTTPException(status_code=400, detail="No resource IDs provided.")

# preserve order but drop duplicates

unique_ids = list(dict.fromkeys(ids))

if len(unique_ids) < 2:

raise HTTPException(

status_code=400, detail="Provide at least 2 unique resource IDs."

)

# preserve order but drop duplicates

unique_ids = list(dict.fromkeys(ids))

Copilot · 2026-02-25T12:28:42Z

deployments/api/src/stitch/api/routers/resources.py

+    except HTTPException:
+        raise
+    except Exception as exc:
+        logger.exception("Error while merging resources %s: %s", unique_ids, exc)


The logger.exception call already includes the full exception traceback and the exception message automatically. Including the exception in the log message string as well (with %s, exc) results in duplicate exception information in the logs. Consider removing exc from the message format and just using: logger.exception("Error while merging resources %s", unique_ids)

Suggested change

logger.exception("Error while merging resources %s: %s", unique_ids, exc)

logger.exception("Error while merging resources %s", unique_ids)

Copilot · 2026-02-25T12:28:43Z

deployments/api/src/stitch/api/routers/resources.py

+
+    logger.info(
+        "Merge requested by user=%s for resource_ids=%s",
+        getattr(user, "sub", "<anon>"),


The use of getattr(user, "sub", "") is unnecessary. According to the User entity definition (entities.py line 167), the 'sub' field is a required string field (not optional). The user parameter is typed as CurrentUser which is a User entity with sub always present. This can be simplified to just user.sub.

Suggested change

getattr(user, "sub", "<anon>"),

user.sub,

deployments/api/src/stitch/api/db/init_job.py

Copilot · 2026-02-25T12:28:43Z

deployments/api/src/stitch/api/db/resource_actions.py

+    for rid in other_ids:
+        model = await session.get(ResourceModel, rid)
+        if model is None:
+            raise HTTPException(
+                status_code=HTTP_404_NOT_FOUND,
+                detail=f"No Resource with id `{rid}` found.",
+            )
+        model.repointed_id = target_id


The merge_resources function modifies multiple resource records (lines 192-199) but doesn't wrap them in a database transaction beyond the implicit session transaction. If one of the resource IDs doesn't exist, the function will raise an HTTPException after potentially having already modified some resources. Consider checking all resource existence upfront before making any modifications, or ensure the transaction semantics are clear. The current implementation fetches and validates each resource sequentially, which could lead to partial modifications if a later resource is not found.

Copilot · 2026-02-25T12:28:43Z

deployments/api/src/stitch/api/routers/resources.py

+@router.post("/merge", response_model=Resource)
+async def merge_resources_endpoint(
+    *, uow: UnitOfWorkDep, user: CurrentUser, payload: MergeRequest
+) -> Resource:
+    """
+    Merge multiple resources into one (STUB):
+    repoint resource_ids[1:] -> resource_ids[0]
+    """
+    ids = payload.resource_ids
+    # preserve order but drop duplicates
+    unique_ids = list(dict.fromkeys(ids))
+    if len(unique_ids) < 2:
+        raise HTTPException(
+            status_code=400, detail="Provide at least 2 unique resource IDs"
+        )
+
+    logger.info(
+        "Merge requested by user=%s for resource_ids=%s",
+        getattr(user, "sub", "<anon>"),
+        unique_ids,
+    )
+
+    try:
+        return await resource_actions.merge_resources(
+            session=uow.session,
+            user=user,
+            ids=unique_ids,
+        )
+    except HTTPException:
+        raise
+    except Exception as exc:
+        logger.exception("Error while merging resources %s: %s", unique_ids, exc)
+        raise HTTPException(status_code=500, detail="Internal error during merge")


The new merge endpoint lacks test coverage. The repository has comprehensive test infrastructure for resources endpoints (both unit and integration tests in tests/routers/), and all other endpoints have corresponding tests. Consider adding at least:

Unit tests covering the validation logic (empty list, single ID, duplicate IDs)

Unit tests for error handling (non-existent resource IDs, HTTPException propagation)

Integration tests verifying the actual merge behavior (repointing, target resource returned)

Copilot · 2026-02-25T12:28:44Z

deployments/api/src/stitch/api/db/resource_actions.py

+    # Ensure target exists
+    target_model = await session.get(ResourceModel, target_id)
+    if target_model is None:
+        raise HTTPException(
+            status_code=HTTP_404_NOT_FOUND,
+            detail=f"No Resource with id `{target_id}` found.",
+        )
+
+    # Ensure all others exist, then repoint them
+    for rid in other_ids:
+        model = await session.get(ResourceModel, rid)
+        if model is None:
+            raise HTTPException(
+                status_code=HTTP_404_NOT_FOUND,
+                detail=f"No Resource with id `{rid}` found.",
+            )


The merge logic doesn't check if any of the resources being merged (ids[1:]) already have a repointed_id set. If a resource is already repointed to another resource, this creates a chain of repointing rather than updating the entire chain. Consider either:

Preventing merges of already-repointed resources (validate that all resources in ids[1:] have repointed_id=None)

Resolving the full repointing chain before merging

Updating all resources in the chain to point to the new target

Additionally, there's no check to prevent a resource from being merged with itself (if the same ID appears in different positions after deduplication at the router level).

Suggested change

# Ensure target exists

target_model = await session.get(ResourceModel, target_id)

if target_model is None:

raise HTTPException(

status_code=HTTP_404_NOT_FOUND,

detail=f"No Resource with id `{target_id}` found.",

)

# Ensure all others exist, then repoint them

for rid in other_ids:

model = await session.get(ResourceModel, rid)

if model is None:

raise HTTPException(

status_code=HTTP_404_NOT_FOUND,

detail=f"No Resource with id `{rid}` found.",

)

# Ensure target exists and is not already repointed

target_model = await session.get(ResourceModel, target_id)

if target_model is None:

raise HTTPException(

status_code=HTTP_404_NOT_FOUND,

detail=f"No Resource with id `{target_id}` found.",

)

if getattr(target_model, "repointed_id", None) is not None:

raise HTTPException(

status_code=400,

detail=f"Cannot merge into Resource `{target_id}` because it has already been merged/repointed.",

)

# Ensure all others exist and are not already repointed, then repoint them

for rid in other_ids:

# Explicit guard against self-merge, even if IDs were de-duplicated earlier

if rid == target_id:

raise HTTPException(

status_code=400,

detail="Cannot merge a resource into itself.",

)

model = await session.get(ResourceModel, rid)

if model is None:

raise HTTPException(

status_code=HTTP_404_NOT_FOUND,

detail=f"No Resource with id `{rid}` found.",

)

if getattr(model, "repointed_id", None) is not None:

raise HTTPException(

status_code=400,

detail=f"Resource with id `{rid}` has already been merged/repointed.",

)

Copilot · 2026-02-25T12:28:44Z

deployments/api/src/stitch/api/routers/resources.py

+class MergeRequest(BaseModel):
+    resource_ids: list[int]


The MergeRequest model lacks validation constraints. Consider adding:

A minimum length validator to ensure at least 2 IDs are provided (currently only validated in the endpoint logic)

Documentation about the merge behavior (that ids[0] becomes the target)
This would make the API contract clearer and move validation into the request model where it belongs.

AlexAxthelm · 2026-02-25T12:52:28Z

@mbarlow12 Copilot brought up a bunch of issues with my implementation (which all seem fair), but I'm probably not the best person to figure out what the "right" answer is. You mentioned in meeting yesterday wanting to use what's already available in stitch-core.

I'm fine if you want to either take over this PR, or close it. From a design perspective, I'd like to have a merge endpoint at /resources/merge, that takes the array payload that we have now, but everything else is totally open from my side.

layer to handle

AlexAxthelm added 5 commits February 24, 2026 13:10

Create resources/merge POST endpoint

9e60ac6

include mergable data in seed data

86a0a9a

Add merge_resources resource action

bc49d49

style: formatting

686a379

linting: unused import

e881008

AlexAxthelm changed the title ~~Feat/merge~~ Feat/merge (resources) Feb 25, 2026

AlexAxthelm marked this pull request as ready for review February 25, 2026 12:22

AlexAxthelm requested review from Copilot and mbarlow12 February 25, 2026 12:22

Copilot started reviewing on behalf of AlexAxthelm February 25, 2026 12:23 View session

Copilot AI reviewed Feb 25, 2026

View reviewed changes

AlexAxthelm marked this pull request as draft March 2, 2026 11:55

mbarlow12 added 2 commits March 4, 2026 11:58

Merge branch 'main' into feat/merge

b17a6c3

feat: update merge behavior w/ memberhips, add db errors for api/router

7e22186

layer to handle

	logger.exception("Error while merging resources %s: %s", unique_ids, exc)
	logger.exception("Error while merging resources %s", unique_ids)

Conversation

AlexAxthelm commented Feb 24, 2026

Uh oh!

AlexAxthelm commented Feb 24, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

AlexAxthelm Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

AlexAxthelm commented Feb 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants