Skip to content

Add dependency on stitch models for API#29

Closed
AlexAxthelm wants to merge 5 commits intomainfrom
use-stitch-models
Closed

Add dependency on stitch models for API#29
AlexAxthelm wants to merge 5 commits intomainfrom
use-stitch-models

Conversation

@AlexAxthelm
Copy link
Collaborator

@AlexAxthelm AlexAxthelm commented Mar 3, 2026

Updates the data model used in the API deployment to be the workspace package stitch-models, rather than a pydantic model defined in the deployment.

Note that the User object is still defined in the api deployment.

ChatGPT used as explainer while making the migration, and wrote some code snippets.

@AlexAxthelm AlexAxthelm requested review from Copilot and mbarlow12 and removed request for Copilot March 4, 2026 19:57
@AlexAxthelm AlexAxthelm marked this pull request as ready for review March 4, 2026 20:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the API deployment toward using the workspace stitch-models package and simultaneously simplifies the API’s “resources” domain to a domain-agnostic shape (removing source-data handling across the API, DB actions, seeds, and tests).

Changes:

  • Add stitch-models as a workspace dependency for the API deployment.
  • Introduce domain-agnostic request/response models for the resources API and refactor router + DB actions accordingly.
  • Remove source-table modeling/seed logic and update API tests to match the simplified resource model.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
uv.lock Adds stitch-models to the workspace lock dependencies.
deployments/api/pyproject.toml Declares stitch-models as a workspace dependency for the API.
packages/stitch-models/src/stitch/models/__init__.py Exports EmptySourcePayload and a BaseResource type alias for domain-agnostic usage.
deployments/api/src/stitch/api/resources/entities.py Adds new domain-agnostic CreateResource/Resource API schemas.
deployments/api/src/stitch/api/routers/resources.py Switches the resources router to the new domain-agnostic schemas.
deployments/api/src/stitch/api/entities.py Removes the old resource/source-related entities, leaving User.
deployments/api/src/stitch/api/db/resource_actions.py Simplifies create/get/get_all to be domain-agnostic and removes source/membership creation.
deployments/api/src/stitch/api/db/model/sources.py Removes source-table ORM models and related helpers.
deployments/api/src/stitch/api/db/model/resource.py Adjusts membership typing and switches IdType import to stitch-models.
deployments/api/src/stitch/api/db/model/mixins.py Broadens payload typing to BaseModel and changes how source is set from payload.
deployments/api/src/stitch/api/db/model/__init__.py Stops exporting removed source models.
deployments/api/src/stitch/api/db/init_job.py Removes seed sources/memberships and updates dev seeding for simplified resources.
deployments/api/tests/utils.py Simplifies test factories to produce the new minimal CreateResource payload.
deployments/api/tests/routers/test_resources_unit.py Updates unit tests to use the new API schemas and simplified payload validation.
deployments/api/tests/routers/test_resources_integration.py Updates integration tests to match the new response/request shapes (no country/source assertions).
deployments/api/tests/db/test_resource_actions.py Rewrites DB integration tests for the simplified resource actions and adds a get_all test.
deployments/api/tests/conftest.py Removes fixtures tied to removed source models and their defaults.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 5 to 9
from stitch.api.db import resource_actions
from stitch.api.db.config import UnitOfWorkDep
from stitch.api.auth import CurrentUser
from stitch.api.entities import CreateResource, Resource
from stitch.api.resources.entities import CreateResource, Resource

Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says the API deployment model is migrated to the workspace package stitch-models, but the request/response schemas used by the resources router are still defined locally in stitch.api.resources.entities (only IdType was moved to stitch-models). Either update the PR description to reflect the actual scope, or move/re-export these API models from stitch-models to match the stated goal.

Copilot uses AI. Check for mistakes.
Comment on lines 34 to 38
async def get_all(session: AsyncSession) -> Sequence[Resource]:
stmt = (
select(ResourceModel)
.where(ResourceModel.repointed_id.is_(None))
.options(selectinload(ResourceModel.memberships))
)
stmt = select(ResourceModel).where(ResourceModel.repointed_id.is_(None))
models = (await session.scalars(stmt)).all()
fn = partial(resource_model_to_entity, session)
return await asyncio.gather(*[fn(m) for m in models])
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_all() uses asyncio.gather() to call resource_model_to_entity() concurrently while sharing the same AsyncSession. SQLAlchemy AsyncSession does not support concurrent IO, so this can raise runtime errors (and it also introduces an N+1 query pattern via get_constituents_by_root_id). Prefer iterating sequentially, or refactor to prefetch constituents in bulk without per-resource queries (or use separate sessions per task if you truly need concurrency).

Copilot uses AI. Check for mistakes.
Comment on lines +29 to 33
created=str(model.created) if getattr(model, "created", None) else None,
updated=str(model.updated) if getattr(model, "updated", None) else None,
)


Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resource_model_to_entity() passes created/updated into Resource(...), but the Resource model currently has no such fields. This is dead code today (extras are ignored) and will break if Resource is later configured with extra='forbid'. Either add these fields to the Resource schema (if they’re part of the API contract) or stop including them here.

Suggested change
created=str(model.created) if getattr(model, "created", None) else None,
updated=str(model.updated) if getattr(model, "updated", None) else None,
)
)

Copilot uses AI. Check for mistakes.
Comment on lines 45 to 57
def create(
cls,
created_by: UserEntity,
resource: "ResourceModel",
source: SourceKey,
source: str,
source_pk: IdType,
status: MembershipStatus = MembershipStatus.ACTIVE,
):
model = cls(
resource_id=resource.id,
source=source,
source_pk=str(source_pk),
source_pk=int(source_pk),
status=status,
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MembershipModel.create() accepts source_pk: IdType (which includes UUID via stitch.models.types.IdType) but then coerces it with int(source_pk) into a BIGINT column. int(UUID) yields a 128-bit integer that will overflow PostgreSQL BIGINT, causing failures at insert time. Either constrain source_pk to int here (and in callers), or change the storage type/serialization to safely handle UUIDs (e.g., store UUID as text).

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +56
# DB ResourceModel may store `.name`; tolerate either while refactor settles.
assert getattr(db_resource, "name", None) in (None, "Test Label")
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test assertion is currently tolerant of the resource name not being persisted (getattr(..., 'name', None) in (None, 'Test Label')), which can let real regressions slip through. Since resource_actions.create() sets name, the test should assert the expected persisted value (or assert the specific transitional behavior you actually intend), rather than accepting both outcomes.

Suggested change
# DB ResourceModel may store `.name`; tolerate either while refactor settles.
assert getattr(db_resource, "name", None) in (None, "Test Label")
# DB ResourceModel should persist the resource name set by create().
assert db_resource.name == "Test Label"

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +74


Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EmptySourcePayload is documented as “no sources”, but it inherits Pydantic’s default extra='ignore' behavior. That means callers can pass arbitrary keys (e.g., source data) and it will be silently dropped, which is surprising for an “empty payload”. Consider setting model_config = ConfigDict(extra='forbid') on EmptySourcePayload (or on SourcePayload) to enforce the invariant.

Suggested change
# Enforce that this "empty" payload cannot receive unexpected fields.
model_config: ClassVar[ConfigDict] = ConfigDict(from_attributes=True, extra="forbid")

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +5
from __future__ import annotations

from pydantic import BaseModel, ConfigDict, Field


Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stitch.api.resources is introduced without an __init__.py, which makes it a namespace package. The rest of this repo’s Python package structure (e.g., stitch.api.db, stitch.api.routers) uses explicit packages with __init__.py, so this inconsistency can cause tooling/import edge cases. Consider adding deployments/api/src/stitch/api/resources/__init__.py for consistency.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@mbarlow12 mbarlow12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I entirely understand this PR. It looks like there's a lot of functioning code that's been removed, which isn't necessarily a requirement for integrating stitch-ogsi and/or stitch-models (though I'm pretty sure we won't need a direct dependency on stitch-models).

The basic shapes/pydantic models are all almost identical and can be swapped in with minimal additional changes.

I have some other misc comments, but the big one is really about removing working functionality.

Comment on lines -46 to +40
source: Mapped[SourceKey] = mapped_column(
String(10), nullable=False
) # "gem" | "wm"
source: Mapped[str] = mapped_column(String(10), nullable=False) # "gem" | "wm"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does loosen the restrictions somewhat on the valid values, allowing any string under 10 characters.

If an invalid value was somehow saved to the db, we'd get an error when trying to call model_validate when fetching it from the db.

created_by: UserEntity,
resource: "ResourceModel",
source: SourceKey,
source: str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a place that helps getting editor support and early errors. With the SrcKey (now OGSISrcKey from stitch-ogsi), you'll get an error diagnostic if you try to call Resource.create(source="some_source", ...).

Comment on lines +71 to +77
class EmptySourcePayload(SourcePayload):
"""Domain-agnostic source payload container (no sources)."""


# A concrete, domain-agnostic Resource you can use everywhere.
# Domains can replace `EmptySourcePayload` with their own payload type.
type BaseResource[TResId: IdType] = Resource[TResId, EmptySourcePayload]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant as it's already captured within the Resource class. This also means that inheritors of BaseResource must subclass EmptySourcePayload specifically. I think it muddies the package API about whether/when to use SourcePayload or EmtpySourcePayload.

Comment on lines +6 to +19
class CreateResource(BaseModel):
"""Domain-agnostic create model."""

model_config = ConfigDict(extra="forbid")
name: str | None = None


class Resource(BaseModel):
"""Domain-agnostic read model."""

id: int
name: str | None = None
repointed_to: int | None = None
constituents: frozenset[int] = Field(default_factory=frozenset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these meant to be temporary placeholders?

Comment on lines +9 to +10
model_config = ConfigDict(extra="forbid")
name: str | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really tight constraint and will reject anything that's not {"name": ...}.

When I originally designed the resource entities, I allowed for a lot of flexibility in the shape of the create request:

  • resource with no source data
  • resource with new source data
  • resource with mix of new and existing (via ids) source data

SourceModelData,
)
from stitch.api.entities import IdType, User as UserEntity
from stitch.models.types import IdType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most (if not all) of the imports will actually end up coming from stitch-ogsi.

Comment on lines -45 to -76
class SourceBase(Base, Generic[TModelIn, TModelOut]):
__abstract__ = True
__entity_class_in__: type[TModelIn]
__entity_class_out__: type[TModelOut]

id: Mapped[int] = mapped_column(
PORTABLE_BIGINT, primary_key=True, autoincrement=True
)

def __init_subclass__(cls, **kwargs) -> None:
super().__init_subclass__(**kwargs)
for base in getattr(cls, "__orig_bases__", ()):
if get_origin(base) is SourceBase:
args = get_args(base)
if len(args) >= 2:
if isinstance(args[0], type):
cls.__entity_class_in__ = args[0]
if isinstance(args[1], type):
cls.__entity_class_out__ = args[1]
break

def as_entity(self):
return self.__entity_class_out__.model_validate(self)

@classmethod
def from_entity(cls, entity: TModelIn) -> Self:
mapper = inspect(cls)
column_keys = {col.key for col in mapper.columns}
filtered = {k: v for k, v in entity.model_dump().items() if k in column_keys}
return cls(**filtered)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly recommend against deleting this file. All these mechanics will work the same with the new models in stitch-ogsi.

Unless we're planning to have the db schema wildly different from the model schema, this will all still function as intended when we integrate stitch-ogsi.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, this gets potentially simplified, too. With id as optional now, we don't need 2 classes to distinguish between creational & stored/managed structures.

@AlexAxthelm AlexAxthelm mentioned this pull request Mar 4, 2026
@AlexAxthelm AlexAxthelm closed this Mar 5, 2026
@AlexAxthelm AlexAxthelm deleted the use-stitch-models branch March 5, 2026 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants