Add more retrieval routes for project data#61

Merged
RangerMauve merged 21 commits into main from feat/more-datatypes on Nov 5, 2025

@RangerMauve
Contributor

closes #51

Member

@gmaclennan gmaclennan left a comment

Is there a way we could map our existing schema directly to what is returned via the API? Currently we are allow-listing props it seems, could we block-list instead? What props do we not want to return? Could we use our existing JSON schema definitions for the validation? The current implementation would add some maintenance overhead, because any comapeo-schema change would need a change here to the schemas.js and also the datatype routes. I'd be happy to brainstorm some ideas about this. There are a few tools we can use to hook up JSON schemas + validation and mapping to data returned from routes.
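To make the block-list idea concrete, here is a minimal sketch, assuming a helper that strips named properties before a doc is returned. The helper name `omitBlocked` and the entries in `BLOCKED_PROPS` are placeholders for illustration only; the thread has not settled which props, if any, should be withheld.

```javascript
// Hypothetical block-list: these property names are placeholders, not an agreed list.
const BLOCKED_PROPS = new Set(['forks', 'links'])

/**
 * Return a shallow copy of `doc` without the block-listed properties.
 * Any property that is not block-listed passes through untouched, so new
 * schema fields show up in responses without a code change here.
 * @param {Record<string, unknown>} doc
 * @param {Set<string>} [blocked]
 */
function omitBlocked(doc, blocked = BLOCKED_PROPS) {
  return Object.fromEntries(
    Object.entries(doc).filter(([key]) => !blocked.has(key)),
  )
}

const example = omitBlocked({ docId: 'abc', schemaName: 'track', forks: [], links: [] })
console.log(example) // { docId: 'abc', schemaName: 'track' }
```

With an allow-list the same schema change would need a matching edit in the route code; with a block-list only newly sensitive fields need attention.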

@RangerMauve
Contributor Author

Fastify supports ajv, so maybe we could ditch typebox for it and somehow import the schemas in from there? https://fastify.dev/docs/latest/Reference/Validation-and-Serialization/

We could try to fastify.addSchema the schemas directly from the @comapeo/schema/schemas folder.

Translating the existing typebox schemas to JSON schema should be trivial with my local LLM setup.

I agree a blocklist of fields would scale better. I'm honestly not sure what should go there. My gut feeling is to just include the whole doc which is what I did for the new datatypes.

Member

@gmaclennan gmaclennan left a comment

By changing all the input schemas to plain JSONSchema (rather than typebox) we lose all the type safety for the inputs. This doesn't seem to bring any particular benefits, but losing type safety introduces maintenance issues because it's easy to mis-type params or query properties.

For the datatype getters, it seems like you are taking an allow-list approach for each doc type. What properties are you excluding from each data type and what is the reasoning for excluding them?

@RangerMauve
Contributor Author

> but losing type safety introduces maintenance issues because it's easy to mis-type params or query properties.

Should I instead keep the typebox types and automatically convert the json schema to typebox? Is there a trick I can do to make the TS types align with the types defined in comapeo-schema/schema/*? I'm guessing I should use VSCode's intellisense to check this.

> For the datatype getters, it seems like you are taking an allow-list approach for each doc type.

I just exclude props based on the existing remoteDetectionAlert tests, else it sends the full doc back right now. What props should we be redacting?

@RangerMauve
Contributor Author

Another thought I had was to run a command to generate the typebox types for the schemas: https://www.npmjs.com/package/schema2typebox?activeTab=readme

@gmaclennan
Member

I think it's fine for the response types to be just JSONSchema - we don't gain a lot from type-safety on the response types. It's the inputs coming from params, querystring, and any request body that are important to typecheck, e.g. as we had before.

For the data type routes, why not use the schema directly from @comapeo/schema as the return type? e.g.

import { dereferencedDocSchemas as docSchemas } from '@comapeo/schema'

//...

const SUPPORTED_DATA_TYPES =
  /** @satisfies {Array<keyof typeof docSchemas>} */ ([
    'observation',
    'track',
    'preset',
    'field',
  ])

for (const dataType of SUPPORTED_DATA_TYPES) {
  fastify.get(
    `/projects/:projectPublicId/${dataType}`,
    {
      schema: {
        params: Type.Object({
          projectPublicId: BASE32_STRING_32_BYTES,
        }),
        response: {
          200: {
            type: 'object',
            properties: {
              data: {
                type: 'array',
                items: docSchemas[dataType],
              },
            },
          },
          '4xx': schemas.errorResponse,
        },
      },
      async preHandler(req) {
        verifyBearerAuth(req)
        await ensureProjectExists(this, req)
      },
    },
    /**
     * @this {FastifyInstance}
     */
    async function (req) {
      const { projectPublicId } = req.params
      const project = await this.comapeo.getProject(projectPublicId)

      return {
        data: await project[dataType].getMany({ includeDeleted: true }),
      }
    },
  )
}

The reason I suggest this is that one of the concerns with adding additional datatypes is the maintenance overhead, so we want to reduce the maintenance needed if we change the schema type for some reason.

Also: we should probably group routes that require auth under a single route plugin with the verifyBearerAuth once, so we don't forget to add it if we add additional routes, but that can be a follow-up.
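A minimal sketch of that grouping, assuming Fastify's encapsulated plugins (`register`) with a `preHandler` hook added once inside the plugin. The function name `registerAuthenticatedRoutes` and the `routes` array shape are hypothetical; `verifyBearerAuth` is passed in rather than imported so the snippet stays self-contained.

```javascript
/**
 * Register every auth-protected route inside one encapsulated plugin so the
 * bearer check is declared once. `routes` is a hypothetical list of
 * `{ url, handler }` pairs used only for this sketch.
 */
function registerAuthenticatedRoutes(fastify, { verifyBearerAuth, routes }) {
  fastify.register(async function authenticatedRoutes(scope) {
    // Runs before every handler registered on `scope`, and only on `scope`.
    scope.addHook('preHandler', async (req) => {
      verifyBearerAuth(req)
    })

    for (const { url, handler } of routes) {
      scope.get(url, handler)
    }
  })
}
```

Because the hook lives on the plugin scope, a route added to the `routes` list later cannot accidentally skip the auth check, while public routes registered outside the plugin stay unaffected.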

Going forwards we could look at versioning this by a request header (or route prefix) so that we can introduce changes without breaking existing clients, but I think it's ok for now because if we're using @comapeo/schema directly we should only have backwards-compatible changes.

@RangerMauve
Contributor Author

One adjustment: instead of mapping over doctypes, I'm going to keep the addDatatypeGetter function so we can have an optional mapDoc for the use case where observations get their attachments mapped to have URLs (per existing functionality).
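A minimal sketch of the optional-mapper idea, assuming the mapping step is factored out of the route handler. `buildDatatypeResponse`, `withAttachmentUrls`, `req.baseUrl`, and the URL layout are all hypothetical names for illustration, not the code in this PR.

```javascript
/**
 * Wrap docs in the `{ data: [...] }` envelope, optionally transforming each one.
 * When `mapDoc` is omitted the docs are returned unchanged.
 * @template TDoc
 * @template [TOut=TDoc]
 * @param {TDoc[]} docs
 * @param {unknown} req
 * @param {(doc: TDoc, req: unknown) => TOut} [mapDoc]
 */
function buildDatatypeResponse(docs, req, mapDoc = (doc) => doc) {
  return { data: docs.map((doc) => mapDoc(doc, req)) }
}

/**
 * Hypothetical observation mapper: adds a `url` to each attachment.
 * The URL layout here is invented for the example.
 */
function withAttachmentUrls(doc, req) {
  return {
    ...doc,
    attachments: doc.attachments.map((attachment) => ({
      ...attachment,
      url: `${req.baseUrl}/attachments/${attachment.name}`,
    })),
  }
}
```

Tracks, presets, and fields would call `buildDatatypeResponse(docs, req)` with no mapper, while observations would pass `withAttachmentUrls` (or whatever the real mapper is).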

@RangerMauve
Contributor Author

Getting some nasty typescript errors 😅

[image: screenshot of the TypeScript errors]

@RangerMauve
Contributor Author

Initial API jsdoc:

  /**
   * @template InputType
   * @template OutputType=InputType
   * @param {"track"|"observation"|"preset"} dataType - DataType to pull from
   * @param {import('ajv').JSONSchemaType<OutputType>} responseSchema - Schema for the response data
   * @param {function(InputType, FastifyRequest): OutputType} mapDoc - Add / remove fields
   */
  function addDatatypeGetter(dataType, responseSchema, mapDoc) {

Trying to call it with the InputType parameter per the example Gregor provided:

  /** @type {typeof addDatatypeGetter<Track>} */
  addDatatypeGetter(
    'track',
    docSchemas.track,
    (doc /** @type {Track} */) => doc,
  )

Desired behavior:

  • addDatatypeGetter takes a generic InputType, and an OutputType that defaults to InputType
  • It also takes a schema for the OutputType (JSON schema from comapeo/schema)
  • The mapDoc function will be called for docs in the datatype, and either yields the doc
    unchanged or adds more fields that conform to the OutputType and the schema

Issues:

  • I think our schemas aren't valid for the AJV JSONSchemaType?
  • Types giving errors:

'InputType' could be instantiated with an arbitrary type which could be unrelated to '({ schemaName: "track";
Property 'nullable' is missing in type '{ readonly description: "Must be track";

@gmaclennan
Member

This is a tricky thing to type with this API, but this is close maybe? This gives an expected type error for 'literal' because it does not match the type defined in the generic 'foo'.

  const addTrackRoute = /** @type {typeof addDatatypeGetter<"foo">} */ (
    addDatatypeGetter
  )
  addTrackRoute('track', { type: 'string', const: 'foo' }, () => 'literal')

  /**
   * @template {MapeoDoc['schemaName']} TSchemaName
   * @typedef {Extract<MapeoDoc, { schemaName: TSchemaName }>} GetMapeoDoc
   */

  /**
   * @template TOutputType
   * @template {"track"|"observation"|"preset"} [TDataType]
   * @param {TDataType} dataType - DataType to pull from
   * @param {import('ajv').JSONSchemaType<TOutputType>} responseSchema - Schema for the response data
   * @param {(doc: GetMapeoDoc<TDataType>, req: FastifyRequest) => TOutputType} mapDoc - Add / remove fields
   */
  function addDatatypeGetter(dataType, responseSchema, mapDoc) {

However, this doesn't work with, for example, passing the Track MapeoDoc schema and the Track type because the inference of AJV's JSONSchemaType is less than perfect, and does not match our Track type - there are subtle differences. We use json-schema-to-typescript for static generation of types from our json schema.

This is why tools like Typebox exist. In this case you would define the JSON Schema with typebox, and then infer the output type, with no need to pass a generic, for example if the function is defined like this:

  /**
   * @template {import('@sinclair/typebox').TSchema} TSchema
   * @template {"track"|"observation"|"preset"} [TDataType]
   * @param {TDataType} dataType - DataType to pull from
   * @param {TSchema} responseSchema - Schema for the response data
   * @param {(doc: GetMapeoDoc<TDataType>, req: FastifyRequest) => import('@sinclair/typebox').Static<TSchema>} mapDoc - Add / remove fields
   */
  function addDatatypeGetter(dataType, responseSchema, mapDoc) {

Then you can call it without any generic, because it's inferred from the arguments:

addDatatypeGetter('track', Type.Boolean(), () => true) // ok
addDatatypeGetter('track', Type.Boolean(), () => 'hello') // not ok

This doesn't really help us, because we don't have our schema types defined in Typebox, so while we have the JSONSchema and Type for the original objects, we have no way of modifying both the schema and type in an identical way if we are mapping the object.

I'm not sure that a JSON Schema definition for the response is essential. The main thing that we gain with that is faster responses because Fastify uses fast-json-stringify (which uses the JSON Schema definition to create a fast stringify function), however these are not "hot" code paths. The type checking is nice, in that it allows static checking that we are returning what we say we are, but here it's only ensuring that the JSONSchema matches, which we're needing to "force" anyway, and it doesn't really gain us anything. It just means that our tests need to be a bit more thorough to check the return types of each dataType route.

TL;DR: on reflection and thinking more about types and the way Fastify uses JSON Schema on routes, I think it's ok to skip type safety on response types, and also ok to not define a JSON Schema for the response (I think schema validation and type checking are much more important on request bodies, URL params, and query strings). I think we can validate that we are returning what we say we are in tests rather than with static type checking.
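A minimal sketch of what checking the response in tests could look like, assuming each route returns a `{ data: [...] }` envelope. `findShapeProblems` and `REQUIRED_DOC_KEYS` are hypothetical names; the key list is drawn from the common doc fields visible elsewhere in this thread and is not exhaustive.

```javascript
// Fields shared by the doc types discussed in this thread (not exhaustive).
const REQUIRED_DOC_KEYS = [
  'docId',
  'versionId',
  'schemaName',
  'createdAt',
  'updatedAt',
  'deleted',
]

/**
 * Collect human-readable problems with a parsed response body.
 * An empty array means the body has the expected shape.
 * @param {any} body - Parsed JSON response body
 * @param {string} expectedSchemaName - e.g. 'track'
 * @returns {string[]}
 */
function findShapeProblems(body, expectedSchemaName) {
  if (!body || !Array.isArray(body.data)) {
    return ['response body must have a `data` array']
  }
  const problems = []
  body.data.forEach((doc, index) => {
    if (doc === null || typeof doc !== 'object') {
      problems.push(`data[${index}] is not an object`)
      return
    }
    for (const key of REQUIRED_DOC_KEYS) {
      if (!(key in doc)) problems.push(`data[${index}] is missing ${key}`)
    }
    if ('schemaName' in doc && doc.schemaName !== expectedSchemaName) {
      problems.push(`data[${index}] has schemaName ${doc.schemaName}`)
    }
  })
  return problems
}
```

A test for each dataType route could parse the response and assert that `findShapeProblems(body, dataType)` is empty, which covers the ground the response schema would otherwise have covered statically.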

Side-note: I don't think we should pluralize the data type in route. It backs us into a corner when we add a datatype like category (categorys) and it's why we don't use plurals for data types in the API.

@RangerMauve
Contributor Author

Two thoughts:

  1. It'd be easy for me to add a build step to the project to generate the typebox declarations. I'm not so sure that the serialization wouldn't be a hot path given Rudo pulls all the data pretty frequently and the count can only really go up over time.
  2. If we get rid of the response JSON schema, would setting it to type object with additionalProperties: true be enough? I think the type declaration could also skip the schema param and just be addDatatypeGetter('track', (doc) => doc). TBH I'd like to make the mapDoc function optional too. I think I could get it done with the GetMapeoDoc<T> trick?

I'm leaning more towards option 1 right now.
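For reference, a minimal sketch of the permissive response schema that option 2 describes, assuming the `{ data: [...] }` envelope used by the data type routes. JSON Schema spells this keyword `additionalProperties`. The constant name is hypothetical.

```javascript
// Hypothetical permissive response schema: it documents the envelope but
// places no constraints on the fields of each returned doc.
const permissiveResponseSchema = {
  type: 'object',
  properties: {
    data: {
      type: 'array',
      items: {
        type: 'object',
        additionalProperties: true,
      },
    },
  },
}
```

This keeps the route's `response` entry in place without requiring the schema and the doc types to stay in lockstep, at the cost of no per-field checking.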

@RangerMauve
Contributor Author

Bleh, the issue with option 1 is that it generates TS, not JS. 😅 Might need an extra step to convert it.

@socket-security

socket-security bot commented Oct 15, 2025

Review the following changes in direct dependencies.

Added: schema2typebox@1.7.8 (Supply Chain Security: 83, Vulnerability: 100, Quality: 100, Maintenance: 88, License: 100)

@RangerMauve
Contributor Author

Progress:

  • Generating typebox declarations (with customizations for observation attachments)
  • Updated types to infer

Problems:

  • schema2typebox is having trouble converting our declaration for tags

[test:typescript] src/datatypes/observation.js(77,9): error TS2322: Type '{ anyOf: ({ type: string; } | { type: string; items: { anyOf: { type: string; }[]; }; })[]; }' is not assignable to type 'TAdditionalProperties'.

I think I'll submit a PR to fix this or do some monkey patching of the schemas

  • Complaints of needing a default value for @template {"track"|"observation"|"preset"} [TDataType]

[test:typescript] src/routes.js(307,59): error TS1005: '=' expected.

Not sure why this is happening, would like help @gmaclennan

  • Getting issues with the MapeoDoc declaration not being supported

Is it due to field mismatches coming from datatype? Might need to try the deref'd schema instead

[test:typescript] Type '{ schemaName: "observation"; lat?: number | undefined; lon?: number | undefined; attachments: Attachment[]; tags: { [k: string]: string | number | boolean | (string | number | boolean | null)[] | null; }; metadata?: { manualLocation?: boolean | undefined; position?: Position | undefined; lastSavedPosition?: Position | undefined; positionProvider?: { gpsAvailable?: boolean | undefined; passiveAvailable?: boolean | undefined; locationServicesEnabled: boolean; networkAvailable?: boolean | undefined; } | undefined; } | undefined; presetRef?: { docId: string; versionId: string; } | undefined; docId: string; versionId: string; originalVersionId: string; createdAt: string; updatedAt: string; links: string[]; deleted: boolean; } & { forks: string[]; }' is not assignable to type 'Extract<{ schemaName: "coreOwnership"; authCoreId: string; configCoreId: string; dataCoreId: string; blobCoreId: string; blobIndexCoreId: string; docId: string; versionId: string; originalVersionId: string; createdAt: string; updatedAt: string; links: string[]; deleted: boolean; }, { schemaName: TDataType; }>'.

@RangerMauve RangerMauve marked this pull request as ready for review October 20, 2025 16:06
@RangerMauve
Contributor Author

@gmaclennan Regarding route pluralization, the existing routes that Rudo was using have an s at the end. Should we remove the s from the new routes, keep it on the existing ones for backwards compat (removing it at the next major release), and communicate to Rudo that it needs to be changed?

@gmaclennan
Member

Yeah, I think that's the best way forward, now that we're making this a more generalized mapping to CoMapeo docs.
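A minimal sketch of that plan, assuming each data type gets a singular route and only the types that already shipped with a plural route keep it as a legacy alias. `routePathsFor` and `LEGACY_PLURAL_TYPES` are hypothetical names, and listing only `observation` is an assumption for the example.

```javascript
// Data types that already shipped with a pluralized route (assumed list).
const LEGACY_PLURAL_TYPES = new Set(['observation'])

/**
 * Return every path a data type should be served on: the singular path
 * always, plus the old plural path where one already exists.
 * @param {string} dataType
 * @returns {string[]}
 */
function routePathsFor(dataType) {
  const base = '/projects/:projectPublicId'
  const paths = [`${base}/${dataType}`]
  if (LEGACY_PLURAL_TYPES.has(dataType)) {
    // Kept for backwards compatibility; a candidate for removal at the next major release.
    paths.push(`${base}/${dataType}s`)
  }
  return paths
}
```

New data types never gain a plural form this way, so a later `category` type would not produce a `categorys` route.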

@RangerMauve
Contributor Author

Issue: URLs with a DocID are too long. I tried switching to z32, but they're still too long. Fastify v4 seems to lack the routerOptions param for setting maxParamLength. I tried upgrading to v5 but saw lots of breaking changes. Going to scour the v4 docs in case the param exists there somewhere.

@RangerMauve
Contributor Author

Got it, maxParamLength is a top-level option in v4 👍
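A minimal sketch of the fix, assuming Fastify v4, where `maxParamLength` sits directly in the options object passed to the `fastify()` factory. The value 256 is an arbitrary placeholder, not the one used in this PR.

```javascript
// Options intended for the Fastify factory, e.g. `fastify(serverOptions)`.
const serverOptions = {
  // Fastify's documented default is 100 characters; URL params longer than
  // the limit do not match the route. 256 is a placeholder for this sketch.
  maxParamLength: 256,
}
```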

@RangerMauve
Contributor Author

Review request:

  • How are the tests looking? I feel they could be more DRY, but I'm not sure if it's worth adding abstractions.
  • The observation tests check for stuff like deleted so it feels redundant to add to other datatypes. Should I do it anyways?
  • What do you think of the icon API and fallback logic? I am anticipating this for the new categories code, would it be better to expose the raw data for Rudo?

Contributor

@rudokemper rudokemper left a comment

Per @RangerMauve's request on Slack, I took a look and left a few comments here. Generally this looks great for our needs.

Seems like this closes #39 in addition to #51?

addDatatypeGetter('field', fieldSchema, (field) => field)

fastify.get(
'/projects/:projectPublicId/remoteDetectionAlerts',
Contributor

Suggested change
- '/projects/:projectPublicId/remoteDetectionAlerts',
+ '/projects/:projectPublicId/remoteDetectionAlert',

Given the other changes in routes and this comment:

> Side-note: I don't think we should pluralize the data type in route. It backs us into a corner when we add a datatype like category (categorys) and it's why we don't use plurals for data types in the API.

Member

Let's do this in a follow-up, and get this merged for now.

Contributor Author

Are folks using this endpoint? It was there already kind of like the /observations endpoint.

Member

It's used by @rudokemper and team mainly for writing alerts from analysis of remote sensing data to the database, so that users can view them in the app.

Member

@gmaclennan gmaclennan left a comment

lgtm!

@RangerMauve RangerMauve merged commit a586e22 into main Nov 5, 2025
4 checks passed
@RangerMauve RangerMauve deleted the feat/more-datatypes branch November 5, 2025 15:42
