Add more retrieval routes for project data#61
Conversation
gmaclennan
left a comment
There was a problem hiding this comment.
Is there a way we could map our existing schema directly to what is returned via the API? Currently we are allow-listing props it seems, could we block-list instead? What props do we not want to return? Could we use our existing JSON schema definitions for the validation? The current implementation would add some maintenance overhead, because any comapeo-schema change would need a change here to the schemas.js and also the datatype routes. I'd be happy to brainstorm some ideas about this. There are a few tools we can use to hook up JSON schemas + validation and mapping to data returned from routes.
|
Fastify supports ajv, so maybe we could ditch typebox for it and somehow import the schemas in from there? https://fastify.dev/docs/latest/Reference/Validation-and-Serialization/ We could try to Translating the existing typebox schemas to JSON schema should be trivial with my local LLM setup. I agree a blocklist of fields would scale better. I'm honestly not sure what should go there. My gut feeling is to just include the whole doc which is what I did for the new datatypes. |
gmaclennan
left a comment
There was a problem hiding this comment.
By changing all the input schemas to plain JSONSchema (rather than typebox) we loose all the type safety for the inputs. This doesn't seem to bring any particular benefits, but loosing type safety introduces maintenance issues because it's easy to mis-type params or query properties.
For the datatype getters, it seems like you are taking an allow-list approach for each doc type. What properties are you excluding from each data type and what is the reasoning for excluding them?
Should I instead keep the typebox types and automatically convert the json schema to typebox? Is there a trick I can do to make the TS types align with the types defined in
I just exclude props based on the existing remoteDetectionAlert tests, else it sends the full doc back right now. What props should we be redacting? |
|
Another thought I had was to run a command to generate the typebox types for the schemas: https://www.npmjs.com/package/schema2typebox?activeTab=readme |
This reverts commit b36fc02.
|
I think it's fine for the response types to be just JSONSchema - we don't gain a lot from type-safety on the response types. It's the inputs coming from params, querystring, and any request body that are important to typecheck, e.g. as we had before. for the data type routes, why not use the schema direct from @comapeo/schema as the return type? e.g. import { dereferencedDocSchemas as docSchemas } from '@comapeo/schema'
//...
const SUPPORTED_DATA_TYPES =
/** @satisfies {Array<keyof typeof docSchemas>} */ ([
'observation',
'track',
'preset',
'field',
])
for (const dataType of SUPPORTED_DATA_TYPES) {
fastify.get(
`/projects/:projectPublicId/${dataType}`,
{
schema: {
params: Type.Object({
projectPublicId: BASE32_STRING_32_BYTES,
}),
response: {
200: {
type: 'object',
properties: {
data: {
type: 'array',
items: docSchemas[dataType],
}
}
},
'4xx': schemas.errorResponse,
},
},
async preHandler(req) {
verifyBearerAuth(req)
await ensureProjectExists(this, req)
},
},
/**
* @this {FastifyInstance}
*/
async function (req) {
const { projectPublicId } = req.params
const project = await this.comapeo.getProject(projectPublicId)
return {
data: (
await project[dataType].getMany({ includeDeleted: true })
)
}
},
)
}The reason I suggest this is that one of the concerns with adding additional datatypes is the maintenance overhead, so we're wanting to reduce the maintenance needed if we change the schema type for some reason. Also: we should probably group routes that require auth under a single route plugin with the Going forwards we could look at versioning this by a request header (or route prefix) so that we can introduce changes without breaking existing, but I think for now ok because if we're using @comapeo/schema directly we should only have backwards-compatible changes. |
|
One adjustment, Instead of mapping over doctypes I'm going to keep the `addDatatypeGetter function so we can have an optional mapDoc for the use case where observations get their attachments mapped to have URLs (per existing functionality) |
|
Initial API jsdoc: /**
* @template InputType
* @template OutputType=InputType
* @param {"track"|"observation"|"preset"} dataType - DataType to pull from
* @param {import('ajv').JSONSchemaType<OutputType>} responseSchema - Schema for the response data
* @param {function(InputType, FastifyRequest): OutputType} mapDoc - Add / remove fields
*/
function addDatatypeGetter(dataType, responseSchema, mapDoc) { Trying to call it with the InputType parameter per the example Gregor provided: /** @type {typeof addDatatypeGetter<Track>} */
addDatatypeGetter(
'track',
docSchemas.track,
(doc /** @type {Track} */) => doc,
)Desired behavior:
Issues:
|
|
This is a tricky thing to type with this API, but this is close maybe? This gives an expected type error for const addTrackRoute = /** @type {typeof addDatatypeGetter<"foo">} */ (
addDatatypeGetter
)
addTrackRoute('track', { type: 'string', const: 'foo' }, () => 'literal')
/**
* @template {MapeoDoc['schemaName']} TSchemaName
* @typedef {Extract<MapeoDoc, { schemaName: TSchemaName }>} GetMapeoDoc
*/
/**
* @template TOutputType
* @template {"track"|"observation"|"preset"} [TDataType]
* @param {TDataType} dataType - DataType to pull from
* @param {import('ajv').JSONSchemaType<TOutputType>} responseSchema - Schema for the response data
* @param {(doc: GetMapeoDoc<TDataType>, req: FastifyRequest) => TOutputType} mapDoc - Add / remove fields
*/
function addDatatypeGetter(dataType, responseSchema, mapDoc) {However, this doesn't work with, for example, passing the Track MapeoDoc schema and the Track type because the inference of AJV's JSONSchemaType is less than perfect, and does not match our Track type - there are subtle differences. We use json-schema-to-typescript for static generation of types from our json schema. This is why tools like Typebox exist. In this case you would define the JSON Schema with typebox, and then infer the output type, with no need to pass a generic, for example if the function is defined like this: /**
* @template {import('@sinclair/typebox').TSchema} TSchema
* @template {"track"|"observation"|"preset"} [TDataType]
* @param {TDataType} dataType - DataType to pull from
* @param {TSchema} responseSchema - Schema for the response data
* @param {(doc: GetMapeoDoc<TDataType>, req: FastifyRequest) => import('@sinclair/typebox').Static<TSchema>} mapDoc - Add / remove fields
*/
function addDatatypeGetter(dataType, responseSchema, mapDoc) {Then you can call it without any generic, because it's inferred from the arguments: addDatatypeGetter('track', Type.Boolean(), () => true) // ok
addDatatypeGetter('track', Type.Boolean(), () => 'hello') // not okThis doesn't really help us, because we don't have our schema types defined in Typebox, so while we have the JSONSchema and Type for the original objects, we have no way of modifying both the schema and type in an identical way if we are mapping the object. I'm not sure that a JSON Schema definition for the response is essential. The main thing that we gain with that is faster responses because Fastify uses fast-json-stringify (which uses the JSON Schema definition to create a fast stringify function), however these are not "hot" code paths. The type checking is nice, in that it allows static checking that we are returning what we say we are, but here it's only ensuring that the JSONSchema matches, which we're needing to "force" anyway, and it doesn't really gain us anything. It just means that our tests need to be a bit more thorough to check the return types of each dataType route. TL;DR on reflection and thinking more about types and the way Fastify uses JSON Schema on routes, I think it's ok to skip type safety on response types, and also ok to not define a JSON Schema for the response (I think schema validation and type checking are much more important on request bodies, URL params, and query strings). I think we can can validate we are returning what we say we are in tests rather than static type checking. Side-note: I don't think we should pluralize the data type in route. It backs us into a corner when we add a datatype like |
|
Two thoughts:
I'm leaning more towards option 1 right now. |
|
Bleh issue with option1 is it generates TS, not JS. 😅 Might need an extra step to convert it. |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
Progress:
Problems:
I think I'll submit a PR to fix this or do some monkey patching of the schemas
Not sure why this is happening, would like help @gmaclennan
Is it due to field mismatches coming from datatype? Might need to try the deref'd schema instead
|
|
@gmaclennan Regarding route pluralization, the existing routes that rudo was using have an |
|
Yeah I think that's the best way forward, now we're making this a more generalized mapping to CoMapeo docs |
|
Issue: urls with DocID are too long, tried to switch to z32, but still too long. fastify v4 seems to lack routerOptions param for setting |
|
Got it, it was top level 👍 |
|
Review request:
|
rudokemper
left a comment
There was a problem hiding this comment.
Per @RangerMauve request on Slack, I took a look and left a few comments here. Generally this looks great for our needs.
| addDatatypeGetter('field', fieldSchema, (field) => field) | ||
|
|
||
| fastify.get( | ||
| '/projects/:projectPublicId/remoteDetectionAlerts', |
There was a problem hiding this comment.
| '/projects/:projectPublicId/remoteDetectionAlerts', | |
| '/projects/:projectPublicId/remoteDetectionAlert', |
Given the other changes in routes and this comment:
Side-note: I don't think we should pluralize the data type in route. It backs us into a corner when we add a datatype like
category(categorys) and it's why we don't use plurals for data types in the API.
There was a problem hiding this comment.
let's do this in a follow-up, and get this merged for now.
There was a problem hiding this comment.
Are folks using this endpoint? It was there already kind of like the /observations endpoint.
There was a problem hiding this comment.
It's used by @rudokemper and team mainly for writing alerts from analysis of remote sensing data to the database, so that users can view them in the app.
There was a problem hiding this comment.

closes #51