Migrating to a MongoDB Backend #50
brycefrank
started this conversation in
Ideas
Replies: 1 comment
My first thought is that using Git as a client to update the database when new publications/models are added may be a bad idea; if something changes in Git, it could break the entire process. I would suggest using a MongoDB client that provides CRUD functionality, like Atlas. I'm also still unclear whether updates would be done by a single person (an admin), or whether every user would have access to update.
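For concreteness, CRUD access from R could go through the `mongolite` package rather than Git pushes. A minimal sketch; the database name, collection name, field names, ids, and Atlas-style connection string are all hypothetical placeholders, not an existing setup:

```r
# Sketch only: assumes a reachable MongoDB deployment (e.g. on Atlas) and a
# hypothetical "models" collection; none of these names are settled.
library(mongolite)

con <- mongolite::mongo(
  collection = "models",
  db         = "allometric",
  url        = "mongodb+srv://user:password@cluster.example.mongodb.net/"
)

# Create or update: upsert one model document, keyed by a hypothetical model id
con$update(
  query  = '{"model_id": "some_model_id"}',
  update = '{"$set": {"response": "height", "country": "US"}}',
  upsert = TRUE
)

# Read: fetch every model attached to a given publication id
models <- con$find('{"pub_id": "some_pub_id"}')

# Delete: remove all models for a retracted publication
con$remove('{"pub_id": "some_pub_id"}')
```

Whether this runs against Atlas directly or behind some thin API is exactly the open access-control question.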
In discussions with @jwlunsford, we have posited the idea of moving the models backend to MongoDB. Currently the "models backend" is the `models.RDS` file, stored in the root directory of this repository. I begin by briefly summarizing how this backend is maintained and created, then I lay out a possible framework for a MongoDB implementation.

Current Process
How the Backend Models are Updated
In short, the current process involves the `model-install` GitHub action. Every time someone pushes to allometric/models, the model publication files (in `./publications`) are ingested using `allometric::ingest_models()`. This creates a tibble of models that is ultimately saved as `models.RDS`.

How the Backend Models are Retrieved by the End User (in R)
When a user wants to retrieve models for their local system, they run `allometric::install_models`, which, more or less, downloads the `models.RDS` file. The file is then loaded in its entirety by `allometric::load_models`.

MongoDB Process
How the Backend Models are Updated

Instead of using `allometric::ingest_models`, we would run some other function, let's call it `ingest_models_json` or something similar. As a rough idea, the function would load each publication file in `./publications` into a `Publication` object, then parse the models inside into their JSON representation for insertion into the database.

This is still inefficient, but we could at least chase more efficiencies later. For example, if we can detect which publication files were affected by a commit, then we simply load only those files and update or upsert the models matching the publication id. Since nearly all commits affect just one publication, this would vastly reduce the work done by the GitHub action.
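A hedged sketch of what `ingest_models_json` might look like; the `load_publication` helper, the `pub$id`/`model$id` fields, and the collection schema are all assumptions for illustration, not existing code:

```r
# Rough sketch only: load_publication() and the pub/model fields below are
# hypothetical; the upsert key and collection schema are not yet decided.
library(mongolite)
library(jsonlite)

ingest_models_json <- function(pub_files, con) {
  for (pub_file in pub_files) {
    pub <- load_publication(pub_file)  # hypothetical: build a Publication object

    for (model in pub$models) {
      doc <- jsonlite::toJSON(model, auto_unbox = TRUE)

      # Upserting means re-running an edited publication updates its models
      # in place instead of duplicating them.
      con$update(
        query  = sprintf('{"pub_id": "%s", "model_id": "%s"}', pub$id, model$id),
        update = sprintf('{"$set": %s}', doc),
        upsert = TRUE
      )
    }
  }
}
```

The commit-scoped variant could then feed this function only the changed files, e.g. the output of `git diff --name-only HEAD~1 -- publications/` inside the action.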
How the Backend Models are Retrieved by the End User (in R)
With the backend moved to MongoDB, retrieving models requires API calls to the database. We need to figure out whether this would require a REST API, or whether Atlas can handle it on its own. Do users need to authenticate with the API somehow? Is it publicly accessible but with some rate limits implemented? etc.
How do users load models that they want? Do they need to write a query? Are these queries easy to write? etc.
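One option for the query question is to hide the MongoDB query syntax behind a small R wrapper, so users filter on familiar arguments instead of writing JSON by hand. A sketch, assuming a read-only connection and hypothetical document fields:

```r
# Sketch of a query-building load_models() replacement; the connection URL,
# read-only credentials, and field names are all assumptions.
library(mongolite)

load_models <- function(country = NULL, response = NULL) {
  con <- mongolite::mongo(
    collection = "models",
    db         = "allometric",
    url        = "mongodb+srv://readonly:password@cluster.example.mongodb.net/"
  )

  # Translate the R arguments into a MongoDB query document.
  parts <- character(0)
  if (!is.null(country))  parts <- c(parts, sprintf('"country": "%s"', country))
  if (!is.null(response)) parts <- c(parts, sprintf('"response": "%s"', response))
  query <- paste0("{", paste(parts, collapse = ", "), "}")

  con$find(query)  # an empty "{}" query returns every model
}

# e.g. load_models(country = "US", response = "height")
```

This keeps the user-facing interface close to the current `load_models` experience while letting the heavy filtering happen server-side.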