Migrating to a MongoDB Backend #50
brycefrank
started this conversation in
Ideas
Replies: 1 comment
My first thought is that using Git as a client to update the database when new publications/models are added may be a bad idea; if something changes in Git, it could break the entire process. I would suggest using a MongoDB client that provides CRUD functionality, like Atlas. I'm also still unclear whether updates would be done by a single person (an admin), or whether every user would have access to update.
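For concreteness, CRUD access from R could go through the `mongolite` package rather than Git pushes. A minimal sketch; the database name, collection name, field names, ids, and Atlas-style connection string are all hypothetical placeholders, not an existing setup:

```r
# Sketch only: assumes a reachable MongoDB deployment (e.g. on Atlas) and a
# hypothetical "models" collection; none of these names are settled.
library(mongolite)

con <- mongolite::mongo(
  collection = "models",
  db         = "allometric",
  url        = "mongodb+srv://user:password@cluster.example.mongodb.net/"
)

# Create or update: upsert one model document, keyed by a hypothetical model id
con$update(
  query  = '{"model_id": "some_model_id"}',
  update = '{"$set": {"response": "height", "country": "US"}}',
  upsert = TRUE
)

# Read: fetch every model attached to a given publication id
models <- con$find('{"pub_id": "some_pub_id"}')

# Delete: remove all models for a retracted publication
con$remove('{"pub_id": "some_pub_id"}')
```

Whether this runs against Atlas directly or behind some thin API is exactly the open access-control question.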
In discussions with @jwlunsford, we have posited the idea of moving the models backend to MongoDB. Currently the "models backend" is the `models.RDS` file, stored in the root directory of this repository. I begin by briefly summarizing how this backend is maintained and created, then I lay out a possible framework for a MongoDB implementation.

Current Process
How the Backend Models are Updated
In short, the current process involves the `model-install` GitHub action. Every time someone pushes to allometric/models, the model publication files (in `./publications`) are ingested using `allometric::ingest_models()`. This creates a tibble of models that is ultimately saved as `models.RDS`.

How the Backend Models are Retrieved by the End User (in R)
When a user wants to retrieve models for their local system, they run `allometric::install_models`, which, more or less, downloads the `models.RDS` file. The file is then loaded in its entirety by `allometric::load_models`.

MongoDB Process
How the Backend Models are Updated

Instead of using `allometric::ingest_models`, we would run some other function, let's call it `ingest_models_json` or something similar. As a rough idea, the function would load each publication file in `./publications` into a `Publication` object, then parse the models inside into their JSON representation for insertion into the database.

This is still inefficient, but we could at least chase more efficiencies later. For example, if we can detect which publication files were affected by a commit, then we simply load only those files and update or upsert the models matching the publication id. Since nearly all commits affect just one publication, this would vastly reduce the work done by the GitHub action.
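A hedged sketch of what `ingest_models_json` might look like; the `load_publication` helper, the `pub$id`/`model$id` fields, and the collection schema are all assumptions for illustration, not existing code:

```r
# Rough sketch only: load_publication() and the pub/model fields below are
# hypothetical; the upsert key and collection schema are not yet decided.
library(mongolite)
library(jsonlite)

ingest_models_json <- function(pub_files, con) {
  for (pub_file in pub_files) {
    pub <- load_publication(pub_file)  # hypothetical: build a Publication object

    for (model in pub$models) {
      doc <- jsonlite::toJSON(model, auto_unbox = TRUE)

      # Upserting means re-running an edited publication updates its models
      # in place instead of duplicating them.
      con$update(
        query  = sprintf('{"pub_id": "%s", "model_id": "%s"}', pub$id, model$id),
        update = sprintf('{"$set": %s}', doc),
        upsert = TRUE
      )
    }
  }
}
```

The commit-scoped variant could then feed this function only the changed files, e.g. the output of `git diff --name-only HEAD~1 -- publications/` inside the action.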
How the Backend Models are Retrieved by the End User (in R)
With the backend moved to MongoDB, retrieving models requires API calls to the database. We need to figure out whether this would require a REST API, or whether Atlas can handle it on its own. Do users need to authenticate with the API somehow? Is it publicly accessible but with some rate limits implemented? etc.
How do users load models that they want? Do they need to write a query? Are these queries easy to write? etc.
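One option for the query question is to hide the MongoDB query syntax behind a small R wrapper, so users filter on familiar arguments instead of writing JSON by hand. A sketch, assuming a read-only connection and hypothetical document fields:

```r
# Sketch of a query-building load_models() replacement; the connection URL,
# read-only credentials, and field names are all assumptions.
library(mongolite)

load_models <- function(country = NULL, response = NULL) {
  con <- mongolite::mongo(
    collection = "models",
    db         = "allometric",
    url        = "mongodb+srv://readonly:password@cluster.example.mongodb.net/"
  )

  # Translate the R arguments into a MongoDB query document.
  parts <- character(0)
  if (!is.null(country))  parts <- c(parts, sprintf('"country": "%s"', country))
  if (!is.null(response)) parts <- c(parts, sprintf('"response": "%s"', response))
  query <- paste0("{", paste(parts, collapse = ", "), "}")

  con$find(query)  # an empty "{}" query returns every model
}

# e.g. load_models(country = "US", response = "height")
```

This keeps the user-facing interface close to the current `load_models` experience while letting the heavy filtering happen server-side.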