[ENH] Return expanded metadata per matching dataset for POST /datasets endpoint only
#519
Reviewer's Guide
This PR enhances the POST /datasets endpoint to return enriched per-dataset metadata by merging dynamic SPARQL query results with static metadata loaded from a JSON file at startup. It also introduces a dedicated /subjects handler and models to keep cohort (/query) behavior backward compatible, and adds configuration and validation for the datasets metadata file, along with corresponding SPARQL template and test updates.
Sequence diagram for enhanced POST /datasets response building
sequenceDiagram
actor Client
participant Router_datasets as Router_datasets
participant CRUD as crud
participant Util as util
participant GraphAPI as Graph_API
participant Env as env_settings
Client->>Router_datasets: POST /datasets (QueryModel)
Router_datasets->>CRUD: post_datasets(query)
CRUD->>Util: create_sparql_queries_for_datasets(query)
Util-->>CRUD: phenotypic_query, imaging_query
CRUD->>GraphAPI: POST phenotypic_query
GraphAPI-->>CRUD: phenotypic_results
CRUD->>GraphAPI: POST imaging_query
GraphAPI-->>CRUD: imaging_results
CRUD->>CRUD: combine results
CRUD->>CRUD: compute matching_dataset_sizes
CRUD->>CRUD: compute matching_dataset_imaging_modals_and_pipelines
loop per dataset_uuid in combined_query_results.groupby(dataset)
CRUD->>Env: DATASETS_METADATA.get(prefixed_dataset_uuid, {})
Env-->>CRUD: dataset_static_metadata
CRUD->>CRUD: dataset_dynamic_metadata dict
CRUD->>CRUD: DatasetQueryResponse(**dataset_static_metadata, **dataset_dynamic_metadata)
CRUD->>CRUD: append to response list
end
CRUD-->>Router_datasets: list[DatasetQueryResponse]
Router_datasets-->>Client: 200 OK JSON
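The per-dataset loop in the diagram above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the dataclass stands in for the real response model and shows only a subset of its fields, and the sample UUIDs, metadata values, field defaults, and the `build_responses` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical static metadata, keyed by dataset UUID, as it might look
# after being loaded from the datasets metadata JSON file at startup.
DATASETS_METADATA = {
    "nb:ds-001": {
        "dataset_name": "Example dataset",
        "authors": ["A. Author"],
        "homepage": "https://example.org/ds-001",
    },
}


@dataclass
class DatasetQueryResponse:
    """Simplified stand-in for the real response model (subset of fields)."""

    # Dynamic fields, computed from the query results
    dataset_uuid: str
    num_matching_subjects: int
    image_modals: list
    # Static fields; the defaults are an assumption of this sketch
    dataset_name: Optional[str] = None
    authors: list = field(default_factory=list)
    homepage: Optional[str] = None


def build_responses(dynamic_by_dataset: dict) -> list:
    """Merge static and dynamic metadata for each matching dataset."""
    responses = []
    for dataset_uuid, dynamic_metadata in dynamic_by_dataset.items():
        # Fall back to an empty dict when the dataset has no static entry
        static_metadata = DATASETS_METADATA.get(dataset_uuid, {})
        responses.append(
            DatasetQueryResponse(
                dataset_uuid=dataset_uuid, **static_metadata, **dynamic_metadata
            )
        )
    return responses


results = build_responses(
    {
        "nb:ds-001": {"num_matching_subjects": 12, "image_modals": ["T1w"]},
        "nb:ds-002": {"num_matching_subjects": 3, "image_modals": []},
    }
)
```

Because both dictionaries are unpacked as keyword arguments, a key present in both would raise a `TypeError`, so the static and dynamic key sets need to stay disjoint.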
Class diagram for updated query and dataset response models
classDiagram
class SessionResponse {
+str session
+str session_file_path
+str dataset_uuid
+str dataset_name
+str sub_id
+int age
+str sex
+str diagnosis
+str image_modal
+dict completed_pipelines
}
class CohortQueryResponse {
+str dataset_uuid
+str dataset_name
+str dataset_portal_uri
+int dataset_total_subjects
+bool records_protected
+int num_matching_subjects
+list image_modals
+dict available_pipelines
+list~SessionResponse~ subject_data
+str subject_data
}
class DatasetQueryResponse {
+str dataset_uuid
+str dataset_name
+list~str~ authors
+str homepage
+list~str~ references_and_links
+list~str~ keywords
+str repository_url
+str access_instructions
+str access_type
+str access_email
+str access_link
+int dataset_total_subjects
+bool records_protected
+int num_matching_subjects
+list~str~ image_modals
+dict available_pipelines
}
class SubjectsQueryResponse {
+str dataset_uuid
+list~SessionResponse~ subject_data
+str subject_data
}
class Settings {
+str graph_address
+str graph_db
+int graph_port
+Path datasets_metadata_path
+bool return_agg
+int min_cell_size
+bool auth_enabled
}
class EnvGlobals {
+dict CONTEXT
+dict ALL_VOCABS
+dict DATASETS_METADATA
}
CohortQueryResponse --> SessionResponse : uses
SubjectsQueryResponse --> SessionResponse : uses
DatasetQueryResponse ..> EnvGlobals : metadata_source
CohortQueryResponse ..> Settings : guarded_by_min_cell_size
SubjectsQueryResponse ..> Settings : guarded_by_min_cell_size
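For readers who prefer code to diagrams, the subject-level models might look roughly like this in Python. The sketch uses standard-library dataclasses rather than the project's own model classes, copies only a few field names from the diagram, and reads the two `subject_data` entries as one field that holds either a list of sessions or a string. That reading, and the sample values, are assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class SessionResponse:
    """Subset of the session-level fields shown in the diagram."""

    sub_id: str
    session: str
    age: int
    sex: str


@dataclass
class SubjectsQueryResponse:
    """Subject-level response: no dataset metadata beyond the UUID."""

    dataset_uuid: str
    # Assumed reading of the diagram: per-session records or a plain string
    subject_data: Union[list, str]


# Hypothetical example values, for illustration only
detailed = SubjectsQueryResponse(
    dataset_uuid="nb:ds-001",
    subject_data=[
        SessionResponse(sub_id="sub-01", session="ses-01", age=30, sex="F")
    ],
)
summarised = SubjectsQueryResponse(dataset_uuid="nb:ds-001", subject_data="protected")
```

Keeping `dataset_uuid` as the only dataset-level field matches the stated goal of excluding dataset metadata from the /subjects response.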
…OST /subjects and GET /query
@sourcery-ai review
We've reviewed this pull request using the Sourcery rules engine
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #519      +/-   ##
==========================================
+ Coverage   96.94%   97.23%   +0.29%
==========================================
  Files          33       33
  Lines        1177     1267      +90
==========================================
+ Hits         1141     1232      +91
+ Misses         36       35       -1
rmanaem left a comment
Thanks for the PR @alyssadai! I've left a couple of comments regarding the TODOs in the code. I'd suggest moving them from the code to issues, as we tend to track them more effectively there. Other than that, this is good to go 🧑‍🍳
Changes proposed in this pull request:
- Added NB_DATASETS_METADATA_PATH (to be hardcoded in the deployment docker-compose.yml) to specify the datasets metadata JSON file, and load the file on app startup
- For the /datasets response, retrieve static dataset metadata from the datasets metadata JSON file and append it to the computed dataset attributes
- Separated handling of the /query and /subjects endpoints to preserve the existing legacy behaviour of /query (see also "Officially deprecate legacy GET /query endpoint for POST /subjects", #520)
- Updated the /subjects response to exclude dataset metadata
Checklist
This section is for the PR reviewer
- PR title prefix ([ENH], [FIX], [REF], [TST], [CI], [MNT], [INF], [MODEL], [DOC]) (see our Contributing Guidelines for more info)
- skip-release (to be applied by maintainers only)
- Closes #XXXX
For new features:
For bug fixes:
Summary by Sourcery
Expand dataset metadata returned by the POST /datasets endpoint by combining per-dataset static metadata from a JSON file with dynamic query results, and simplify SPARQL dataset selections to only include dataset and subject identifiers.
Summary by Sourcery
Expand dataset metadata returned by the POST /datasets endpoint using a node-level JSON metadata file while keeping subject-level and legacy /query responses backward compatible.
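Loading the metadata file at startup could look something like the sketch below. It assumes the file is a JSON object keyed by dataset identifier and that the path comes from the NB_DATASETS_METADATA_PATH environment variable. The `load_datasets_metadata` helper, the validation rules, and the sample file contents are hypothetical and are not taken from the PR.

```python
import json
import os
import tempfile
from pathlib import Path


def load_datasets_metadata(path: Path) -> dict:
    """Read the datasets metadata JSON file and check its top-level shape."""
    if not path.is_file():
        raise FileNotFoundError(f"Datasets metadata file not found: {path}")
    with path.open(encoding="utf-8") as f:
        metadata = json.load(f)
    if not isinstance(metadata, dict):
        raise ValueError("Datasets metadata must be a JSON object keyed by dataset")
    return metadata


# Demo: write a small hypothetical file and point the variable at it
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "datasets_metadata.json"
    demo_path.write_text(
        json.dumps({"nb:ds-001": {"dataset_name": "Example"}}), encoding="utf-8"
    )
    os.environ["NB_DATASETS_METADATA_PATH"] = str(demo_path)

    DATASETS_METADATA = load_datasets_metadata(
        Path(os.environ["NB_DATASETS_METADATA_PATH"])
    )
```

Failing fast on a missing or malformed file at startup means a misconfigured deployment is caught before any /datasets request is served.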