Adding the GPCRmd Scrapper #34

Merged: pierrepo merged 11 commits into main from feature/add-gpcrmd-data on Jan 9, 2026

@Essmaw (Collaborator) commented Dec 24, 2025

Introduce a new script for scraping GPCRmd datasets and files. Change the dataset model to use float for timestep and delta, and add a simulation_time field.

…dataset model to use float for timestep and delta; add simulation_time field.
…alidation logging, and update function signatures for clarity.

@pierrepo (Member) commented Jan 2, 2026

Can you please rename BaseDataset to Dataset?

@pierrepo (Member) commented Jan 2, 2026

Can you please also make author_names optional, with a default of None?
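For illustration, the two requests above (the rename and the optional author_names), together with the float fields from the PR description, could look roughly like the sketch below. This is not the project's actual model: it uses a standard-library dataclass for self-containment, the dataset_id field is a hypothetical placeholder, and the other fields being optional is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:  # renamed from BaseDataset, as requested
    """Hypothetical, trimmed-down dataset model (field set is illustrative)."""

    dataset_id: str  # assumed identifier field, not taken from the PR
    author_names: Optional[list[str]] = None  # optional, defaults to None
    timestep: Optional[float] = None  # float, per the PR description
    delta: Optional[float] = None  # float, per the PR description
    simulation_time: Optional[float] = None  # field added in this PR


# A dataset can now be created without author names.
ds = Dataset(dataset_id="example-1", timestep=0.002)
print(ds.author_names)
```

With this shape, records that lack author information still construct cleanly, and callers can test `ds.author_names is None` instead of handling a missing attribute.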

@Essmaw added the enhancement (New feature or request) label Jan 5, 2026

@Essmaw (Collaborator, Author) commented Jan 5, 2026

Hi @pierrepo, could you please review? Thanks!

@Essmaw (Collaborator, Author) commented Jan 8, 2026

Hi @pierrepo, I’ve finished. If it looks good to you, we can merge. Thanks!

Comment on lines 38 to 42
3. Save both the validated and unvalidated dataset datasets to
"data/gpcrmd/gpcrmd_datasets.parquet" and
"data/gpcrmd/not_validated_gpcrmd_datasets.parquet"
4. Save file metadata similarly for validated and unvalidated files.
"""
Member

Please don't store non-validated files or datasets in separate Parquet files. Instead, log what was wrong with each of them.
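A minimal sketch of what this request could mean in practice: rejected records are dropped and the reasons are written to the log, rather than being saved to a second file. The logger name, the `validate_record` and `keep_valid` helpers, and the two checks are all hypothetical and only stand in for whatever validation the scraper actually performs.

```python
import logging

logger = logging.getLogger("gpcrmd_scraper")  # hypothetical logger name


def validate_record(record: dict) -> list[str]:
    """Return the problems found in a record (empty list if it is valid)."""
    problems = []
    if not record.get("dataset_id"):  # assumed required field
        problems.append("missing dataset_id")
    timestep = record.get("timestep")
    if timestep is not None and not isinstance(timestep, (int, float)):
        problems.append(f"timestep is not a number: {timestep!r}")
    return problems


def keep_valid(records: list[dict]) -> list[dict]:
    """Keep valid records and log why each rejected record failed."""
    valid = []
    for record in records:
        problems = validate_record(record)
        if problems:
            # Log the reasons instead of writing the record to a separate file.
            logger.warning(
                "Rejected record %r: %s",
                record.get("dataset_id"),
                "; ".join(problems),
            )
        else:
            valid.append(record)
    return valid
```

Only the list returned by `keep_valid` would then be written to the single validated Parquet file; the log carries the diagnostics for everything else.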

Comment on lines 182 to 183
timeout : int, optional
Timeout in seconds for the HTTP request (default is 10).
Member

We don't need a timeout here.

@pierrepo self-requested a review January 9, 2026 20:38

@Essmaw (Collaborator, Author) commented Jan 9, 2026

Wait!

@pierrepo (Member) left a comment

Looks good. Thanks @Essmaw

@pierrepo merged commit bb03bd2 into main Jan 9, 2026
@pierrepo deleted the feature/add-gpcrmd-data branch January 9, 2026 20:39