add patient level features modelling support#89
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR adds full support for patient-level feature modeling by automatically detecting feature types, providing new patient-level data loaders and dataset classes, and introducing an MLP-based Lightning module for training and deployment.
- Detect feature type (
tileorpatient) and branch training, deployment, and cross-validation pipelines accordingly - Introduce
PatientFeatureDataset,patient_feature_dataloader, andload_patient_level_datain the data module - Add
LitMLPClassifier, updatetrain_categorical_model_,deploy_categorical_model_, and cross-validation to use patient-level logic and tests
Reviewed Changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/stamp/modeling/data.py | Add detect_feature_type, PatientFeatureDataset, patient_feature_dataloader, and patient-level data loading logic |
| src/stamp/modeling/train.py | Branch training on feature type, wire up patient-level loader and MLP classifier |
| src/stamp/modeling/mlp_classifier.py | New MLPClassifier and LitMLPClassifier for patient-level features |
| src/stamp/modeling/deploy.py | Update deployment to detect feature type and handle patient-level predictions |
| tests/random_data.py | Add helpers for creating patient-level feature files and datasets |
| tests/* | Update existing tests and add new integration/unit tests for patient-level modeling |
Comments suppressed due to low confidence (1)
src/stamp/modeling/deploy.py:156
- [nitpick] In the dict comprehension, using
pdfor the loop variable shadows the common pandas alias. Consider renaming it topatient_dataor similar to improve clarity.
}
| Loads PatientData for patient-level features, matching patients in the clinical table | ||
| to feature files in feature_dir named {patient_id}.h5. | ||
| """ | ||
| # TODO: I'm not proud at all of this. Any other alternative for mapping |
There was a problem hiding this comment.
I dont hate it, only other option is a useless slide-table again. Maybe we could make the slide table optional and take it if its there otherwise use this approach..
There was a problem hiding this comment.
Yeah I thought about it but the idea of adding one more table type, even if I add it as an optional file, will confuse people even more.
move training hyperparams and dataloading stuff into a separate config section.
6201e71 to
9d2090b
Compare
ctranspath is not explicitly declared in chief's paper if it can be used for tile-level feature extraction. cheif-ctranspath is now the default feature extractor so people can run all the pipeline, including encoding, without requesting any model access.
Uh oh!
There was an error while loading. Please reload this page.