add patient level features modelling support #89

Merged
EzicStar merged 19 commits into main from dev/encode-model on Jul 22, 2025
@EzicStar EzicStar commented Jul 14, 2025

@EzicStar EzicStar self-assigned this Jul 14, 2025

Copilot AI left a comment

Pull Request Overview

This PR adds full support for patient-level feature modeling by automatically detecting feature types, providing new patient-level data loaders and dataset classes, and introducing an MLP-based Lightning module for training and deployment.

  • Detect feature type (tile or patient) and branch training, deployment, and cross-validation pipelines accordingly
  • Introduce PatientFeatureDataset, patient_feature_dataloader, and load_patient_level_data in the data module
  • Add LitMLPClassifier, update train_categorical_model_, deploy_categorical_model_, and cross-validation to use patient-level logic and tests
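
The first bullet above, detecting the feature type once and branching on it, can be sketched as follows. This is only an illustration under stated assumptions: `FeatureType`, `infer_feature_type`, `choose_pipeline`, and the idea that each feature file declares its type as a string are hypothetical and not taken from the PR; the real `detect_feature_type` may work differently.

```python
from enum import Enum


class FeatureType(Enum):
    # Hypothetical enum; the PR only states that features are "tile" or "patient".
    TILE = "tile"
    PATIENT = "patient"


def infer_feature_type(declared_types: dict[str, str]) -> FeatureType:
    """Return the single feature type shared by all feature files.

    `declared_types` maps a feature file name to the type it declares.
    This input format is an assumption made for the sketch.
    """
    found = {FeatureType(t) for t in declared_types.values()}
    if len(found) != 1:
        # Covers both an empty directory and a mix of tile/patient files.
        names = sorted(t.value for t in found)
        raise ValueError(f"expected exactly one feature type, found {names}")
    return found.pop()


def choose_pipeline(feature_type: FeatureType) -> str:
    # Decide once, up front, so training, deployment, and cross-validation
    # can all follow the same branch. The returned labels are placeholders.
    if feature_type is FeatureType.PATIENT:
        return "mlp"
    return "tile-level model"


pipeline = choose_pipeline(
    infer_feature_type({"a.h5": "patient", "b.h5": "patient"})
)
```

Failing loudly on mixed feature types is a design choice of this sketch; it keeps a directory that accidentally contains both kinds of files from silently training the wrong model.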

Reviewed Changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/stamp/modeling/data.py | Add detect_feature_type, PatientFeatureDataset, patient_feature_dataloader, and patient-level data loading logic |
| src/stamp/modeling/train.py | Branch training on feature type, wire up patient-level loader and MLP classifier |
| src/stamp/modeling/mlp_classifier.py | New MLPClassifier and LitMLPClassifier for patient-level features |
| src/stamp/modeling/deploy.py | Update deployment to detect feature type and handle patient-level predictions |
| tests/random_data.py | Add helpers for creating patient-level feature files and datasets |
| tests/* | Update existing tests and add new integration/unit tests for patient-level modeling |

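
To make the role of an MLP on patient-level features concrete, here is a dependency-free sketch of a forward pass. It assumes each patient is represented by one fixed-length feature vector; the layer sizes, helper names, and random initialisation are hypothetical, and this is not the `MLPClassifier` added in the PR, which is a Lightning module.

```python
import random


def relu(values: list[float]) -> list[float]:
    return [max(0.0, v) for v in values]


def linear(x: list[float], weights: list[list[float]], bias: list[float]) -> list[float]:
    # `weights` holds one row per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(n_in: int, n_out: int, rng: random.Random):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp_forward(x: list[float], layers) -> list[float]:
    # ReLU after every layer except the last, which yields raw class logits.
    for i, (weights, bias) in enumerate(layers):
        x = linear(x, weights, bias)
        if i < len(layers) - 1:
            x = relu(x)
    return x


rng = random.Random(0)
dim_input, dim_hidden, n_classes = 8, 4, 2  # hypothetical sizes
layers = [
    init_layer(dim_input, dim_hidden, rng),
    init_layer(dim_hidden, n_classes, rng),
]
patient_vector = [rng.random() for _ in range(dim_input)]  # one vector per patient
logits = mlp_forward(patient_vector, layers)
```

Because there is a single vector per patient, no aggregation over tiles is needed before classification, which is why a plain feed-forward network is a natural fit here.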
Comments suppressed due to low confidence (1)

src/stamp/modeling/deploy.py:156

  • [nitpick] In the dict comprehension, using pd for the loop variable shadows the common pandas alias. Consider renaming it to patient_data or similar to improve clarity.
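
The rename suggested in this nitpick could look like the sketch below. The `PatientData` stand-in, its `ground_truth` field, and the `patient_to_data` mapping are hypothetical; the actual comprehension in deploy.py is not shown on this page.

```python
from dataclasses import dataclass


@dataclass
class PatientData:
    # Hypothetical stand-in with a single illustrative field.
    ground_truth: str


patient_to_data = {"p1": PatientData("A"), "p2": PatientData("B")}

# As flagged: inside the comprehension, `pd` is the loop variable,
# not the pandas alias a reader would expect.
# labels = {pid: pd.ground_truth for pid, pd in patient_to_data.items()}

# Suggested: a descriptive loop variable.
labels = {
    patient_id: patient_data.ground_truth
    for patient_id, patient_data in patient_to_data.items()
}
```

In Python 3 a comprehension's loop variable is local to the comprehension, so the original form does not overwrite a module-level `pd`; the rename is about readability rather than a runtime bug.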

Comment thread src/stamp/modeling/mlp_classifier.py
Comment thread tests/random_data.py
Comment thread getting-started.md
Comment thread src/stamp/encoding/encoder/chief.py Outdated
Comment thread src/stamp/modeling/data.py
Loads PatientData for patient-level features, matching patients in the clinical table
to feature files in feature_dir named {patient_id}.h5.
"""
# TODO: I'm not proud at all of this. Any other alternative for mapping

I don't hate it; the only other option is a useless slide table again. Maybe we could make the slide table optional and use it if it's there, otherwise fall back to this approach.

Yeah, I thought about it, but adding one more table type, even as an optional file, would confuse people even more.
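
For readers following this thread, the filename-based mapping and the optional-table fallback that was floated can be sketched together. Everything here is an assumption made for illustration: `match_patient_features`, its `slide_table` argument, and the return shape are hypothetical and are not the PR's `load_patient_level_data`; only the `{patient_id}.h5` naming comes from the docstring quoted above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def match_patient_features(
    patient_ids: list[str],
    feature_dir: Path,
    slide_table: Optional[dict[str, str]] = None,
) -> tuple[dict[str, Path], list[str]]:
    """Map each patient id to its feature file.

    By default the file is expected at feature_dir / "{patient_id}.h5".
    If an optional patient -> filename table is given (the alternative
    raised in review), its entry takes precedence for that patient.
    """
    matched: dict[str, Path] = {}
    missing: list[str] = []
    for patient_id in patient_ids:
        if slide_table and patient_id in slide_table:
            filename = slide_table[patient_id]
        else:
            filename = f"{patient_id}.h5"
        path = feature_dir / filename
        if path.is_file():
            matched[patient_id] = path
        else:
            missing.append(patient_id)  # report instead of failing silently
    return matched, missing


# Usage: one patient has a feature file, the other does not.
with tempfile.TemporaryDirectory() as tmp:
    feature_dir = Path(tmp)
    (feature_dir / "patient_a.h5").touch()
    matched, missing = match_patient_features(["patient_a", "patient_b"], feature_dir)
    matched_ids = sorted(matched)
```

Returning the unmatched patients separately lets the caller decide whether to warn or abort, without adding a mandatory extra table.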

Comment thread src/stamp/modeling/data.py
Comment thread src/stamp/modeling/train.py Outdated
Comment thread src/stamp/modeling/train.py Outdated
EzicStar added 3 commits July 16, 2025 16:28
move training hyperparams and dataloading stuff into a separate config section.
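
One way to picture a separate config section for hyperparameters and data-loading settings is a nested dataclass, sketched below. All names and defaults (`AdvancedSection`, `TrainingSection`, `batch_size`, `max_epochs`, `num_workers`, `output_dir`, `parse_training`) are hypothetical and not taken from the PR's config schema.

```python
from dataclasses import dataclass, field


@dataclass
class AdvancedSection:
    # Hypothetical hyperparameters and data-loading knobs with defaults.
    batch_size: int = 64
    max_epochs: int = 32
    num_workers: int = 4


@dataclass
class TrainingSection:
    output_dir: str
    # Kept in its own section so everyday settings stay uncluttered.
    advanced: AdvancedSection = field(default_factory=AdvancedSection)


def parse_training(raw: dict) -> TrainingSection:
    # Unknown keys in "advanced" raise TypeError, which surfaces typos early.
    advanced = AdvancedSection(**raw.get("advanced", {}))
    return TrainingSection(output_dir=raw["output_dir"], advanced=advanced)


config = parse_training({"output_dir": "out", "advanced": {"batch_size": 16}})
```

With this layout, a user who never touches the advanced section still gets working defaults, while overrides stay grouped in one place.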
EzicStar added 4 commits July 21, 2025 15:12
The CHIEF paper does not explicitly state whether ctranspath can be used for tile-level feature extraction.
chief-ctranspath is now the default feature extractor, so people can run the whole pipeline, including encoding, without requesting access to any model.
@EzicStar EzicStar requested a review from s1787956 July 21, 2025 15:46
@EzicStar EzicStar merged commit 8901746 into main Jul 22, 2025
28 checks passed
@EzicStar EzicStar linked an issue Jul 22, 2025 that may be closed by this pull request
@EzicStar EzicStar deleted the dev/encode-model branch August 20, 2025 11:04

Development

Successfully merging this pull request may close these issues.

Modeling hyperparameters config file encoding __init__.py unable to parse excel slide table

3 participants