update docs #83
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -151,6 +151,11 @@ meaning ignored that it was ignored during feature extraction. | |
| [mstar]: https://huggingface.co/Wangyh/mSTAR | ||
| [musk]: https://huggingface.co/xiangjx/musk | ||
| [plip]: https://github.com/PathologyFoundation/plip | ||
| [TITAN]: https://huggingface.co/MahmoodLab/TITAN | ||
| [COBRA2]: https://huggingface.co/KatherLab/COBRA | ||
| [EAGLE]: https://github.com/KatherLab/EAGLE | ||
| [MADELEINE]: https://huggingface.co/MahmoodLab/madeleine | ||
|
|
||
|
|
||
|
|
||
| ## Doing Cross-Validation on the Data Set | ||
|
|
@@ -252,4 +257,114 @@ Afterwards, the `output_dir` should contain the following files: | |
| as well as the 95% confidence interval for the statistical scores | ||
| for the splits. | ||
| - `roc-curve_isMSIH=yes.svg` and `pr-curve_isMSIH=yes.svg` | ||
| contain the ROC and precision recall curves of the splits. | ||
| contain the ROC and precision recall curves of the splits. | ||
|
|
||
| ## Slide-Level Encoding | ||
| Tile-level features can be encoded into a single feature per slide. This is useful | ||
| when trying to capture global patterns across whole slides. | ||
|
|
||
| STAMP currently supports the following encoders: | ||
| - [CHIEF][CHIEF_CTRANSPATH] | ||
| - [TITAN] | ||
| - [GIGAPATH] | ||
| - [COBRA2] | ||
| - [EAGLE] | ||
| - [MADELEINE] | ||
|
|
||
| Slide encoders take as input the tile-level features already extracted in the | ||
| preprocessing step. Each encoder accepts only certain extractors, and most | ||
| work only on CUDA devices: | ||
|
|
||
| | Encoder | Required Extractor | Compatible Devices | | ||
| |--|--|--| | ||
| | CHIEF | CTRANSPATH, CHIEF-CTRANSPATH | CUDA only | | ||
| | TITAN | CONCH1.5 | CUDA, CPU, MPS | | ||
| | GIGAPATH | GIGAPATH | CUDA only | | ||
| | COBRA2 | CONCH, UNI, VIRCHOW2 or H-OPTIMUS-0 | CUDA only | | ||
| | EAGLE | CTRANSPATH, CHIEF-CTRANSPATH | CUDA only | | ||
| | MADELEINE | CONCH | CUDA only | | ||
|
|
||
|
Contributor
crazy, I didn't know so many are cuda only? I thought only mamba models were affected Oo
Contributor
Author
there are some like CHIEF that I think should be easy to make work on non-CUDA devices, but as soon as I read in their GitHub installation guides that they were tested on CUDA, and they did not work in the pipeline, I just left it like that. Maybe we should open an issue to debug each model on a Mac and on Linux without CUDA to see if we can make them compatible. |
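The compatibility table above can be turned into a quick pre-flight check. The following is only a sketch: the dictionary layout and the function name `check_compatibility` are illustrative and not part of STAMP, the encoder keys follow the option names listed in the config comments, and the extractor names are simply the table's names lowercased, which may not match the identifiers STAMP uses internally.

```python
# Compatibility table transcribed from the documentation above.
# Hypothetical helper: names and layout are illustrative, not STAMP's API.
ENCODER_REQUIREMENTS = {
    # encoder: (accepted extractors, accepted device types)
    "chief": ({"ctranspath", "chief-ctranspath"}, {"cuda"}),
    "titan": ({"conch1.5"}, {"cuda", "cpu", "mps"}),
    "gigapath": ({"gigapath"}, {"cuda"}),
    "cobra": ({"conch", "uni", "virchow2", "h-optimus-0"}, {"cuda"}),
    "eagle": ({"ctranspath", "chief-ctranspath"}, {"cuda"}),
    "madeleine": ({"conch"}, {"cuda"}),
}


def check_compatibility(encoder: str, extractor: str, device: str) -> list[str]:
    """Return a list of problems; the list is empty if the combination is listed."""
    if encoder not in ENCODER_REQUIREMENTS:
        return [f"unknown encoder: {encoder!r}"]

    extractors, devices = ENCODER_REQUIREMENTS[encoder]
    problems = []
    if extractor.lower() not in extractors:
        problems.append(
            f"{encoder} expects features from one of {sorted(extractors)}, "
            f"got {extractor!r}"
        )
    # Treat "cuda:0", "cuda:1", ... as the device type "cuda".
    device_type = device.split(":", 1)[0].lower()
    if device_type not in devices:
        problems.append(
            f"{encoder} is listed for {sorted(devices)} only, got {device!r}"
        )
    return problems


if __name__ == "__main__":
    print(check_compatibility("chief", "ctranspath", "cuda:0"))
    print(check_compatibility("madeleine", "uni", "cpu"))
```

Returning a list of messages rather than raising on the first mismatch lets a caller report a wrong extractor and a wrong device in one pass.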
||
|
|
||
| As with feature extractors, most of these models require you to request | ||
| access. The following example uses CHIEF, which is available if you installed | ||
| STAMP with `uv sync --all-extras`. The configuration should look like this: | ||
|
|
||
| ```yaml | ||
| # stamp-test-experiment/config.yaml | ||
|
|
||
| slide_encoding: | ||
| # Encoder to use for slide encoding. Possible options are "cobra", | ||
| # "eagle", "titan", "gigapath", "chief", "prism", "madeleine". | ||
| encoder: "chief" | ||
|
|
||
| # Directory to save the output files. | ||
| output_dir: "/path/to/save/files/to" | ||
|
|
||
| # Directory where the extracted features are stored. | ||
| feat_dir: "/path/your/extracted/features/are/stored/in" | ||
|
|
||
| # Device to run slide encoding on ("cpu", "cuda", "cuda:0", etc.) | ||
| device: "cuda" | ||
|
|
||
| # Optional settings: | ||
| # Directory where the aggregated features are stored. Needed for | ||
| # some encoders such as eagle (it requires virchow2 features). | ||
| #agg_feat_dir: "/path/your/aggregated/features/are/stored/in" | ||
|
|
||
| # Add a hash of the entire preprocessing codebase in the feature folder name. | ||
| #generate_hash: True | ||
| ``` | ||
|
|
||
| Make sure `feat_dir` points to a directory containing, in this case, `ctranspath` or | ||
| `chief-ctranspath` tile-level features. Once everything is set, you can simply run: | ||
|
|
||
| ```sh | ||
| stamp --config stamp-test-experiment/config.yaml encode_slides | ||
| ``` | ||
| The output will be one `.h5` file per slide. | ||
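Since the output should contain one `.h5` file per slide, a small script can flag slides that were not encoded. This is a minimal sketch under two assumptions that the text above does not confirm: the tile-level features are `.h5` files sitting directly in `feat_dir`, and each output file keeps the file name of its input slide. The function name `slides_missing_output` is hypothetical.

```python
from pathlib import Path


def slides_missing_output(feat_dir: str, output_dir: str) -> list[str]:
    """List slides in feat_dir that have no same-named .h5 file under output_dir.

    Assumption: output files reuse the stem of the input feature file.
    """
    # Search recursively, in case the encoder writes into a subfolder.
    encoded = {path.stem for path in Path(output_dir).rglob("*.h5")}
    return sorted(
        path.stem
        for path in Path(feat_dir).glob("*.h5")
        if path.stem not in encoded
    )


if __name__ == "__main__":
    missing = slides_missing_output(
        "/path/your/extracted/features/are/stored/in",
        "/path/to/save/files/to",
    )
    print(f"{len(missing)} slide(s) without an encoded output: {missing}")
```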
|
|
||
| ## Patient-Level Encoding | ||
| The available encoders are designed for slide-level use, so this option | ||
| concatenates the slides of a patient along the x-axis, creating a single | ||
| "virtual" slide that contains one block of tissue per slide (two blocks for a patient with two slides). The configuration is the same, | ||
| except that `slide_table` is required to link slides with patients. | ||
| ```yaml | ||
| # stamp-test-experiment/config.yaml | ||
|
|
||
| patient_encoding: | ||
| # Encoder to use for patient encoding. Possible options are "cobra", | ||
| # "eagle", "titan", "gigapath", "chief", "prism", "madeleine". | ||
| encoder: "eagle" | ||
|
|
||
| # Directory to save the output files. | ||
| output_dir: "/path/to/save/files/to" | ||
|
|
||
| # Directory where the extracted features are stored. | ||
| feat_dir: "/path/your/extracted/features/are/stored/in" | ||
|
|
||
| # A table (.xlsx or .csv) relating every slide to their feature files. | ||
| # The table must contain at least two columns, one titled "SLIDE", | ||
| # containing the slide ID, and one called "FILENAME", containing the feature file path relative to `feat_dir`. | ||
| slide_table: "/path/of/slide.csv" | ||
|
|
||
| # Device to run patient encoding on ("cpu", "cuda", "cuda:0", etc.) | ||
| device: "cuda" | ||
|
|
||
| # Optional settings: | ||
| patient_label: "PATIENT" | ||
| filename_label: "FILENAME" | ||
|
|
||
| # Directory where the aggregated features are stored. Needed for | ||
| # some encoders such as eagle (it requires virchow2 features). | ||
| #agg_feat_dir: "/path/your/aggregated/features/are/stored/in" | ||
|
|
||
| # Add a hash of the entire preprocessing codebase in the feature folder name. | ||
| #generate_hash: True | ||
| ``` | ||
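Before running, it can help to confirm that the slide table really links each patient to its feature files. The sketch below handles only the `.csv` case and uses the column names from the optional `patient_label` and `filename_label` settings shown in the config; the function name `group_files_by_patient` is hypothetical and this is not how STAMP itself reads the table.

```python
import csv
import io
from collections import defaultdict


def group_files_by_patient(
    table_file,
    patient_label: str = "PATIENT",
    filename_label: str = "FILENAME",
) -> dict[str, list[str]]:
    """Map each patient ID to the feature files listed for it in a CSV table."""
    reader = csv.DictReader(table_file)
    missing = {patient_label, filename_label} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"slide table lacks columns: {sorted(missing)}")

    files = defaultdict(list)
    for row in reader:
        files[row[patient_label]].append(row[filename_label])
    return dict(files)


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    example = io.StringIO(
        "PATIENT,FILENAME\n"
        "patient-1,slide-a.h5\n"
        "patient-1,slide-b.h5\n"
        "patient-2,slide-c.h5\n"
    )
    for patient, filenames in group_files_by_patient(example).items():
        print(patient, filenames)
```

For a real table, pass an open file handle instead, for example `open("/path/of/slide.csv", newline="")`.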
|
|
||
| Then run: | ||
| ```sh | ||
| stamp --config stamp-test-experiment/config.yaml encode_patients | ||
| ``` | ||
|
|
||
| Each output `.h5` file is named after the patient's ID. | ||
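To make the idea of a "virtual" slide concrete, here is a rough sketch of concatenating tile coordinates along the x-axis. It is not STAMP's implementation: the function name `concatenate_along_x`, the `tile_size` parameter, and the choice to place each slide directly to the right of the previous one are all assumptions made for illustration.

```python
def concatenate_along_x(slides, tile_size):
    """Merge per-slide tile coordinates into one virtual slide.

    slides: one list of (x, y) tile coordinates per slide.
    tile_size: tile width in the same units as the coordinates.
    """
    merged = []
    offset = 0
    for coords in slides:
        if not coords:
            continue
        # Move the slide so that its leftmost tile starts at the current offset.
        min_x = min(x for x, _ in coords)
        shifted = [(x - min_x + offset, y) for x, y in coords]
        merged.extend(shifted)
        # The next slide starts right after this slide's rightmost tile.
        offset = max(x for x, _ in shifted) + tile_size
    return merged


if __name__ == "__main__":
    slide_a = [(0, 0), (224, 0)]
    slide_b = [(0, 0), (0, 224)]
    print(concatenate_along_x([slide_a, slide_b], tile_size=224))
    # [(0, 0), (224, 0), (448, 0), (448, 224)]
```

The tile features themselves would be stacked in the same order as the merged coordinates, so that each feature row still corresponds to its shifted position.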