License: This repository is licensed under the MIT License.
SlideFlame-Vanilla is a Flamingo-inspired [4] vision-language model tailored for digital pathology. It integrates a pretrained language model (BioGPT-Large) with visual context from whole-slide image (WSI) features using gated cross-attention layers.
We implement a vision-language architecture inspired by recent models such as PRISM [1] and HistoGPT [2]. A pretrained language model (BioGPT) [5] is augmented with cross-attention layers to receive context from WSI-derived image features.
Rather than using raw image pixels, we extract patch-level features using the CONCHv1.5 [3] encoder. These are processed in a multiple instance learning (MIL) setup before being passed to the language model.
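The README does not spell out which MIL aggregator is used; one common choice for pooling patch-level features into a slide-level embedding is attention-based MIL pooling (the function and weight names below are illustrative, not the repo's actual code):

```python
import numpy as np

def attention_mil_pool(patch_feats, w, v):
    """Attention-based MIL pooling over patch features.

    This is a generic sketch of one common aggregator; the repo's
    actual MIL setup may differ.
    patch_feats: (n_patches, d) array of encoder features.
    """
    # Per-patch attention scores: a_i = w^T tanh(V h_i)
    scores = np.tanh(patch_feats @ v.T) @ w       # (n_patches,)
    # Softmax over patches (numerically stabilized)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Slide-level embedding: attention-weighted sum of patch features
    return weights @ patch_feats                  # (d,)

rng = np.random.default_rng(0)
d, hidden = 8, 4
feats = rng.normal(size=(16, d))   # e.g. 16 patch features from the encoder
v = rng.normal(size=(hidden, d))
w = rng.normal(size=hidden)
slide_emb = attention_mil_pool(feats, w, v)
```

The slide-level embedding produced this way is what the language model then attends to through the cross-attention layers.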
- Learnable gates: We retain the gated cross-attention modules (i.e., `attn_gate`, `ff_gate`) from Flamingo. Unlike the original Flamingo implementation, we initialize `attn_gate` to 0.55, allowing partial vision-language interaction at the start of training.
- Custom parameter grouping: Gated parameters are trained with a separate learning rate (`gate_lr`) using a custom optimizer grouping strategy.
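The two mechanisms above can be sketched as follows. The gating math (Flamingo-style `tanh` gates on the cross-attention residual) and the name-based parameter grouping are illustrative assumptions about how `attn_gate` and `gate_lr` interact, not the repo's exact implementation:

```python
import numpy as np

# Flamingo-style gated residual: y = x + tanh(attn_gate) * cross_attn_out.
# Flamingo initializes its gates to 0 (vision contributes nothing at first);
# here attn_gate starts at 0.55, so some vision signal flows immediately.
attn_gate = 0.55

def gated_residual(x, cross_attn_out, gate):
    return x + np.tanh(gate) * cross_attn_out

# Custom parameter grouping (illustrative): parameters whose names contain
# "gate" get their own learning rate gate_lr; all others use the base rate.
def build_param_groups(named_params, base_lr, gate_lr):
    gates  = [p for name, p in named_params if "gate" in name]
    others = [p for name, p in named_params if "gate" not in name]
    return [{"params": gates,  "lr": gate_lr},
            {"params": others, "lr": base_lr}]

named_params = [("layer0.attn_gate", attn_gate),
                ("layer0.ff_gate", 0.0),
                ("lm.weight", np.zeros(3))]
groups = build_param_groups(named_params, base_lr=1e-4, gate_lr=1e-3)
```

A list of such groups is exactly the shape that optimizers like `torch.optim.AdamW` accept, which is how a separate `gate_lr` can be applied without touching the rest of the model's schedule.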
```bash
git clone https://github.com/KatherLab/slideFlame_Vanilla.git
cd slideFlame_Vanilla
pip install .
```