Multimodal explainable AI framework combining CLIP and CNNs to reveal concept-level bias and make deep vision models interpretable.
Official implementation of the paper *Caption-Driven Explainability: Probing CNNs for Bias via CLIP*:
Patrick Koller¹
Amil Dravid² (also published as Amil V. Dravid)
Guido M. Schuster³
Aggelos K. Katsaggelos¹

¹Northwestern University | ²UC Berkeley | ³Eastern Switzerland University of Applied Sciences
🏔️ Presented at IEEE ICIP 2025, Anchorage, Alaska
Deep neural networks have transformed computer vision, achieving remarkable accuracy in recognition, detection, and classification tasks.
However, understanding why a network makes a specific decision remains one of the central challenges in AI.
This repository introduces a multimodal explainable AI (XAI) framework that bridges vision and language using OpenAI's CLIP.
Through a process called network surgery, it reveals the semantic concepts driving model predictions and exposes hidden biases within learned representations.
💡 Unlike pixel-based saliency methods, our approach:
- Explains what concept drives a prediction, not just where the model looked
- Identifies spurious correlations such as color or texture bias
- Provides quantitative insight into robustness and covariate shift

Conceptual overview: bridging CLIP and a standalone model to uncover the semantics behind decisions.
This repository contains:
- ✅ Full inference pipeline for caption-driven XAI
- ✅ CLIP-based probing utilities
- ✅ Network surgery implementation
- ✅ Bias visualization assets
- ✅ Example datasets & scripts
We integrate a standalone model to be explained (for example, a ResNet-50) into CLIP by aligning their activation maps.
CLIP’s text encoder then serves as a semantic probe, describing what the model has truly learned.
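To ground this, the snippet below exposes intermediate activation maps of both networks with PyTorch forward hooks. It is a minimal sketch, not the repository's pipeline: the layer pair (`layer3` on both sides), the input file name, and the shared preprocessing are illustrative assumptions.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standalone model to be explained, plus CLIP's ResNet-50 visual encoder.
standalone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).to(device).eval()
clip_model, preprocess = clip.load("RN50", device=device)

activations = {}

def cache(name):
    """Return a forward hook that stores the layer's output under `name`."""
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Illustrative layer pair; at 224x224 input both stages emit (N, 1024, 14, 14).
standalone.layer3.register_forward_hook(cache("standalone.layer3"))
clip_model.visual.layer3.register_forward_hook(cache("clip.layer3"))

# One shared preprocessing pipeline for brevity; in practice each model
# should see its own normalization.
image = preprocess(Image.open("example.png")).unsqueeze(0).to(device)
with torch.no_grad():
    standalone(image)
    clip_model.encode_image(image)

print({name: act.shape for name, act in activations.items()})
```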
- Network surgery – Swap correlated activation maps between the standalone model and CLIP
- Activation matching – Compute cross-layer correlations to identify equivalent feature spaces (both steps are sketched below)
- Caption-based inference – Use natural-language captions (e.g. “red digit”, “green digit”, “round shape”) to interpret dominant concepts
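One plausible reading of the first two steps in code, continuing the sketch above: the correlation measure and the greedy channel pairing are our illustrative choices, assuming both layers emit maps of identical shape; the repository may pair channels differently.

```python
import torch

def channel_correlation(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Pearson correlation between per-channel activation maps.
    a: (N, Ca, H, W), b: (N, Cb, H, W) -> (Ca, Cb)."""
    a = a.transpose(0, 1).reshape(a.shape[1], -1).float()  # (Ca, N*H*W)
    b = b.transpose(0, 1).reshape(b.shape[1], -1).float()  # (Cb, N*H*W)
    a = (a - a.mean(1, keepdim=True)) / (a.std(1, keepdim=True) + 1e-8)
    b = (b - b.mean(1, keepdim=True)) / (b.std(1, keepdim=True) + 1e-8)
    return (a @ b.T) / a.shape[1]

# Activation matching: for each CLIP channel, find the most correlated
# standalone channel (ideally estimated over a large probe batch).
corr = channel_correlation(activations["standalone.layer3"],
                           activations["clip.layer3"])
match = corr.argmax(dim=0)  # shape (C_clip,)

# Network surgery: overwrite CLIP's activation maps with the matched
# standalone maps, so CLIP's text side now "reads out" the standalone
# model's features. The standalone model must be run on the same batch
# first, so that its maps are cached.
def surgery(module, inputs, output):
    swapped = activations["standalone.layer3"][:, match]
    return swapped.to(output.dtype)

clip_model.visual.layer3.register_forward_hook(surgery)
```

Returning a tensor from a PyTorch forward hook replaces the module's output, which is what turns the matched channels into an actual surgery on CLIP's forward pass.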

Activation matching aligns internal feature spaces for interpretable concept fusion.
Both Grad-CAM and Caption-Driven XAI offer valuable insights, but they answer different questions.
| Method | Explains | Handles overlapping features | Quantitative concept analysis | Human-readable output |
|---|---|---|---|---|
| Grad-CAM | Spatial importance (where) | ❌ | ❌ | ❌ |
| Caption-Driven XAI | Conceptual semantics (what) | ✅ | ✅ | ✅ |
Grad-CAM highlights the region of attention, while Caption-Driven XAI uncovers the reason, bridging visual focus with linguistic meaning.
Quantitative concept analysis refers to measuring how strongly each linguistic concept (e.g. “red”, “round”) influences a model’s prediction, based on similarity in CLIP’s multimodal embedding space.
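Continuing the same sketch, such concept scores can be read off as CLIP similarities between the post-surgery image embedding and a set of candidate captions; the captions and the softmax scaling below are illustrative, not the paper's exact protocol.

```python
# Candidate concepts phrased as natural-language captions.
captions = ["a red digit", "a green digit", "a round shape"]
text = clip.tokenize(captions).to(device)

with torch.no_grad():
    # The surgery hook is active, so this embedding reflects the
    # standalone model's features; its layer3 maps for `image` were
    # cached by the earlier forward pass.
    image_features = clip_model.encode_image(image)
    text_features = clip_model.encode_text(text)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    scores = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for caption, score in zip(captions, scores[0].tolist()):
    print(f"{caption}: {score:.3f}")
```

A color-biased model would, for instance, concentrate most of its probability mass on “a red digit” even as the digit's shape varies.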
If you use this repository, please cite:
```bibtex
@inproceedings{koller2025captionxai,
  title={Caption-Driven Explainability: Probing CNNs for Bias via CLIP},
  author={Koller, Patrick and Dravid, Amil V. and Schuster, Guido M. and Katsaggelos, Aggelos K.},
  booktitle={IEEE International Conference on Image Processing (ICIP) – Satellite Workshop on Generative AI for World Simulations and Communications},
  year={2025},
  note={Preprint available at arXiv:2510.22035}
}
```

- 📄 arXiv preprint: https://arxiv.org/abs/2510.22035
- 🧪 Zenodo archive (v1.0.0): https://doi.org/10.5281/zenodo.17546054
- 👤 Personal website: https://patch0816.github.io
- 🎓 Google Scholar: https://scholar.google.com/citations?user=jMiy9HQAAAAJ&hl=en
This research was conducted at the AIM-IVPL Lab (Northwestern University),
in collaboration with UC Berkeley and OST/ICAI Switzerland.