This repository contains the code and experiments for the paper "Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution" (AAAI 2025, Alignment Track) by Eiras-Franco et al.
Please note that this repository is under active development!
If you find this work interesting or useful in your research, use the following BibTeX entry to cite us:
```bibtex
@inproceedings{gqe2025,
  title={Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution},
  author={Carlos Eiras-Franco and Anna Hedström and Marina M.-C. Höhne},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  year={2025},
}
```

The Quality Gap Estimator (QGE) helps determine whether a given explanation of a machine learning model's prediction for a particular input is better or worse than alternative explanations according to any quality metric.
### The problem

Given an input, a model prediction, and an explanation for that prediction, you want to evaluate whether the explanation is good, e.g., by measuring its faithfulness with FaithfulnessCorrelation. A single metric value, however, is hard to interpret without knowing how alternative explanations would score.

### The solution

You could sample many alternative explanations to get a sense of the distribution of faithfulness values across explanations. However, this is very costly, since running FaithfulnessCorrelation is computationally expensive. QGE gives you a reliable estimate of how exceptionally faithful your explanation is (or how exceptional it is under any other quality metric you may be using) at a fraction of that cost.
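To make the costly baseline concrete, here is a toy sketch (not the paper's implementation) of the Monte-Carlo approach QGE avoids: score many randomly sampled alternative explanations and see where ours falls in the resulting distribution. The `quality` function below is a cheap stand-in for an expensive metric such as FaithfulnessCorrelation.

```python
import random

def quality(explanation, seed_input):
    """Toy stand-in for an expensive quality metric such as
    FaithfulnessCorrelation: here, just negative squared distance
    to the input, so matching the input scores highest."""
    return -sum((e - x) ** 2 for e, x in zip(explanation, seed_input))

def baseline_percentile(explanation, seed_input, n_samples=1000, rng=None):
    """Monte-Carlo baseline: score many random alternative explanations
    and report the fraction our explanation outperforms. In practice,
    each call to `quality` is expensive, which is what QGE sidesteps."""
    rng = rng or random.Random(0)
    ours = quality(explanation, seed_input)
    scores = [
        quality([rng.uniform(-1, 1) for _ in seed_input], seed_input)
        for _ in range(n_samples)
    ]
    return sum(ours >= s for s in scores) / n_samples

x = [0.2, -0.5, 0.9]
print(baseline_percentile(x, x))  # → 1.0 (this toy metric is maximized by the input itself)
```

With `n_samples` calls to the metric per explanation, the baseline cost grows quickly; QGE replaces this sampling loop with an efficient estimate.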
This project allows you to replicate the experiments in the paper. Note that this repository contains only the experiment code; to use QGE in your own work, please refer to its implementation in the Quantus framework.
Replicating one of the results generally consists of four steps:

1. Get the data & train the model (notebooks in `nbs/0-Model training`)
2. Compute the target measures (`src/1-extract_measures.py`)
3. Compute the results for the paper (`src/2-measure-performance.py`, `src/get_deltas.py` & `src/get_qstds.py`)
4. Generate plots if necessary (`nbs/3-plot.ipynb`)
## Details for each specific result
### Section 4.1.a

- Get model & data: `nbs/0-Model training/0-avila-train.ipynb`
- Compute measures: `src/1-extract_measures.py` (line 90)
- Compute results: `src/2-measure-performance.py` (line 72)
- Figures 2 & 3: `nbs/3-plot.ipynb` (uncomment the corresponding line in the first cell)
### Section 4.1.b

- Get models & data:
  - 20newsgroups: `nbs/0-Model training/0-20newsgroups-train.ipynb`
  - MNIST: `nbs/0-Model training/0-MNIST-train.ipynb`
  - CIFAR: `nbs/0-Model training/0-CIFAR-train.ipynb`
  - Imagenet: not needed
- Compute measures: `src/1-extract_measures.py` (line 94)
- Compute results: `src/2-measure-performance.ipynb` (line 76)
- Aggregate results:
  - Table 1: `src/3-get_deltas.py` (line 8)
### Section 4.1.c

- Get models & data:
  - ood-mean & ood-zeros: `nbs/0-Model training/0-avila-train-ood.ipynb`
  - undertrained & untrained: `nbs/0-Model training/0-avila-train.ipynb`
    - Stop training when test accuracy reaches 70% for undertrained
    - Save weights with no training for untrained
- Compute measures: `src/1-extract_measures.py` (line 105)
- Compute results: `src/2-measure-performance.ipynb` (line 91)
- Aggregate results:
  - Table 2: `src/3-get_deltas.py` (line 30) & `src/3-get_stds.py`
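The undertrained and untrained variants for Section 4.1.c can be produced with a simple training-loop guard. A minimal sketch, assuming hypothetical `train_one_epoch` and `test_accuracy` helpers (not from this repo):

```python
def make_variants(model, train_one_epoch, test_accuracy,
                  target_acc=0.70, max_epochs=100):
    """Return (untrained, undertrained) weight snapshots.

    - untrained: weights saved before any training step
    - undertrained: training stops as soon as test accuracy
      reaches `target_acc` (70%, as in Section 4.1.c)
    """
    untrained = dict(model)  # snapshot before any training
    for _ in range(max_epochs):
        model = train_one_epoch(model)
        if test_accuracy(model) >= target_acc:
            break
    return untrained, dict(model)

# Toy demo: "weights" are a dict, and accuracy grows with an epoch counter.
model = {"epochs": 0}
untrained, undertrained = make_variants(
    model,
    train_one_epoch=lambda m: {"epochs": m["epochs"] + 1},
    test_accuracy=lambda m: min(1.0, 0.1 * m["epochs"]),
)
print(untrained["epochs"], undertrained["epochs"])  # → 0 7
```

With a real PyTorch model the same pattern applies: call `torch.save(model.state_dict(), ...)` before training for the untrained snapshot, and again once the 70% threshold is hit for the undertrained one.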
### Section 4.1.d

- Get models & data: `nbs/0-Model training/0-cmnist-train.ipynb`
- Compute measures:
  - Faithfulness: `src/1-extract_measures_only_quantus.py` (line 127)
  - Localization: `src/1-extract_measures_localization-cmnist.py` (line 122)
- Compute results: `src/2-measure-performance.ipynb` (lines 106 & 115)
- Aggregate results:
  - Table 3: `src/3-get_deltas.py` (line 47) (Pixel-Flipping results taken from Table 1)
  - Table 4: `src/3-get_deltas.py` (line 57)
### Section 4.2

Use `meta_evaluation.py` in the "inverse-estimation" repository.
## Appendix

- A.1 - Figure 4: after obtaining the results for Section 4.1.a, run `nbs/plot-qge-distribution.ipynb`
- A.2, A.4 - Figures 5 & 6: after obtaining the results for Section 4.1.a, run `nbs/3-plot.ipynb` (uncomment the corresponding line in the first cell)
- A.3 - Table 5:
  1. Compute measures: `src/1-extract_measures.py` (line 118)
  2. Compute results: `src/2-measure-performance.ipynb` (line 123)
  3. Aggregate results: `src/3-get_deltas.py` (line 66)
- A.5 - Figures 7 & 8: use `meta_evaluation.py` in the "inverse-estimation" repository.
We hope our repository is beneficial to your work and research. If you have any feedback, questions, or ideas, please feel free to raise an issue in this repository. Alternatively, you can reach out to us directly via email for more in-depth discussions or suggestions.
📧 Contact us:
- Carlos Eiras-Franco: carlos.eiras.franco@udc.es
- Anna Hedström: hedstroem.anna@gmail.com
Thank you for your interest and support!
