Description
Hello,
Thank you for sharing this great work. I have a few questions about experiment 5.2; I'm not clear on what this experiment is trying to achieve. As far as I understand, you train a base model that is naturally interpretable (logistic regression or a decision tree) to compare against explainers such as LIME and Parzen. For each instance, you take at most 10 features from this base model as "gold features" and check how many of them are recovered by each explainer.

If my understanding is correct, I have this question: if the dataset is complex enough that logistic regression or a decision tree performs poorly on it, are the selected gold features still a reliable standard for evaluating the explainers?
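
To make my understanding concrete, here is a minimal sketch of how I picture the recovery metric. This is my assumption of the setup, not the paper's actual code: `gold_features` takes the top-k features by absolute coefficient, and the explainer output below is a dummy placeholder standing in for what LIME or Parzen would return for an instance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def gold_features(model, k=10):
    """Top-k features of the interpretable base model by |coefficient|."""
    return set(np.argsort(np.abs(model.coef_[0]))[-k:])

def gold_recall(explainer_features, gold):
    """Fraction of the gold features that the explainer recovered."""
    return len(set(explainer_features) & gold) / len(gold)

# Synthetic data standing in for the paper's datasets.
X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                           random_state=0)
base = LogisticRegression(max_iter=1000).fit(X, y)
gold = gold_features(base, k=10)

# Placeholder for the top features an explainer returns for one instance;
# here it recovers 6 of the 10 gold features.
explainer_output = list(gold)[:6] + [40, 41, 42, 43]
print(f"recovery: {gold_recall(explainer_output, gold):.0%}")
```

My concern is about the first step: if `base` itself fits the data poorly, do its top coefficients still deserve to be treated as ground truth?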