- ferret version: 0.4.0
- Python version: 3.9.2
- Operating System: Linux Debian
Description
When comparing the feature attribution scores produced by ferret's Integrated Gradients (plain) explainer with those from the transformers_interpret library (MultiLabelClassificationExplainer), I get significantly different results. For example, a token may receive a high positive score of 0.5 with transformers_interpret but a negative attribution with ferret.
Why could that be?
Of course, I tested both transformers_interpret and ferret under the same conditions (the same pretrained local multi-label BertForSequenceClassification model, the bert-base-german-cased tokenizer, and the same sample).
What I Did
from transformers_interpret import MultiLabelClassificationExplainer
from ferret import Benchmark

# transformers_interpret: attributions for the predicted class
cls_explainer = MultiLabelClassificationExplainer(model, tokenizer, custom_labels=labels)
word_attrib = cls_explainer(<SAMPLE>)
pred = cls_explainer.predicted_class_name
print(word_attrib[pred])

# ferret: Integrated Gradients (plain) explanation for the same sample
bench = Benchmark(model, tokenizer)
score = bench.score(<SAMPLE>)
metr = bench.explain(<SAMPLE>, target=target)[4]  ### IG (plain) ###
print(metr.scores)
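One thing worth checking before comparing raw numbers: as far as I can tell, transformers_interpret normalizes its Layer Integrated Gradients attributions (dividing by the L2 norm of the score vector), and ferret may aggregate embedding-level attributions differently, so the two score vectors are not directly comparable in magnitude. Below is a minimal diagnostic sketch, assuming `word_attrib`, `pred`, and `metr` come from the snippet above; the names `ti_scores` and `ferret_scores` are mine, not part of either library's API:

```python
# Diagnostic sketch, not part of either library's API: put both attribution
# vectors on the same scale so sign/direction differences stand out.
# Assumes word_attrib, pred, metr come from the snippet above, and that the
# two tokenizations line up (if they don't, that alone explains mismatches).
import numpy as np

ti_scores = np.array([score for _, score in word_attrib[pred]])  # (token, score) pairs
ferret_scores = np.array(metr.scores)

# L2-normalize both vectors so only direction/sign differences remain
ti_unit = ti_scores / (np.linalg.norm(ti_scores) or 1.0)
ferret_unit = ferret_scores / (np.linalg.norm(ferret_scores) or 1.0)

if ti_unit.shape == ferret_unit.shape:
    print("cosine similarity:", float(ti_unit @ ferret_unit))
else:
    # Different token counts mean the explainers tokenized differently,
    # so per-token scores cannot be compared position by position.
    print("token counts differ:", ti_unit.shape, ferret_unit.shape)
```

If the token counts differ, or the target class passed to ferret is not the same class as `predicted_class_name`, the per-token scores would not be expected to match in the first place.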