
[QUESTION] Why is output from the dialect id system different from the ADIDA online interface? #141

@fadhleryani

Description

camel_tools 1.5.2 on macOS 14.1.1

Using one of the preloaded example sentences in the ADIDA interface, for instance:
"بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم"
I get a score of 95.9% for Beirut.
When I predict the same sentence using camel_tools, I get a different result. For example, using model26, which I assume is the same model ADIDA uses:

```python
from camel_tools.dialectid import DIDModel26

did = DIDModel26.pretrained()
did.predict(['بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم'])
```

I get the following scores:

```
[DIDPred(top='ALE', scores={'ALE': 0.2744463749182225, 'ALG': 0.0019964477414507772, 'ALX': 0.0017124356871910278, 'AMM': 0.04793813798943018, ...
```

Similarly, using model6, I also get different and lower scores than the online interface (though at least the top dialect is correct):

```python
from camel_tools.dialectid import DIDModel6

did = DIDModel6.pretrained()
did.predict(['بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم'])
```

I get the following scores:

```
[DIDPred(top='BEI', scores={'BEI': 0.5475092868164938, 'CAI': 0.05423997031019218, 'DOH': 0.018378809169102468, 'MSA': 0.003793013408907513, 'RAB': 0.0018751946461352397, 'TUN': 0.37420372564916876})]
```
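For easier comparison with ADIDA's percentages, here is a small pure-Python sketch (no camel_tools dependency; the `scores` dict is copied from the model6 output above) that ranks a `DIDPred`-style scores dict and prints each dialect as a percentage:

```python
# Rank a DIDPred-style scores dict (values copied from the model6 run above)
# so the full distribution can be compared against ADIDA's reported percentage.
scores = {
    'BEI': 0.5475092868164938,
    'CAI': 0.05423997031019218,
    'DOH': 0.018378809169102468,
    'MSA': 0.003793013408907513,
    'RAB': 0.0018751946461352397,
    'TUN': 0.37420372564916876,
}

# Sort dialects by descending probability.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for label, p in ranked:
    print(f'{label}: {p:.1%}')
```

Note that the six probabilities sum to ~1.0, so model6 is spreading mass between BEI (54.8%) and TUN (37.4%) rather than concentrating 95.9% on Beirut as the online interface does.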
