Open
45 commits
ab2012d
bug fix for new annotation model
davivcu May 12, 2021
8fbcb83
Merge branch 'master' of https://github.com/davivcu/matilda-dev
davivcu May 12, 2021
9cd2afa
Update login_components.js
davivcu May 13, 2021
01fcd3a
Update conf.json
davivcu May 14, 2021
0c601ed
Delete matilda.conf
davivcu May 14, 2021
b5ed147
Update docker-compose_nginx.yml
davivcu May 15, 2021
ef4b057
Update database.py
davivcu May 15, 2021
7864b56
Merge branch 'master' of https://github.com/davivcu/matilda-dev
davivcu May 15, 2021
ae73e0e
docker revision
davivcu May 15, 2021
e7c1da1
minor fixes
davivcu May 15, 2021
2122557
Update database.py
davivcu May 18, 2021
e092c5f
Update main.css
davivcu May 19, 2021
6c42bd3
Update README.md
davivcu May 19, 2021
fd95869
Create changelog.md
davivcu May 20, 2021
761155a
Update changelog.md
davivcu May 20, 2021
69a2175
exception: annotation model removed from matilda
davivcu May 21, 2021
d4bcc8e
Update all_dialogues.css
davivcu May 24, 2021
b9667ba
Update annotation_app.css
davivcu May 24, 2021
abe12be
Update annotation_app.css
davivcu May 24, 2021
5f520d8
Update languages.js
davivcu May 24, 2021
104540e
Update unipi_model_v2.json
davivcu May 24, 2021
04f5d59
Delete unipi_v2.json
davivcu May 24, 2021
8fadbbc
Update supervision_components.js
davivcu May 25, 2021
5282e08
temporary fix
davivcu May 25, 2021
f33653b
Update matilda_app.py
davivcu May 25, 2021
b715a5e
fix for annotation modal check
davivcu May 25, 2021
0a805e8
bugfix for an empty database update request
davivcu May 25, 2021
51b3b79
datamanagement: shows only the number if more than 3 assigned users
May 26, 2021
8896d98
datamanagement: shows only the number if more than 3 assigned users
davivcu May 31, 2021
e6f630e
Update configuration_components.js
davivcu Jun 10, 2021
482e564
supervision view: editing turn content for all documents
davivcu Jul 3, 2021
a9769bd
Update changelog.md
davivcu Jul 3, 2021
4e26ad4
Update changelog.md
davivcu Jul 3, 2021
81545b4
Update all_dialogues.css
davivcu Jul 27, 2021
8669066
Adding new empty turn, improved request errors messages
davivcu Jul 31, 2021
f96f8e8
small gui fixes
davivcu Aug 2, 2021
8ed4a44
add empty dialogue, fixed a bug on turn content updating
Aug 3, 2021
fafbca4
small fix
davivcu Aug 3, 2021
88e1911
delete turn
Aug 4, 2021
5ac78c2
editing dialogue in datamanagement
davivcu Aug 4, 2021
f97928a
translated "role"
Aug 6, 2021
556af88
Merge branch 'master' of https://github.com/davivcu/matilda-dev
Aug 6, 2021
98e6815
dialogues editing in datamanagement
Aug 6, 2021
9fb5390
Update README.md
davivcu Aug 28, 2021
68dfb94
Update README.md
davivcu Aug 31, 2021
24 changes: 14 additions & 10 deletions README.md
@@ -141,15 +141,15 @@ You can test it's running by:

`ps aux | grep -v grep | grep mongod`

#### NOTE: Manual operations on the MongoDB database
Whichever solution you choose to deploy your MongoDB database, if you perform manual operations on it, such as copying documents, deleting documents or restoring a backup, it is very important not to create duplicate documents. This precaution prevents inconsistent and unexpected behaviour in Matilda's workflow.
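
One way to check for duplicates after a manual operation is sketched below. This is only an illustration, not part of MATILDA: the helper name `find_duplicate_keys` and the `id` field are assumptions, and plain dictionaries stand in for MongoDB documents.

```python
from collections import Counter


def find_duplicate_keys(documents, key="id"):
    """Return the key values that occur more than once, in sorted order.

    `key` is a hypothetical field name; use whichever field identifies
    a document in your own collections.
    """
    counts = Counter(doc[key] for doc in documents if key in doc)
    return sorted(value for value, n in counts.items() if n > 1)


# Plain dictionaries standing in for documents read from the database.
docs = [{"id": "dialogue_1"}, {"id": "dialogue_2"}, {"id": "dialogue_1"}]
duplicates = find_duplicate_keys(docs)
```

If you use pymongo, the cursor returned by `collection.find()` could be passed in as `documents`.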


### Accessing the interface

Whichever option you chose above, you can now navigate to http://localhost:5000 if you installed the server locally,
or to the remote server's address.
Keep in mind that you may need to open the correct ports on your firewall(s) in order to reach the server.

In some configurations, HTTP requests from your client may not reach your server.
In those few cases, please check and edit the backend address in MATILDA's file `/web/server/gui/source/utils/backend.js`.
Other configuration options are exposed in `/Configuration/conf.json`.

### First username and password
@@ -166,19 +166,23 @@ All configuration changes that you may wish to make to MATILDA network and datab
There you can change:
- App ports (default 5000) and address (127.0.0.1)
- Database location with address:port combination (127.0.0.1:27017) or mongoDB URI (mongodb://mongo:27017/?retryWrites=true&w=majority)
- The annotation models you want to be available inside MATILDA. The json files you are referring to must be included in the Configuration folder.
- The annotation models you want to be available inside MATILDA. The json files you are referring to, the models, must be included in the Configuration folder.
- Whether or not to enforce session security (which is strongly advised) with the session_guard parameter.
- The event logging level saved in `/web/server/matilda.log` file.
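
The two ways of specifying the database location can be illustrated with a minimal sketch. It assumes the `database` layout shown in `conf.json` (`legacy_configuration` and `optional_uri`), and it assumes that a non-null `optional_uri` takes precedence over the address:port pair; MATILDA's own loading code may behave differently. The function name `resolve_database_target` is hypothetical.

```python
import json


def resolve_database_target(conf):
    """Return a MongoDB connection string derived from a conf.json-like dict.

    Assumption: a non-null "optional_uri" wins over "legacy_configuration".
    """
    db = conf["database"]
    uri = db.get("optional_uri")
    if uri:
        return uri
    legacy = db["legacy_configuration"]
    return "mongodb://{}:{}".format(legacy["address"], legacy["port"])


# A fragment shaped like the "database" section of conf.json.
conf = json.loads("""
{
  "database": {
    "name": "matilda",
    "legacy_configuration": {"address": "localhost", "port": 27017,
                             "username": null, "password": null},
    "optional_uri": null
  }
}
""")
target = resolve_database_target(conf)
```

With `optional_uri` left as `null`, the sketch falls back to the address and port from `legacy_configuration`.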

If you are using the Docker version you can also perform additional configuration with `/Configuration/gunicorn_run.sh`.
If you are using the Docker version you can also perform additional configuration with `/Configuration/gunicorn_run.sh` in order to set the workers number and other gunicorn options.

### Annotation Models

All configuration changes that you may wish to make to MATILDA's annotation model can be done by editing the json file
`/Configuration/lida_model.json` or by adding a new one. This script contains a configuration dictionary that describes
which labels will appear in MATILDA's front end.
which labels will appear in MATILDA's annotation interface.
You can also add an entirely new annotation model file and reference it in the `/Configuration/conf.json` file
so that the program loads it on start.

You can currently add three different types of new labels to MATILDA:
## 3. Advanced Configuration

Any annotation model has up to four different types of labels in MATILDA:

1. `multilabel_classification` :: will display as checkboxes which you can
select one or more of.
@@ -191,9 +195,9 @@ You can currently add three different types of new labels to MATILDA:
3. `string` :: will display underneath the user's utterance as a string
response. This is the label field that would be used for a response to the
user's query.


## 3. Advanced Configuration
4. `global_classification_string` :: similar to `multilabel_classification_string`,
but it is not tied to a dialogue turn; it refers to the entire dialogue.
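
Before referencing a new annotation model in `conf.json`, it can help to check the file's shape. The sketch below is not MATILDA's own validation: `check_annotation_model` is a hypothetical helper, and it assumes that every entry carries `description`, `label_type` and `required`, and that every label type other than plain `string` needs a non-empty `labels` list.

```python
def check_annotation_model(model):
    """Return a list of human-readable problems found in an annotation model dict."""
    problems = []
    for name, spec in model.items():
        for field in ("description", "label_type", "required"):
            if field not in spec:
                problems.append(f"{name}: missing '{field}'")
        # Assumption: every type except plain "string" needs a list of labels.
        if spec.get("label_type") != "string" and not spec.get("labels"):
            problems.append(f"{name}: no labels listed")
    return problems


# Two well-formed entries and one deliberately incomplete entry.
model = {
    "usr": {"description": "The user's query",
            "label_type": "string", "required": True},
    "Async": {"description": "To annotate async messages",
              "label_type": "multilabel_classification_string",
              "required": False, "labels": ["turn_ref"]},
    "broken": {"label_type": "multilabel_classification"},
}
problems = check_annotation_model(model)
```

To check a real file, load it with `json.load` and pass the resulting dictionary to the function.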

### New Labels

67 changes: 67 additions & 0 deletions changelog.md
@@ -0,0 +1,67 @@
![WLUPER AND UNIPI](images/research_collaboration_matilda.png)

# MATILDA: Multi-AnnoTator multi-language Interactive Lightweight Dialogue Annotator

**Authors:** Davide Cucurnia, Nikolai Rozanov, Irene Sucameli, Augusto Ciuffoletti, Maria Simi

**Contact:** contact@wluper.com

**Paper:** [link to the EACL paper](https://www.aclweb.org/anthology/2021.eacl-demos.5/)

### Citation at bottom of README! (Please cite when using)

## 1.5

- Configuration view in Admin Panel
- Supervision annotation rate bars
- Supervision view now allows uploading an already annotated dialogue collection
- Supervision view now allows editing the turn utterances of collections
- Inter-annotator stats for multilabel-string-classification
- Annotation Rate for single dialogue is calculated anew when entering dialogue annotation mode

## 1.4

- Dialogue annotation view displays a few customizable annotation options:
  - Resizable layout, useful for very large or very small screens.
  - Character limit for long utterances: beyond that number of characters, a scroll bar is shown.
  - Auto-save on turn change on/off switch.


## Citation
Please cite these two papers when using.
```
@inproceedings{cucurnia-etal-2021-matilda,
title = "{MATILDA} - Multi-{A}nno{T}ator multi-language {I}nteractive{L}ight-weight Dialogue Annotator",
author = "Cucurnia, Davide and
Rozanov, Nikolai and
Sucameli, Irene and
Ciuffoletti, Augusto and
Simi, Maria",
booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
month = apr,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2021.eacl-demos.5",
pages = "32--39",
abstract = "Dialogue Systems are becoming ubiquitous in various forms and shapes - virtual assistants(Siri, Alexa, etc.), chat-bots, customer sup-port, chit-chat systems just to name a few.The advances in language models and their publication have democratised advanced NLP.However, data remains a crucial bottleneck.Our contribution to this essential pillar isMATILDA, to the best of our knowledge the first multi-annotator, multi-language dialogue annotation tool. MATILDA allows the creation of corpora, the management of users, the annotation of dialogues, the quick adaptation of the user interface to any language and the resolution of inter-annotator disagreement. We evaluate the tool on ease of use, annotation speed and interannotation resolution for both experts and novices and conclude that this tool not only supports the full pipeline for dialogue annotation, but also allows non-technical people to easily use it. We are completely open-sourcing the tool at https://github.com/wluper/matilda and provide a tutorial video1.",
}
```

```
@inproceedings{collins-etal-2019-lida,
title = "{LIDA}: Lightweight Interactive Dialogue Annotator",
author = "Collins, Edward and
Rozanov, Nikolai and
Zhang, Bingbing",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-3021",
doi = "10.18653/v1/D19-3021",
pages = "121--126",
abstract = "Dialogue systems have the potential to change how people interact with machines but are highly dependent on the quality of the data used to train them.It is therefore important to develop good dialogue annotation tools which can improve the speed and quality of dialogue data annotation. With this in mind, we introduce LIDA, an annotation tool designed specifically for conversation data. As far as we know, LIDA is the first dialogue annotation system that handles the entire dialogue annotation pipeline from raw text, as may be the output of transcription services, to structured conversation data. Furthermore it supports the integration of arbitrary machine learning mod-els as annotation recommenders and also has a dedicated interface to resolve inter-annotator disagreements such as after crowdsourcing an-notations for a dataset. LIDA is fully open source, documented and publicly available.[https://github.com/Wluper/lida] {--}{\textgreater} Screen Cast: https://vimeo.com/329824847",
}
```
9 changes: 4 additions & 5 deletions configuration/conf.json
@@ -7,18 +7,17 @@
"lida_model.json",
"unipi_model_v2.json"
],
"docker": false,
"session_guard": true,
"full_log": false
"full_log": true
},
"database": {
"name": "matilda",
"name": "matilda_wsgi",
"legacy_configuration": {
"address": "localhost",
"port": 27017,
"username": null,
"password": null
},
"optional_uri": null
"optional_uri": "mongodb://mongo:27017/?retryWrites=true&w=majority"
}
}
}
2 changes: 1 addition & 1 deletion configuration/gunicorn_run.sh
@@ -1,2 +1,2 @@
#!/bin/sh
cd server; gunicorn --bind 0.0.0.0:5000 matilda_app:MatildaApp --log-file matilda.log --log-level 'info'
cd server; gunicorn --bind 0.0.0.0:5000 matilda_app:MatildaApp --log-file matilda.log --log-level 'info'
147 changes: 66 additions & 81 deletions configuration/unipi_model.json
@@ -1,83 +1,68 @@
{

"global_slot": {

"description" : "General info related to the dialogue",
"label_type" : "multilabel_global_string",
"required" : false,
"labels" : [
"result"
]
},

"usr": {
"description" : "The user's query",
"label_type" : "string",
"required" : true
},

"sys": {
"description" : "The system's response",
"label_type" : "string",
"required" : true
},

"Dialogue_act": {

"description" : "Type of dialogue act",
"label_type" : "multilabel_classification",
"required" : false,
"labels" :[
"sys_greet",
"sys_inform_basic",
"sys_inform_proactive",
"sys_request",
"sys_select",
"sys_deny",
"usr_greet",
"usr_inform_basic",
"usr_inform_proactive",
"usr_request",
"usr_select",
"usr_deny"
]
},

"Slot": {

"description" : "Entity's value",
"label_type" : "multilabel_classification_string",
"required" : false,
"labels" : [

"job_description",
"contract",
"duties",
"skill",
"past_experience",
"degree",
"age",
"languages",
"area",
"company_name",
"company_size",
"location",
"contact",
"other"
]

},

"Async": {

"description": "To annotate async messages",
"label_type" : "multilabel_classification_string",
"required" : false,
"labels" : [
"turn_ref"

]

}

"Async": {
"description": "To annotate async messages",
"label_type": "multilabel_classification_string",
"labels": [
"turn_ref"
],
"required": false
},
"Dialogue_act": {
"description": "Type of dialogue act",
"label_type": "multilabel_classification",
"labels": [
"sys_greet",
"sys_inform_basic",
"sys_inform_proactive",
"sys_request",
"sys_select",
"sys_deny",
"usr_greet",
"usr_inform_basic",
"usr_inform_proactive",
"usr_request",
"usr_select",
"usr_deny"
],
"required": false
},
"Slot": {
"description": "Entity's value",
"label_type": "multilabel_classification_string",
"labels": [
"job_description",
"contract",
"duties",
"skill",
"past_experience",
"degree",
"age",
"languages",
"area",
"company_name",
"company_size",
"location",
"contact",
"other"
],
"required": false
},
"global_slot": {
"description": "General info related to the dialogue",
"label_type": "multilabel_global_string",
"labels": [
"result"
],
"required": false
},
"sys": {
"description": "The system's response",
"label_type": "string",
"required": true
},
"usr": {
"description": "The user's query",
"label_type": "string",
"required": false
}
}