Most turns in the QReCC dataset do not have "truth passages" #2

@lujiarui-iie

Description

Hello, for the turns where the "Truth_passages" field is empty, is this due to missing annotations, or is there another reason? For example, the first turn of conversation 1 in the test set:

{
    "Answer_URL": "https://explorehealthcareers.org/career/medicine/physician-assistant/",
    "Context": [],
    "Conversation_no": 1,
    "Conversation_source": "trec",
    "Question": "What is a physician's assistant?",
    "Transformer_rewrite": "What is a physician's assistant",
    "Truth_answer": "physician assistants are medical providers who are licensed to diagnose and treat illness and disease and to prescribe medication for patients",
    "Truth_passages": [],
    "Truth_rewrite": "What is a physician's assistant?",
    "Turn_no": 1
}

Its Truth_answer corresponds to a sentence in the paragraph at http://web.archive.org/web/20200106012242id_/https://explorehealthcareers.org/career/medicine/physician-assistant/_p0, yet the test set has no truth passage labeled for this turn.

Also, the statis_info function shows that only 29,596 of the 63,501 turns have corresponding gold-standard passages.
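The count above can be reproduced with a short script. This is a minimal sketch, not the repository's actual statis_info implementation; it assumes the dataset has been loaded as a list of turn dicts shaped like the example turn shown above, and checks whether "Truth_passages" is non-empty:

```python
def count_annotated(turns):
    """Return (turns with at least one Truth_passage, total turns).

    `turns` is assumed to be a list of QReCC turn dicts; a turn counts
    as annotated when its "Truth_passages" list is non-empty.
    """
    annotated = sum(1 for t in turns if t.get("Truth_passages"))
    return annotated, len(turns)

# Tiny demo mirroring the example turn above (empty Truth_passages)
turns = [
    {"Conversation_no": 1, "Turn_no": 1, "Truth_passages": []},
    {"Conversation_no": 1, "Turn_no": 2, "Truth_passages": ["doc-1"]},
]
print(count_annotated(turns))  # → (1, 2)
```

Run over the full test set, a count like this would report 29,596 annotated turns out of 63,501.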
