Most turns in the QReCC dataset do not have "truth passages" #2

@lujiarui-iie

Description

Hello, for the turns where the "Truth_passages" field is empty, is this due to missing annotations, or is there another reason? For example, the first turn of conversation 1 in the test set:

{
    "Answer_URL": "https://explorehealthcareers.org/career/medicine/physician-assistant/",
    "Context": [],
    "Conversation_no": 1,
    "Conversation_source": "trec",
    "Question": "What is a physician's assistant?",
    "Transformer_rewrite": "What is a physician's assistant",
    "Truth_answer": "physician assistants are medical providers who are licensed to diagnose and treat illness and disease and to prescribe medication for patients",
    "Truth_passages": [],
    "Truth_rewrite": "What is a physician's assistant?",
    "Turn_no": 1
}

Its Truth_answer corresponds to a sentence in the paragraph at http://web.archive.org/web/20200106012242id_/https://explorehealthcareers.org/career/medicine/physician-assistant/_p0, yet the test set has no truth passage labeled for this turn.

Also, the statis_info function shows that only 29,596 of the 63,501 turns have corresponding gold-standard passages.
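The count above can be reproduced with a short script. This is a minimal sketch, not the repository's actual statis_info implementation; it assumes the dataset has been loaded as a list of turn dicts shaped like the example turn shown above, and checks whether "Truth_passages" is non-empty:

```python
def count_annotated(turns):
    """Return (turns with at least one Truth_passage, total turns).

    `turns` is assumed to be a list of QReCC turn dicts; a turn counts
    as annotated when its "Truth_passages" list is non-empty.
    """
    annotated = sum(1 for t in turns if t.get("Truth_passages"))
    return annotated, len(turns)

# Tiny demo mirroring the example turn above (empty Truth_passages)
turns = [
    {"Conversation_no": 1, "Turn_no": 1, "Truth_passages": []},
    {"Conversation_no": 1, "Turn_no": 2, "Truth_passages": ["doc-1"]},
]
print(count_annotated(turns))  # → (1, 2)
```

Run over the full test set, a count like this would report 29,596 annotated turns out of 63,501.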
