This repository was archived by the owner on Jan 29, 2024. It is now read-only.

Investigate Question-Answering models working on tables #614

@FrancescoCasalegno

Description

Context

  • Traditional transformer-based models for extractive question answering operate on contexts that are units of natural-language text, e.g. a sentence or a paragraph.
  • However, in many cases the values of the parameters of interest for our neuroscientific applications are contained in the tables of articles rather than in the text.
  • For instance, the Wikipedia article on the Michaelis constant (here) contains several values for this parameter, which is of interest to us, but they are all in a table and no value is mentioned in the text. This is not an isolated case: it's really hard to find Michaelis constant values in the text of any scientific article!
    [Screenshot: Screen Shot 2022-08-18 at 11 13 23]
  • Some question-answering models, such as TAPAS, seem to be able to operate on tabular or mixed text/tabular contexts.

Actions

  • How should tables be represented so that TAPAS (or another model) can take them as input (HTML? CSV? ...)?
    Is this format compatible with what we can get out of our parsing pipeline for the various formats (arXiv, medRxiv, bioRxiv, PMC, PubMed, ...) when the article contains a table?
  • Can TAPAS take mixed inputs, i.e. contexts containing both text and tables?
  • How does TableQuestionAnsweringPipeline differ from QuestionAnsweringPipeline in 🤗 transformers?
  • Are there any other models apart from TAPAS that support question answering on tabular contexts?
  • Test TAPAS (or another model) on a neuroscience-related sample to see if it could potentially work for our use case.
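
As a possible starting point for the first and last actions, here is a minimal sketch of one way to go from an HTML table to something the 🤗 `table-question-answering` pipeline accepts. That pipeline takes the table as a pandas DataFrame or a dict of columns, and the TAPAS tokenizer expects every cell to be a string. The helper names (`TableToColumns`, `html_table_to_columns`, `ask_table`) are hypothetical, the parser assumes a simple rectangular table with a header row and no merged cells, and the sample values are placeholders rather than real measurements. Whether our parsing pipeline can actually produce such HTML for each source is exactly what remains to be checked.

```python
from html.parser import HTMLParser


class TableToColumns(HTMLParser):
    """Collect the cells of the first <table> in an HTML fragment, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []       # list of rows, each a list of cell strings
        self._cell = None    # text pieces of the cell being read, or None
        self._done = False   # set once the first table has been closed

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "table":
            self._done = True


def html_table_to_columns(html):
    """Return {header: [cell, ...]}, keeping every cell as a string."""
    parser = TableToColumns()
    parser.feed(html)
    header, *body = parser.rows
    return {name: [row[i] for row in body] for i, name in enumerate(header)}


def ask_table(columns, query, model="google/tapas-base-finetuned-wtq"):
    """Query the table with the table-question-answering pipeline.

    Imported lazily: needs `transformers` installed and downloads the checkpoint.
    """
    from transformers import pipeline

    tqa = pipeline("table-question-answering", model=model)
    return tqa(table=columns, query=query)


# Placeholder table: the enzyme names and values are illustrative only.
SAMPLE_HTML = """
<table>
  <tr><th>Enzyme</th><th>Km (M)</th></tr>
  <tr><td>enzyme A</td><td>1.0e-2</td></tr>
  <tr><td>enzyme B</td><td>2.0e-4</td></tr>
</table>
"""

columns = html_table_to_columns(SAMPLE_HTML)
print(columns)
```

With the model available, something like `ask_table(columns, "What is the Km of enzyme B?")` should return a dict that includes the answer text and the coordinates of the selected cells, which would make it easy to compare against what `QuestionAnsweringPipeline` returns for a text context.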
