Traditional transformer-based models for extractive question answering operate on contexts that are units of natural-language text, e.g. a sentence or a paragraph.
However, in many cases the values of the parameters of interest for our neuroscientific applications are contained in tables within articles rather than in the running text.
For instance, the Wikipedia article on the Michaelis constant (here) lists several values of this parameter of interest to us, but they all appear in a table and no value is mentioned in the text. This is not an isolated case: it is genuinely hard to find Michaelis constant values in the running text of any scientific article!
There seem to be some question-answering models that can operate on tabular or mixed text/tabular contexts, such as TAPAS.
Actions
- How should the tables be represented for TAPAS (or another model) to be able to take them as input (HTML? CSV? ...)? (See the sketch below.)
- Is this format compatible with what we can get out of our parsing pipeline for the various sources (arXiv, medRxiv, bioRxiv, PMC, PubMed, ...) when an article contains a table?
- Can TAPAS take mixed inputs, i.e. contexts containing both text and tables?
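
For the first question, here is a minimal sketch of how a table can be fed to TAPAS through the Hugging Face `transformers` library (not part of our pipeline; the checkpoint name and the toy table below are purely illustrative). The tokenizer expects a pandas DataFrame in which every cell is a string, so HTML or CSV tables coming out of the parsing pipeline would presumably have to be converted to that representation first.

```python
# Sketch only: assumes the Hugging Face `transformers` TAPAS implementation.
# The checkpoint and the toy table are placeholders, not data from our corpus.
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

model_name = "google/tapas-base-finetuned-wtq"  # example checkpoint with an aggregation head
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# Toy stand-in for a parsed article table; every cell must be a string.
table = pd.DataFrame(
    {
        "Enzyme": ["Chymotrypsin", "Pepsin", "Beta-galactosidase"],
        "Km (mM)": ["1.5", "0.3", "4.0"],
    }
)
queries = ["What is the Michaelis constant of pepsin?"]

# The table is passed as a DataFrame, not as raw HTML or CSV.
inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# Map the logits back to table cell coordinates and aggregation operators.
predicted_coords, predicted_agg = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
answers = [table.iat[coord] for coord in predicted_coords[0]]
print(answers, predicted_agg)
```

Note that in this sketch the context is the table alone; whether surrounding article text can be concatenated with the table (the third question above) would still need to be checked against the model's input format.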