A significant amount of experimental information about Quantitative Trait Locus (QTL) studies is described in (heterogeneous) tables of scientific articles. Briefly, a QTL is a genomic region that correlates with a trait of interest (phenotype). QTM is a command-line tool that retrieves and semantically annotates results obtained from QTL mapping experiments. It takes full-text articles from the Europe PMC repository as input and writes the extracted QTLs to a relational database (SQLite) and a text file (CSV).
- Java 1.7 or later
- Apache Maven 3.x
- SQLite 3.x
- Apache Solr 6.x with domain-specific vocabularies and ontologies (Solr cores):
  - Gene Ontology (GO)
  - Plant Trait Ontology (TO)
  - Phenotypic quality ontology (PATO)
  - Solanaceae Phenotype Ontology (SPTO)
  - STATistics Ontology (STATO)
  - Chemical Entities of Biological Interest (ChEBI)
- access to full-text articles (in XML) from Europe PMC
git clone https://github.com/PBR/QTM.git
cd QTM
mvn install
solr/install_solr.sh
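After the install script has run, it can be useful to confirm that Solr is up and that the ontology cores were loaded. The following is a minimal sketch, not part of QTM: it queries Solr's CoreAdmin STATUS endpoint and lists whatever cores are present. The base URL is an assumption (Solr's default local address), and the function names are hypothetical. The core names created by the install script are not stated here, so the sketch only reports what it finds.

```python
import json
from urllib.request import urlopen

# Assumption: Solr listens on its default local address; adjust if yours differs.
SOLR_URL = "http://localhost:8983/solr"


def core_names(status_response):
    """Extract the core names from a parsed CoreAdmin STATUS response."""
    return sorted(status_response.get("status", {}))


def fetch_core_names(base_url=SOLR_URL, timeout=10):
    """Ask a running Solr instance which cores it currently has loaded."""
    url = f"{base_url}/admin/cores?action=STATUS&wt=json"
    with urlopen(url, timeout=timeout) as response:
        return core_names(json.load(response))
```

Calling `fetch_core_names()` requires a running Solr instance. If the returned list is empty, the cores were probably not installed.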
- input: articles.txt with PMCIDs (one per line)
- output: qtl.csv and qtl.db (see the database model or Entity-Relationship diagram here)
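Because the input file must hold one PMCID per line, a quick check before a long run can catch malformed entries. The sketch below is illustrative and not part of QTM. It assumes a PMCID is the prefix `PMC` followed by digits, and it skips blank lines (whether QTM itself tolerates blank lines is not stated here). The function name is hypothetical.

```python
import re

# Assumption: a PMCID is "PMC" followed by one or more digits.
PMCID_PATTERN = re.compile(r"PMC\d+")


def check_pmcid_lines(lines):
    """Split lines into valid PMCIDs and (line_number, text) rejects."""
    valid, rejected = [], []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # blank lines are skipped, not reported
        if PMCID_PATTERN.fullmatch(text):
            valid.append(text)
        else:
            rejected.append((number, text))
    return valid, rejected


# Example use (reads the input file named above):
#   with open("articles.txt") as handle:
#       valid, rejected = check_pmcid_lines(handle)
#   for number, text in rejected:
#       print(f"line {number}: not a PMCID: {text!r}")
```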
./QTM articles.txt
./QTM -h
...
USAGE
=====
QTM [-v|-h]
QTM [-o FILE_PREFIX] FILE
ARGUMENTS
=========
FILE List of full-text articles from Europe PMC.
Enter one PMCID per line.
OPTIONS
=======
-o, --output FILE_PREFIX Output files in SQLite/CSV formats.
(default: qtl.{db,csv})
-v, --version Print software version.
-h, --help Print this help message.
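Once a run has finished, the SQLite output (qtl.db by default, or FILE_PREFIX.db when -o is given) can be inspected with any SQLite client. The sketch below uses Python's standard sqlite3 module and deliberately makes no assumption about QTM's table names: it reads them from `sqlite_master` and counts the rows in each. The function name is hypothetical.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in the database."""
    if not Path(db_path).is_file():
        # sqlite3.connect() would silently create an empty database otherwise.
        raise FileNotFoundError(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }


# Example use:
#   for table, count in table_row_counts("qtl.db").items():
#       print(f"{table}: {count} rows")
```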
Note: The example I/O files are provided in the data directory. If you have no Internet access or the Europe PMC API is unavailable, copy the articles (.xml) from the data directory to the root of this repository.
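When working offline, it helps to know which listed articles already have a local XML copy. The following sketch is illustrative only. It assumes each article is stored as `<PMCID>.xml`, a naming convention that is not stated here and should be checked against the files in the data directory. The function name is hypothetical.

```python
from pathlib import Path


def missing_articles(pmcids, directory="."):
    """Return the PMCIDs that have no matching XML file in `directory`."""
    root = Path(directory)
    # Assumption: one file per article, named <PMCID>.xml.
    return [pmcid for pmcid in pmcids if not (root / f"{pmcid}.xml").is_file()]


# Example use (run from the repository root):
#   with open("articles.txt") as handle:
#       ids = [line.strip() for line in handle if line.strip()]
#   print(missing_articles(ids))
```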