TripleXtract

Pipeline for automated extraction of species–gene–trait triples in plants from scientific literature.

TripleXtract is dual-licensed: the software is distributed under both an open-source license and a proprietary license.

Pipeline description

Metadata collection: species, gene and trait identifiers; PLAZA orthology information; ...
Triple extraction: text mining to identify species–gene–trait triples
Export: filtering and export of collected triples

Installation

Python dependencies

Install into a virtual environment using:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

MySQL database

Create a MySQL database with the schema at data/database/db_schema.sql.
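
As a rough illustration, the database could be created and the schema loaded from Python by calling the standard mysql command-line client. This is only a sketch: the database name (triplextract) and user (root) are placeholders that do not come from this README, and it assumes the mysql client is on your PATH and is run from the repository root.

```python
import subprocess
from pathlib import Path
from typing import List

# Hypothetical defaults: the README does not name the database or the MySQL user.
DB_NAME = "triplextract"
DB_USER = "root"
SCHEMA_PATH = Path("data/database/db_schema.sql")


def create_database_command(db_name: str = DB_NAME, user: str = DB_USER) -> List[str]:
    """Build the mysql client call that creates an empty database."""
    statement = f"CREATE DATABASE IF NOT EXISTS `{db_name}`"
    # -p makes the client prompt for the password instead of taking it as an argument.
    return ["mysql", "-u", user, "-p", "-e", statement]


def load_schema_command(db_name: str = DB_NAME, user: str = DB_USER) -> List[str]:
    """Build the mysql client call that reads SQL statements from stdin into db_name."""
    return ["mysql", "-u", user, "-p", db_name]


def create_and_load(schema_path: Path = SCHEMA_PATH) -> None:
    """Create the database, then feed the schema file to the mysql client."""
    subprocess.run(create_database_command(), check=True)
    with schema_path.open("rb") as schema:
        subprocess.run(load_schema_command(), stdin=schema, check=True)
```

Calling create_and_load() prompts for the MySQL password twice, once per client call. The connection details used here must match whatever you later enter in the configuration file.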

Configuration file

Copy and edit the template: config/template.cfg → config/config.cfg

This file controls which steps run and specifies all input/output paths. Full details are described in the Configuration file wiki.
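
If you script your setup, the copy step can be made safe against overwriting a config.cfg you have already edited. The helper below is a minimal sketch, not part of TripleXtract; init_config is a hypothetical name, and only the two paths come from this README.

```python
import shutil
from pathlib import Path


def init_config(template: Path = Path("config/template.cfg"),
                target: Path = Path("config/config.cfg")) -> bool:
    """Copy the template to the target unless the target already exists.

    Returns True when a new file was written and False when an existing
    target was left untouched.
    """
    if target.exists():
        return False
    shutil.copyfile(template, target)
    return True
```

Run init_config() from the repository root, then edit config/config.cfg by hand.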

Usage

To run the full pipeline:

python3 ./main.py ./config/config.cfg

Some options in the config file can be overridden on the command line. For a complete list, run:

python3 ./main.py --help

To run only selected steps, set the corresponding flags to yes in config.cfg; setting every flag to yes runs the entire pipeline. Execution order requirements are documented here.
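
To illustrate how yes/no flags of this kind behave, the sketch below reads them with Python's standard configparser module. It assumes that config.cfg is an INI-style file, which this README does not confirm. The section name (steps) and the flag names are invented for the example; the real names are the ones in config/template.cfg.

```python
import configparser

# Hypothetical section and flag names, used only for illustration;
# the real names are defined in config/template.cfg.
EXAMPLE_CONFIG = """
[steps]
metadata_collection = yes
triple_extract = no
export = yes
"""


def enabled_steps(config_text: str, section: str = "steps") -> list:
    """Return the names of all flags in `section` that are switched on."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # getboolean accepts yes/no, true/false, on/off and 1/0 (case-insensitive).
    return [name for name in parser[section]
            if parser.getboolean(section, name)]


print(enabled_steps(EXAMPLE_CONFIG))  # prints ['metadata_collection', 'export']
```

A check like this can be handy before a long run, to confirm which steps are switched on.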

Output files

Descriptions of generated files—including custom GAF triples, evidence records, and MINI-EX priors—are available in the Output files wiki.

Contact and support

Should you have any questions or suggestions, please send an e-mail to klaas.vandepoele@psb.vib-ugent.be.

Should you encounter a bug, please open an issue.
