Kirigami

Kirigami: large convolutional kernels improve deep learning-based RNA secondary structure prediction

Kirigami is a state-of-the-art (SOTA) deep learning model for RNA secondary structure prediction. On the standardized bpRNA test set (TS0), Kirigami outperforms other programs such as SPOT-RNA, MXfold2, and UFold.

Installation

The easiest way to download and interact with Kirigami is via PyTorch Hub. Simply run

import torch
model = torch.hub.load('marc-harary/kirigami', 'kirigami', pretrained=True)

Usage

For a given FASTA sequence, run

model('GGGGCGAGCUGCAGCCCCAGUGAAUCAAGUGCAGC')
# '.((((........))))..................'

to invoke a convenience __call__ method that embeds the FASTA string and returns a prediction in dot-bracket notation (DBN).
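
The returned DBN string can be converted into explicit base-pair indices for downstream analysis. The snippet below is a minimal sketch, not part of the Kirigami API: `dbn_to_pairs` is a hypothetical helper name, and it assumes the string uses only `(`, `)`, and `.` (any other symbol, such as pseudoknot brackets, is treated as unpaired).

```python
def dbn_to_pairs(dbn: str) -> list[tuple[int, int]]:
    """Return sorted 0-based (i, j) pairs for matched parentheses in a DBN string."""
    stack = []  # positions of '(' that are still waiting for a partner
    pairs = []
    for j, symbol in enumerate(dbn):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        # every other symbol is treated as an unpaired base
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Sequence and prediction copied from the usage example above.
sequence = "GGGGCGAGCUGCAGCCCCAGUGAAUCAAGUGCAGC"
structure = ".((((........)))).................."

for i, j in dbn_to_pairs(structure):
    print(f"{sequence[i]}{i + 1}-{sequence[j]}{j + 1}")
```

Each printed entry names the two paired bases with their 1-based positions in the sequence.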

(Re)training

All experiments were performed via PyTorch Lightning. Although the weights of the production model are located at weights/main.ckpt, Kirigami can be retrained with varying hyperparameters. Run

python run.py --help

for an exhaustive list of configuration options, displayed via Lightning's CLI. The corresponding configuration files are located in the configs directory.

Data

Data used for training, validation, and testing are taken from the bpRNA database in the form of the standard TR0, VL0, and TS0 datasets used by SPOT-RNA, MXfold2, and UFold. Respectively, these contain 10,814, 1,300, and 1,305 non-redundant structures. The .dbn files located in this repo were generated by scraping the data originally uploaded by the authors of SPOT-RNA. The RNAStrAlign, archiveII, bpRNAnew, and bpRNAnew_mutate datasets, scraped from UFold, are likewise in the data directory.
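
For readers who want to inspect the .dbn files outside of Kirigami, the following is a rough sketch of a parser. The exact layout of the files in this repo is not described here, so the sketch assumes a common convention: optional header lines starting with `#`, followed by one sequence line and one dot-bracket line of equal length. `DbnRecord` and `parse_dbn` are hypothetical names, and the example record is invented for illustration.

```python
from typing import NamedTuple


class DbnRecord(NamedTuple):
    headers: list[str]
    sequence: str
    structure: str


def parse_dbn(text: str) -> DbnRecord:
    """Parse one record: '#' header lines, then a sequence line and a structure line."""
    headers, body = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            headers.append(line[1:].strip())
        else:
            body.append(line)
    if len(body) != 2:
        raise ValueError(f"expected a sequence and a structure line, got {len(body)} lines")
    sequence, structure = body
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure differ in length")
    return DbnRecord(headers, sequence.upper(), structure)


# Invented example record (assumed layout, not taken from the repo).
example = "#Name: example\nggggaaaacccc\n((((....))))\n"
record = parse_dbn(example)
print(record.sequence, record.structure)
```

If the files in the data directory follow a different layout, the header and line-count checks would need to be adjusted accordingly.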

Name

From Wikipedia:

Kirigami (切り紙) is a variation of origami, the Japanese art of folding paper. In kirigami, the paper is cut as well as being folded, resulting in a three-dimensional design that stands away from the page.

The Kirigami pipeline both folds RNA molecules via a fully convolutional neural network (FCN) and uses Nussinov-style dynamic programming to recursively cut them into subsequences for post-processing.
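
To illustrate what "Nussinov-style" means, here is a minimal sketch of the textbook Nussinov algorithm, which maximizes the number of nested base pairs and then recovers a structure by repeatedly splitting the sequence into subintervals. This is not Kirigami's actual post-processing code; the function name `nussinov`, the `min_loop` parameter, and the set of allowed pairs (Watson-Crick plus G-U wobble) are assumptions made for this example.

```python
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3) -> str:
    """Return one maximum-pair nested structure for seq in dot-bracket notation."""
    n = len(seq)
    # score[i][j] = max number of pairs in seq[i..j]; entries with j - i <= min_loop stay 0
    score = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = score[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in ALLOWED:
                    left = score[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + score[k + 1][j - 1])
            score[i][j] = best

    # Traceback: cut each interval into the subintervals that produced its score.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if score[i][j] == score[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in ALLOWED:
                left = score[i][k - 1] if k > i else 0
                if score[i][j] == left + 1 + score[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


print(nussinov("GGGAAAUCC"))
```

The traceback uses an explicit stack instead of recursion so that long sequences do not hit Python's recursion limit; the cubic running time is inherent to this formulation.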

