RNA-EFM is a deep learning framework for jointly designing RNA sequences and backbone structures conditioned on a target protein. The method combines flow matching for geometric alignment with biophysically informed energy refinement to generate RNA molecules that are both structurally accurate and thermodynamically stable.
The framework builds on recent advances in geometric deep learning and incorporates biophysical constraints, such as the Lennard-Jones potential and sequence-derived free energy, into an iterative refinement process guided by an idempotent objective.
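For readers unfamiliar with the Lennard-Jones term mentioned above, the following is a minimal sketch of the standard 12-6 form summed over atom pairs. It is illustrative only and is not taken from the RNA-EFM code: the function names and the `epsilon` and `sigma` defaults are assumptions, and the repository may use different parameters, cutoffs, or atom selections.

```python
import math
from itertools import combinations


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair energy at separation r.

    epsilon (well depth) and sigma (zero-crossing distance) are
    placeholder values, not parameters from RNA-EFM.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def total_lj_energy(coords, epsilon=1.0, sigma=1.0):
    """Sum the pair energy over all unique pairs of 3D points."""
    return sum(
        lennard_jones(math.dist(a, b), epsilon, sigma)
        for a, b in combinations(coords, 2)
    )


if __name__ == "__main__":
    # Three hypothetical atom positions, in the same length unit as sigma.
    atoms = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.0, 1.2, 0.0)]
    print(total_lj_energy(atoms))
```

The pair energy is zero at `r = sigma`, reaches its minimum of `-epsilon` at `r = 2 ** (1 / 6) * sigma`, and rises steeply at shorter distances, which is what penalizes steric clashes in a generated backbone.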
- Protein-conditioned RNA co-design: Joint prediction of RNA sequence and 3D backbone structure based on the input protein complex.
- Energy-based refinement: Incorporates physically meaningful constraints (e.g., van der Waals interactions and free energy) during training.
- Flow Matching Objective: Supervises geometric alignment between predicted and native RNA structures via interpolation-based learning.
- Idempotent Refinement: Enforces predictive consistency by repeatedly applying the structure predictor until convergence.
- Biological Relevance: Designed RNAs exhibit improved thermodynamic stability and binding affinity for the target protein.
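The flow matching and idempotent refinement ideas in the list above can be sketched on toy coordinates. This is a hedged illustration, not the RNA-EFM implementation: it assumes a straight-line interpolation path (the text only says "interpolation-based"), and `interpolate`, `refine_until_stable`, and the toy predictor are hypothetical names introduced here.

```python
import math


def interpolate(x0, x1, t):
    """Point on the straight-line path x_t = (1 - t) * x0 + t * x1.

    x0 and x1 are equal-length lists of 3D points, e.g. a prior sample
    and a native backbone; t lies in [0, 1].
    """
    return [
        tuple((1.0 - t) * a + t * b for a, b in zip(p0, p1))
        for p0, p1 in zip(x0, x1)
    ]


def max_displacement(x, y):
    """Largest per-point Euclidean distance between two coordinate sets."""
    return max(math.dist(p, q) for p, q in zip(x, y))


def refine_until_stable(predict, x, tol=1e-6, max_iters=50):
    """Reapply `predict` until its output stops changing.

    At convergence predict(x) is approximately x, i.e. the predictor
    behaves idempotently on its own output.
    """
    for step in range(1, max_iters + 1):
        x_next = predict(x)
        if max_displacement(x, x_next) < tol:
            return x_next, step
        x = x_next
    return x, max_iters


if __name__ == "__main__":
    native = [(2.0, 4.0, 6.0)]
    start = [(0.0, 0.0, 0.0)]

    # Toy stand-in for a structure predictor: move halfway to `native`.
    def toy_predictor(x):
        return interpolate(x, native, 0.5)

    refined, steps = refine_until_stable(toy_predictor, start)
    print(refined, steps)
```

In a real model the predictor would be a neural network and the interpolated structure would be its training input; the toy predictor here exists only to make the fixed-point loop runnable.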
git clone https://github.com/abrarrahmanabir/RNA-EFM.git
cd RNA-EFM
All external dependencies are listed in `environment.yml`. To set up the conda environment:
conda env create -f environment.yml
conda activate rnaefm
Additionally, install the following libraries manually (with CUDA compatibility if needed):
pip install torch-scatter torch-cluster openmm
The dataset used in this study can be downloaded from the following Google Drive link:
👉 RNA-EFM Dataset (Google Drive)
After downloading, extract the contents and place them under the following directory:
rnaefm/data/
This folder will contain all necessary files required to train and evaluate RNA-EFM.
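Before running inference or training, it can help to confirm that the dataset landed in the expected place. The snippet below is a small optional check, not part of the repository; `missing_paths` is a hypothetical helper, and the only path it assumes is the `rnaefm/data` directory named above, resolved relative to the current working directory.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


if __name__ == "__main__":
    # Run from the repository root so the relative path resolves.
    missing = missing_paths(["rnaefm/data"])
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset directory found.")
```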
Once the environment is set up and dependencies are installed, you can run inference to generate RNA sequences and structures conditioned on protein backbones:
python scripts/inference.py
RNA-EFM will automatically preprocess the input, generate interpolated backbones, and output the designed RNA structures and sequences.
To train RNA-EFM from scratch:
- Download RF2NA Weights
  Download the pre-trained RF2NA weights.
  Place the checkpoint at:
  RoseTTAFold2NA/network/weights/RF2NA_apr23.pt
- Start Training
  Run the following command:
  python scripts/train.py
This codebase builds upon and extends the RNAFlow framework. We gratefully acknowledge their contributions and recommend citing their work if you use this repository:
RNAFlow: Protein-Conditioned RNA Structure and Sequence Co-Design. arXiv preprint arXiv:2405.18768, 2024.
https://arxiv.org/pdf/2405.18768
