
DeepInsulated: De novo design of insulated cis-regulatory elements based on deep learning-predicted fitness landscape

Official implementation of "De novo design of insulated cis-regulatory elements based on deep learning-predicted fitness landscape".

This codebase provides:

  1. A prediction model trained on the insulated dataset.
  2. De novo design of insulated promoters.
  3. Sequence space visualization and kernel visualization.

Contents

  1. Introduction
  2. Environment Setup
  3. Prediction, Design and Visualize Promoter Sequence
  4. Bibtex

Introduction

Precise control of gene activity is crucial in bioengineering applications. Despite tremendous efforts in cis-regulatory sequence activity prediction and reverse engineering, host cellular context-dependent effects have long been neglected, leading to persistent difficulties in accurately modeling the regulatory process. Here, we introduce an insulated design strategy for purifying and modeling host-independent transcriptional activity. By incorporating heterologous paired cis- and trans-regulatory modules into an orthogonal host cell, we have created a controllable transcriptional regulatory system. A deep learning-based algorithm, combined with an experimental data purification process, enabled us to design full-length transcriptional promoter sequences de novo based on a host-independent activity landscape.


Remarkably, the landscape accurately depicted the transcriptional activity of the insulated system and can be used to generate desired sequence- and function-diversified cis-regulatory sequences for three different trans-RNA polymerases, whose activities are precisely predictable in both bacterial (Escherichia coli) and mammalian (Chinese hamster ovary) cell lines. We expect that this generative design strategy could be extended to other complex cis-regulatory elements by combining the deep learning-based algorithm with the construction of paired cis- and trans-regulatory modules in an orthogonal host cell.

Environment Setup

Env Requirements:

  • macOS, Linux or Windows.
  • Python 3.6.
  • TensorFlow 1.10.0 + Keras 2.1.0.
  • CUDA 9.0, if you need to train the deep learning model on a GPU.
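
One way to confirm that an environment matches the versions listed above is to compare the installed package versions against them. The sketch below is not part of the repository: the helper names `find_version_mismatches` and `installed_versions` are hypothetical, the pinned versions are copied from the list above, and it assumes each package exposes a `__version__` attribute.

```python
import importlib
import sys

# Versions copied from the requirements list above.
REQUIRED = {"tensorflow": "1.10.0", "keras": "2.1.0"}
REQUIRED_PYTHON = (3, 6)


def find_version_mismatches(installed, python_version=None):
    """Return human-readable mismatches between installed and required versions.

    `installed` maps package name -> version string, or None if the
    package is missing or its version could not be determined.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    major_minor = tuple(python_version[:2])
    problems = []
    if major_minor != REQUIRED_PYTHON:
        problems.append("python: need 3.6, found %d.%d" % major_minor)
    for name, wanted in REQUIRED.items():
        found = installed.get(name)
        if found is None:
            problems.append("%s: not found (need %s)" % (name, wanted))
        elif found != wanted:
            problems.append("%s: need %s, found %s" % (name, wanted, found))
    return problems


def installed_versions():
    """Look up installed versions by importing each package (can be slow)."""
    versions = {}
    for name in REQUIRED:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", None)
        except Exception:
            # Any import failure is treated as "package unavailable".
            versions[name] = None
    return versions
```

Calling `find_version_mismatches(installed_versions())` inside the activated virtual environment returns an empty list when everything matches, and one message per mismatch otherwise.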

Steps of using DeepInsulated:

  1. Install Python (see the Download Python page on python.org).

  2. Install the code in virtualenv to keep your environment clean:

    on macOS/Linux
    pip install virtualenv  
    # or 
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple virtualenv 
    
    virtualenv --python=python3 Deepinsulated
    cd Deepinsulated
    source ./bin/activate
    

    Optional: after use, deactivate the virtual environment with

    deactivate
    
    on Windows (open cmd with administrator permissions)
    pip install virtualenv  
    # or 
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple virtualenv 
    virtualenv --python=python3 Deepinsulated
    cd Deepinsulated
    .\Scripts\activate.bat
    

    Optional: after use, deactivate the virtual environment with

    .\Scripts\deactivate.bat
    
  3. Install Git and clone the source code. Git is only needed if you install DeepInsulated with git clone.

    git clone https://github.com/HaochenW/deepinsulated.git
    
  4. Alternatively, download the source code, extract the files, and put them in the virtual environment directory you created.

  5. After step 3 or 4, the DeepInsulated directory should have the following structure:

    DeepInsulated
        deepinsulated
            |-------
            |
            |-------data
                    |...
            |-------prediction
                    |...
            |-------design
                    |...
            |-------visualize
                    |...
            |-------weight
                    |...
            |-------
            |...
        |...
    
  6. Then install all dependencies with the following command:

    cd deepinsulated
    pip install -r requirements.txt
    

    To speed up downloads, you can use a pip mirror, for example:

    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt  
    # or
    pip install -i https://mirrors.aliyun.com/pypi/simple  -r requirements.txt
    
  7. Run DeepInsulated with Python by following the steps below.

Prediction, Design and Visualize Promoter Sequence

1. Train the prediction model (for K1.5)

```
cd prediction
python predict_parallel.py
```

2.1 Design sequences de novo

```
cd design
python de_novo_design.py
```

2.2 Design sequences semi-rationally

```
cd design
python semi_rational_design.py
```

3. Visualize the landscape

```
cd visualize
python kernel_visual_small.py
```
Then run kernel_visual.m in MATLAB.
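
Each command block above starts with a `cd` into a folder of the repository, so each one is meant to be run from the repository root. A small driver can make that explicit. This is a sketch, not a script shipped with the repository: `missing_dirs`, `build_steps` and `run_pipeline` are hypothetical names, the step order simply follows this README, and running the scripts without arguments is assumed from the commands above. The MATLAB step is left out, and since steps 2.1 and 2.2 are separate design modes, you may want to trim `STEPS` to the ones you need.

```python
import subprocess
import sys
from pathlib import Path

# (sub-folder, script) pairs in the order given in this README.
STEPS = [
    ("prediction", "predict_parallel.py"),
    ("design", "de_novo_design.py"),
    ("design", "semi_rational_design.py"),
    ("visualize", "kernel_visual_small.py"),
]


def missing_dirs(repo_root):
    """Return the step folders that do not exist under repo_root."""
    root = Path(repo_root)
    needed = sorted({folder for folder, _ in STEPS})
    return [name for name in needed if not (root / name).is_dir()]


def build_steps(repo_root, python=sys.executable):
    """Return (working_directory, argv) for each step without running anything."""
    root = Path(repo_root)
    return [(root / folder, [python, script]) for folder, script in STEPS]


def run_pipeline(repo_root):
    """Run every step from its own folder, stopping at the first failure."""
    missing = missing_dirs(repo_root)
    if missing:
        raise FileNotFoundError("missing folders: " + ", ".join(missing))
    for cwd, argv in build_steps(repo_root):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, cwd=str(cwd), check=True)
```

For example, `run_pipeline(".")` called from the cloned `deepinsulated` folder runs the four scripts in order, each with its own folder as the working directory.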

Bibtex

Please consider citing our paper in your publications if the project helps your research. BibTeX reference is as follows.

@article{,
  title={De novo design of insulated _cis_-regulatory elements based on deep learning-predicted fitness landscape},
  author={Haochen Wang and Yanhui Xiang and Ziming Liu and Wen Yin and Boyan Li and Long Qian and Xiaowo Wang and Chunbo Lou},
  journal={},
  year={2024}
}

License

For academic use, this project is licensed under the MIT License; see the LICENSE file for details. For commercial use, please contact the authors.
