How to use minChemBio tool

Requirements:

Python = 3.11
RDKit
Streamlit
pandas
NumPy
NetworkX
Matplotlib
PuLP
CPLEX solver
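
A quick way to confirm that the Python packages above are installed in the active environment is sketched below. The import names (for example `rdkit` for RDKit and `pulp` for PuLP) are the standard ones for these packages; the helper name `missing_modules` is only illustrative and is not part of minChemBio. The CPLEX solver is a separate installation and is not covered by this check.

```python
from importlib.util import find_spec

# Import names of the packages listed above (the import name can differ
# from the display name, e.g. RDKit is imported as "rdkit").
REQUIRED_MODULES = ["rdkit", "streamlit", "pandas", "numpy",
                    "networkx", "matplotlib", "pulp"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed Python packages are importable.")
```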

The files associated with minChemBio are attached separately here, since GitHub does not allow uploading files larger than 100 MB without Git LFS.

Getting Started

Create the Conda environment
Set up the environment with all required dependencies by running:
```
conda env create -f minchembio_yml.yml
```
Prepare the dataset
Ensure that the dataset files are in the working directory.
Extract reactant and product IDs
Open the provided Jupyter notebook and run it to extract the IDs of reactants and products from S1.txt.
Run the Streamlit web app
Launch the web app interface by executing the following command:
```
streamlit run minchembio_streamlit.py
```
OR
```
python minchembio.py
```
Enter the reactant and product IDs in the app to generate solutions using minChemBio.
Visualize the solutions
Use the visualization Jupyter notebook to explore the generated pathways. Provide the folder containing the pathway outputs as input to the notebook.
Reference data (optional)
The file S2.csv contains a list of USPTO patent IDs and their corresponding reaction SMILES strings. This can be used for further analysis or comparison.
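
As a starting point for that kind of analysis, a reaction SMILES string can be split into its parts. The sketch below relies only on the general reaction SMILES convention (`reactants>agents>products`, with molecules separated by `.`); the function name is illustrative, and the column layout of S2.csv is not assumed here. Strings with trailing extensions or multi-component salts may need extra handling.

```python
def split_reaction_smiles(rxn_smiles):
    """Split 'reactants>agents>products' into three lists of SMILES strings."""
    parts = rxn_smiles.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected two '>' separators, got: {rxn_smiles!r}")
    # Each part is a '.'-separated list of molecules; an empty part gives [].
    reactants, agents, products = (
        [s for s in part.split(".") if s] for part in parts
    )
    return reactants, agents, products


# Example: esterification of acetic acid with ethanol, no agents listed.
reactants, agents, products = split_reaction_smiles("CC(=O)O.OCC>>CC(=O)OCC.O")
print(reactants, agents, products)
```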

File List

This repository contains the code and datasets required to run minChemBio, a tool for exploring pathways using mixed-integer linear programming (MILP).

Data Files

all_rij_with_miss_cat.json
A dictionary where molecule IDs are the keys. Each value is a dictionary listing all reactions involving that molecule, either as a reactant (-1) or a product (1).
all_sij_with_miss_cat.json
A dictionary where reaction IDs are the keys. Each value is a dictionary listing all molecules involved in the reaction as reactants (-1) or products (1).
bio_chem_smiles_ids_dict_NEW.json
A dictionary with canonical SMILES strings as keys and molecule IDs as values.
bio_chem_smiles_ids_dict_updated.json
A dictionary with canonical SMILES strings as keys and molecule IDs as values. Similar to earlier SMILES-ID mappings but includes fewer molecules.
bio_chem_ids_dict_NEW.json
A dictionary with molecule IDs as keys and canonical SMILES strings as values. Similar to earlier SMILES-ID mappings but includes fewer molecules.
metanetx_metab_db.json
A dictionary where MetaNetX molecule IDs are the keys. Each value is a dictionary listing the Name, Formula, Charge, Mass, InChI, InChIKey, SMILES, and Reference for each molecule. This dataset is extracted from the MetaNetX web platform.
rev_pair_90_nondup.json
A dictionary where reaction IDs are the keys. Each value is a list containing reverse mappings extracted from the same reaction. Used to ensure that forward and reverse reactions do not co-occur in the same pathway.
rxn_classify_with_miss_cat.json
A dictionary with reaction IDs as keys. The value for each key is:
- 1 for chemical reactions
- 0 for biological reactions
S1.txt
A text file containing molecule IDs and their corresponding canonical SMILES strings.
S2.csv
A CSV file containing reaction IDs from the USPTO dataset, along with their associated patent number, year, and reaction SMILES string.
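
The two stoichiometry dictionaries described above hold the same kind of information from opposite directions: one is keyed by reaction, the other by molecule. The sketch below illustrates that relationship on toy data with made-up IDs (`R1`, `M1`, ...). It assumes that the molecule-keyed file is the transpose of the reaction-keyed one, which the descriptions suggest but do not state outright.

```python
import json
from collections import defaultdict


def transpose(sij):
    """Turn {reaction: {molecule: coeff}} into {molecule: {reaction: coeff}}."""
    rij = defaultdict(dict)
    for rxn_id, molecules in sij.items():
        for mol_id, coeff in molecules.items():
            rij[mol_id][rxn_id] = coeff
    return dict(rij)


# Toy data with hypothetical IDs; the real data would be read from
# all_sij_with_miss_cat.json with json.load().
toy_sij = {
    "R1": {"M1": -1, "M2": 1},   # M1 -> M2
    "R2": {"M2": -1, "M3": 1},   # M2 -> M3
}
toy_rij = transpose(toy_sij)
print(json.dumps(toy_rij, indent=2))
```

Here `M2` appears as a product (1) of `R1` and as a reactant (-1) of `R2`, which is the kind of link a pathway search follows.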

Code Files

minchembio_streamlit.py
A Streamlit web app interface for minChemBio.

Inputs: Product and reactant molecule IDs
Required files:
- all_rij_with_miss_cat.json
- all_sij_with_miss_cat.json
- bio_chem_smiles_ids_dict_updated.json
- rev_pair_90_nondup.json
- rxn_classify_with_miss_cat.json
Output: A text file named in the format
productID_from_reactionID-timestamp_.txt
This file contains all possible solutions (pathways), each being a list of reaction IDs derived from solving the MILP problem.
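
Once a pathway is available as a list of reaction IDs, it can be summarized with the classification and reverse-pair dictionaries. The following is a minimal sketch, not the tool's own code: the function name and toy IDs are hypothetical, and it assumes that each value in rev_pair_90_nondup.json is a list of reaction IDs that are reverses of the key.

```python
def summarize_pathway(pathway, rxn_class, rev_pairs):
    """Count chemical/biological steps and list forward/reverse conflicts."""
    n_chem = sum(1 for r in pathway if rxn_class.get(r) == 1)
    n_bio = sum(1 for r in pathway if rxn_class.get(r) == 0)
    in_path = set(pathway)
    # A conflict is a reaction whose reverse is also present in the pathway.
    conflicts = sorted(
        {tuple(sorted((r, rev))) for r in pathway
         for rev in rev_pairs.get(r, []) if rev in in_path}
    )
    return {"chemical": n_chem, "biological": n_bio, "conflicts": conflicts}


# Toy inputs with hypothetical IDs (1 = chemical, 0 = biological).
toy_class = {"R1": 1, "R2": 0, "R3": 0, "R4": 0}
toy_rev = {"R2": ["R4"], "R4": ["R2"]}
summary = summarize_pathway(["R1", "R2", "R3"], toy_class, toy_rev)
print(summary)
```

A pathway produced by the tool should have an empty conflict list, since forward and reverse reactions are not meant to co-occur.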

minchembio.py
A Python script version of the Streamlit app.

Same functionality as minchembio_streamlit.py
Users need to edit the main() function to input the desired molecule IDs.

visualize.ipynb
A Jupyter notebook for visualizing the output pathways.

Inputs: The same files as minchembio_streamlit.py, plus the results file generated by the MILP run
Output: Visual representations of all identified pathways, saved as .png files.
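
For readers who want to draw a pathway themselves, the sketch below shows one way to turn a list of reaction IDs into directed edges (reactant to reaction, reaction to product) using the reaction-keyed stoichiometry dictionary. This is an illustration under assumed toy data; the notebook may build its figures differently, and `pathway_edges` is a hypothetical helper name.

```python
def pathway_edges(pathway, sij):
    """Return directed edges molecule -> reaction -> molecule for a pathway."""
    edges = []
    for rxn_id in pathway:
        for mol_id, coeff in sij[rxn_id].items():
            if coeff < 0:          # reactant feeds into the reaction
                edges.append((mol_id, rxn_id))
            elif coeff > 0:        # reaction produces the molecule
                edges.append((rxn_id, mol_id))
    return edges


# Toy stoichiometry with hypothetical IDs.
toy_sij = {
    "R1": {"M1": -1, "M2": 1},
    "R2": {"M2": -1, "M3": 1},
}
edges = pathway_edges(["R1", "R2"], toy_sij)
print(edges)
```

The resulting edge list can be added to a NetworkX `DiGraph` with `add_edges_from` and drawn with Matplotlib.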
