ReactionFeasibilityModel

Installation

To create the conda environment, run the following commands:

conda create --name rfm python=3.11.8 -y
conda activate rfm

# If using CUDA:
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu118
pip install dgl==2.2.1+cu118 -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html

# If using CPU:
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu
pip install dgl==1.1.2 -f https://data.dgl.ai/wheels/torch-2.3/cpu/repo.html

pip install -e .

pip install pre-commit
pre-commit install
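
As a quick sanity check after installation, the installed versions can be compared against the pins in the commands above. The following is a minimal sketch that uses only the standard library; the helper name check_versions and the EXPECTED table are illustrative and not part of the repository.

```python
from importlib import metadata

# Versions pinned in the installation commands above
# (dgl differs between the CUDA and CPU instructions).
EXPECTED = {"torch": "2.3.0", "dgl": ("2.2.1", "1.1.2")}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in expected.items():
        wanted = (wanted,) if isinstance(wanted, str) else wanted
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu118" when comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base in wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "check"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

A package reported as "check" is either missing or installed at a version other than the one pinned above.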

Data preparation

To prepare the training datasets, run the following notebooks under notebooks/created_dataset:

create_positive.ipynb. It removes the atom mapping from the raw USPTO dataset. We call this dataset "positive".
extract_forward_templates.ipynb. It extracts the forward templates from the USPTO dataset.
create_negative_forward.ipynb. It creates the negative reactions by applying the forward templates to reactants from the positive dataset.
create_negative_shuffle.ipynb. It creates the negative reactions by shuffling the reactants from the positive dataset: each product from the positive dataset is assigned the reactants of a similar reaction (in terms of Tanimoto distance).
merge_files.ipynb. It merges the positive and negative datasets into a single one.
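
The shuffling idea behind create_negative_shuffle.ipynb can be illustrated with a small, dependency-free sketch. It is not the notebook's code. Plain Python sets stand in for molecular fingerprints, the function names are hypothetical, and always picking the single most similar reaction is an assumption made for illustration.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two feature sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def shuffle_negatives(reactions, fingerprints):
    """Pair each product with the reactants of its most similar other reaction.

    reactions: list of (reactants, product) strings from the positive set.
    fingerprints: one feature set per reaction (stand-in for a real fingerprint).
    """
    positives = set(reactions)
    negatives = []
    for i, (_, product) in enumerate(reactions):
        # Rank all other reactions by similarity to reaction i.
        candidates = sorted(
            (j for j in range(len(reactions)) if j != i),
            key=lambda j: tanimoto(fingerprints[i], fingerprints[j]),
            reverse=True,
        )
        for j in candidates:
            pair = (reactions[j][0], product)
            if pair not in positives:  # never relabel a known positive
                negatives.append(
                    {"reactants": pair[0], "product": pair[1], "feasible": 0}
                )
                break
    return negatives


# Toy data: two esterifications with similar features and one unrelated coupling.
reactions = [
    ("CCO.CC(=O)O", "CC(=O)OCC"),
    ("CO.CC(=O)O", "CC(=O)OC"),
    ("c1ccccc1Br.OB(O)c1ccccc1", "c1ccc(-c2ccccc2)cc1"),
]
fingerprints = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
negatives = shuffle_negatives(reactions, fingerprints)
```

Drawing the swapped reactants from a similar reaction keeps the negatives chemically plausible, which makes them harder to tell apart from positives than randomly paired reactants would be.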

Training

The default configs log to the experiments directory and cache the molecule encodings in processed_graphs. If you want to store this data on another partition, you can create a symlink to the desired location.
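
One way to set up such symlinks is sketched below with pathlib. The helper link_to_storage is hypothetical and the paths are placeholders. The demo runs in a temporary directory so it does not touch a real checkout.

```python
import tempfile
from pathlib import Path


def link_to_storage(link: Path, target: Path) -> Path:
    """Create `target` if needed and make `link` a symlink pointing to it."""
    target.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; move or remove it first")
    link.symlink_to(target, target_is_directory=True)
    return link


# Demo in a temporary directory. In practice `link` would be ./experiments or
# ./processed_graphs in the repository root, and `target` a directory on the
# larger partition (both paths here are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    repo, big_disk = Path(tmp) / "repo", Path(tmp) / "big_disk"
    repo.mkdir()
    for name in ("experiments", "processed_graphs"):
        link_to_storage(repo / name, big_disk / name)
    print(sorted(p.name for p in repo.iterdir() if p.is_symlink()))
```

The links should be created before the first training run, since the helper refuses to overwrite a directory that already exists.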

To train the model, run the following command:

python -m scripts.train --cfg configs/rfm_train.gin

If you want to use another dataset, create a configs/datasets/<your_dataset>.gin file pointing to a *.csv file with reactants, product, and feasible columns. Then replace "include 'configs/datasets/forward_with_shuffle.gin'" with "include 'configs/datasets/<your_dataset>.gin'" in configs/rfm_train.gin.
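
Before pointing a config at a custom file, it can help to check that the CSV has the expected columns. The sketch below uses only the standard library. The function validate_dataset is hypothetical, and the 0/1 encoding of the feasible column is an assumption, so adjust the check if the data loader expects another encoding.

```python
import csv
import io

REQUIRED_COLUMNS = ("reactants", "product", "feasible")


def validate_dataset(handle):
    """Check the header and the `feasible` values; return the number of rows."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    count = 0
    for line_no, row in enumerate(reader, start=2):
        # Assumption: feasibility is stored as 0 (infeasible) or 1 (feasible).
        if row["feasible"] not in ("0", "1"):
            raise ValueError(
                f"line {line_no}: feasible must be 0 or 1, got {row['feasible']!r}"
            )
        count += 1
    return count


# In-memory example; for a real file use: with open(path, newline="") as f: ...
sample = (
    "reactants,product,feasible\n"
    "CCO.CC(=O)O,CC(=O)OCC,1\n"
    "CO.CC(=O)O,CC(=O)OCC,0\n"
)
n_rows = validate_dataset(io.StringIO(sample))
```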
