This repository brings together five connected examples showing how MATLAB® and RDKit can be combined to accelerate molecular analysis, cheminformatics, and drug discovery.
Each folder contains a self-contained Live Script and supporting files that illustrate a different step in the workflow, from molecular visualization to similarity analysis, GPU-accelerated clustering, and SimBiology-based bioavailability simulation.
Visualize molecular structures from SMILES strings, explore stereochemistry (e.g., D-Glucose, Luseogliflozin), and experiment with 3D protein visualization.
Import molecular datasets (e.g., LogP and LogS values), visualize molecules, and partition datasets based on normalized physicochemical properties.
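As a rough illustration of partitioning on a normalized property, the sketch below min-max scales a property and groups molecules into equal-width bins. The molecule names, LogP values, and binning scheme are all assumptions made for illustration; the Live Script may normalize and partition differently.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def partition_by_property(names, values, n_bins=2):
    """Group names into n_bins equal-width bins of the normalized property."""
    normalized = min_max_normalize(values)
    bins = [[] for _ in range(n_bins)]
    for name, x in zip(names, normalized):
        # A normalized value of exactly 1.0 is placed in the last bin.
        index = min(int(x * n_bins), n_bins - 1)
        bins[index].append(name)
    return bins


# Hypothetical molecules with made-up LogP values, for illustration only.
names = ["mol_a", "mol_b", "mol_c", "mol_d"]
logp = [-1.0, 0.5, 2.0, 3.0]
low_logp, high_logp = partition_by_property(names, logp, n_bins=2)
print(low_logp, high_logp)
```

Normalizing first means the same binning code can be reused for properties on different scales, such as LogP and LogS.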
Compute fingerprints, generate similarity matrices, and analyze the relationship between property-based and structure-based diversity.
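To make the idea of a similarity matrix concrete, here is a minimal sketch that assumes fingerprints are represented as sets of on-bit indices and that the Tanimoto coefficient is the similarity measure. The fingerprints are invented for illustration and were not produced by RDKit; the Live Script may use a different representation or metric.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    shared = len(bits_a & bits_b)
    total = len(bits_a) + len(bits_b) - shared
    # Two empty fingerprints have no defined ratio; this sketch returns 0.0.
    return shared / total if total else 0.0


def similarity_matrix(fingerprints):
    """Return the full pairwise Tanimoto matrix as a list of rows."""
    n = len(fingerprints)
    return [
        [tanimoto(fingerprints[i], fingerprints[j]) for j in range(n)]
        for i in range(n)
    ]


# Made-up fingerprints (sets of on-bit positions), for illustration only.
fingerprints = [{1, 4, 7, 9}, {1, 4, 7, 12}, {2, 3, 5}]
matrix = similarity_matrix(fingerprints)
for row in matrix:
    print(["{:.2f}".format(value) for value in row])
```

The matrix is symmetric with ones on the diagonal, so a larger implementation would typically compute only one triangle.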
Perform large-scale clustering using GPU-enabled similarity computations (Tanimoto coefficient), which makes the analysis of millions of molecules practical.
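The sketch below shows one simple way Tanimoto similarity can drive clustering: a sequential "leader" scheme in which each molecule joins the first cluster whose leader is similar enough. It runs on the CPU and is not the GPU implementation in the Live Script; the algorithm choice, threshold, and fingerprints are assumptions made for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: size of the intersection over size of the union."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def leader_cluster(fingerprints, threshold):
    """Assign each fingerprint to the first cluster whose leader is similar enough.

    Returns one cluster label per fingerprint, in input order.
    """
    leaders = []  # index of the first member (leader) of each cluster
    labels = []
    for i, fingerprint in enumerate(fingerprints):
        for cluster_id, leader in enumerate(leaders):
            if tanimoto(fingerprint, fingerprints[leader]) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return labels


# Made-up fingerprints (sets of on-bit positions), for illustration only.
fingerprints = [{1, 4, 7, 9}, {1, 4, 7, 12}, {2, 3, 5}, {2, 3, 5, 8}]
labels = leader_cluster(fingerprints, threshold=0.5)
print(labels)
```

Every comparison in the inner loop is independent of the others, which is the property that makes this kind of similarity computation a good fit for GPU parallelism.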
Link cheminformatics with system-level pharmacokinetic modeling in SimBiology. Simulate plasma concentration profiles for potential SGLT2 inhibitors.
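For readers unfamiliar with plasma concentration profiles, the following sketch evaluates the standard one-compartment model with first-order absorption and elimination. It is not the SimBiology model used in the example, which may be more detailed, and every parameter value here is hypothetical.

```python
import math


def plasma_concentration(t, dose, bioavailability, volume, ka, ke):
    """Concentration at time t for a one-compartment oral-dose model.

    dose in mg, volume in L, ka (absorption) and ke (elimination) in 1/h,
    t in h. The closed form requires ka != ke.
    """
    if ka == ke:
        raise ValueError("ka and ke must differ for this closed-form solution")
    scale = bioavailability * dose * ka / (volume * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))


# Hypothetical parameters, chosen only to produce a plausible-looking curve.
times = [0.5 * i for i in range(49)]  # 0 h to 24 h in half-hour steps
profile = [
    plasma_concentration(t, dose=10.0, bioavailability=0.8, volume=50.0, ka=1.0, ke=0.1)
    for t in times
]
t_max = times[profile.index(max(profile))]
print("Peak concentration {:.4f} mg/L at about {} h".format(max(profile), t_max))
```

Repeating the calculation with a different bioavailability for each candidate molecule gives a quick way to compare profiles before moving to a full simulation.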
Each example is provided as a MATLAB Live Script (.mlx) that can be run directly:
- Visualize and Analyze Molecular Structures.mlx
- Import_Visualize_and_Partition_Molecular_Datasets.mlx
- Molecular_Similarity_Analysis.mlx
- Molecular_Clustering_on_GPU.mlx
- Multiple_Molecules_Bioavailibility.mlx
- MATLAB
- Bioinformatics Toolbox™ (optional for 3D visualization)
- Parallel Computing Toolbox™ (required for GPU examples)
- SimBiology® (required for bioavailability simulation)
- Python environment:
  - Install Python and confirm compatibility with your MATLAB release (see the MATLAB–Python interface documentation).
  - Provide the Python executable path in MATLAB.
- RDKit:
  - Install with: pip install rdkit
- MATLAB Add-Ons:
  - Install required toolboxes (SimBiology, Bioinformatics, Parallel Computing) via Add-On Explorer or the MathWorks website.
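A quick way to find the executable path and confirm that RDKit is visible to that interpreter is to run a short check from the Python side. This is a minimal sketch; the helper name `describe_environment` is hypothetical, and the printed path is the value you would then supply to MATLAB (for example through `pyenv`).

```python
import importlib.util
import sys


def describe_environment(module_name="rdkit"):
    """Report the interpreter path, Python version, and whether a module is importable."""
    return {
        "executable": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        # find_spec returns None when the top-level module is not installed.
        "module_available": importlib.util.find_spec(module_name) is not None,
    }


info = describe_environment()
print("Python executable:", info["executable"])
print("Python version:", info["version"])
print("rdkit importable:", info["module_available"])
```

If the last line reports `False`, RDKit was most likely installed into a different Python environment than the one being inspected.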
Run the .mlx file inside each folder. Follow the inline documentation to step through the example.
Each example includes practice exercises to deepen understanding. Highlights include:
- Visual inspection and similarity quantification.
- Correlation of structure vs. LogP/LogS.
- Exploring structural diversity despite property similarity.
- GPU-accelerated clustering on toxicity datasets.
- Extending SimBiology workflows with SGLT2 inhibition models.
- Wieder, O., et al. Improved Lipophilicity and Aqueous Solubility Prediction with Composite Graph Neural Networks. Molecules 2021, 26, 6185.
- Datasets provided by Prof. Thierry Langer, University of Vienna (CC BY 3.0 AT).
The license is available in the License.txt file.
Copyright 2025 The MathWorks, Inc.