ChemCluster is an interactive web application for cheminformatics and molecular analysis, focused on forming and visualizing molecular clusters. It is built with Streamlit, RDKit, and scikit-learn.
Final project for the course Practical Programming in Chemistry — EPFL CH-200
ChemCluster is an interactive cheminformatics platform developed at EPFL in 2025 as part of the Practical Programming in Chemistry course. It is a user-friendly web application for exploring and analyzing chemical structures, either as a single molecule through its generated conformers or as a dataset of molecules.
This tool enables users to compute key molecular properties, visualize 2D and 3D structures, and perform clustering based on molecular similarity or conformer geometry. It also offers filtering options to help select clusters matching specific physicochemical criteria.
- Upload .sdf, .mol, or .csv files containing SMILES
- Input or draw a single molecule and generate 3D conformers
- Compute key molecular properties (MW, logP, H-bonding, etc.)
- Visualize molecules in 2D (RDKit) and interactively in 3D (py3Dmol)
- Cluster molecules using PCA + KMeans with silhouette score optimization
- Click points on the PCA plot to inspect molecules and properties
- Overlay and compare 3D cluster centroids for conformers
- Filter clusters based on desired property profiles
- Export results and clusters as .csv files
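The PCA + KMeans step with silhouette score optimization can be sketched roughly as follows. This is a minimal illustration, not ChemCluster's actual code: the function name `cluster_with_silhouette`, the k range, the standardization step, and the choice to run KMeans on the 2D PCA coordinates are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_with_silhouette(features, k_min=2, k_max=6, random_state=0):
    """Hypothetical helper: PCA to 2D, then keep the KMeans k with the best silhouette."""
    features = np.asarray(features, dtype=float)
    scaled = StandardScaler().fit_transform(features)
    coords = PCA(n_components=2, random_state=random_state).fit_transform(scaled)

    # silhouette_score needs at least 2 and at most n_samples - 1 clusters
    upper = min(k_max, len(coords) - 1)
    if upper < k_min:
        raise ValueError("Not enough samples for the requested range of k")

    best_k, best_score, best_labels = None, -1.0, None
    for k in range(k_min, upper + 1):
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(coords)
        score = silhouette_score(coords, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return coords, best_labels, best_k, best_score
```

Here `features` would be a matrix with one row per molecule, for example fingerprints or computed descriptors. The returned `coords` are the 2D points one would plot, and `best_labels` assigns each molecule to a cluster.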
- Install from PyPI:
pip install chemcluster
- Run the app:
chemcluster
This will open the ChemCluster interface in your browser.
- Install from source (for development):
git clone https://github.com/erubbia/ChemCluster.git
cd ChemCluster
conda env create -f environment.yml
conda activate chemcluster-env
(chemcluster-env) $ pip install -e .
Testing can be done with 'pytest' or 'tox':
(chemcluster-env) $ pytest
# or
(chemcluster-env) $ tox
Launching the app brings you to the main page, where you can select one of two modes:
Single molecule mode:
- Draw a molecule or paste a SMILES string to visualize and cluster its conformers
- View and overlay optimized 3D centroid structures
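For readers curious what single molecule mode involves, the sketch below generates and optimizes conformers with RDKit and groups them by heavy-atom RMSD. It is an illustration under stated assumptions: the function name `cluster_conformers`, the number of conformers, and the RMSD threshold are hypothetical, and Butina clustering is used here as a stand-in, so ChemCluster's own conformer clustering may differ.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina


def cluster_conformers(smiles, n_confs=20, rmsd_threshold=0.5, seed=42):
    """Hypothetical helper: embed and optimize conformers, then group them by RMSD."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # explicit hydrogens give more sensible 3D geometries

    conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=n_confs, randomSeed=seed))
    if not conf_ids:
        raise RuntimeError("Conformer embedding failed")

    # One (not_converged_flag, energy) pair is returned per conformer
    results = AllChem.MMFFOptimizeMoleculeConfs(mol, maxIters=500)
    energies = [energy for _, energy in results]

    # Conformers survive hydrogen removal, so the RMSD uses heavy atoms only
    heavy = Chem.RemoveHs(mol)
    # Flat lower-triangle RMSD matrix; conformers are aligned to the first one
    rmsd = AllChem.GetConformerRMSMatrix(heavy, prealigned=False)

    # In each Butina cluster, the first index is the cluster centroid
    clusters = Butina.ClusterData(
        rmsd, len(conf_ids), rmsd_threshold, isDistData=True, reordering=True
    )
    return mol, energies, clusters
```

The first index of each returned cluster identifies a representative conformer, which is the kind of centroid structure one would overlay and compare in 3D.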
Data set mode:
- Upload a SMILES data set to analyze chemical space
- Perform PCA + KMeans clustering with property-based filters
- Click to view molecules and export clusters
You can then select the cluster(s) you want to export: scroll to the bottom of the page and click "Download Cluster Molecules" to save them as a .csv file.
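The property calculation and cluster export in data set mode can be approximated outside the app as shown below. This is a hedged sketch: the helper names `compute_properties` and `clusters_to_csv`, the chosen descriptors, and the CSV column names are assumptions, not the format ChemCluster itself writes.

```python
import csv
import io

from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski


def compute_properties(smiles):
    """Hypothetical helper: basic descriptors, or None if the SMILES is invalid."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return {
        "smiles": smiles,
        "mol_wt": round(Descriptors.MolWt(mol), 2),
        "logp": round(Crippen.MolLogP(mol), 2),
        "h_donors": Lipinski.NumHDonors(mol),
        "h_acceptors": Lipinski.NumHAcceptors(mol),
    }


def clusters_to_csv(smiles_list, labels, selected_clusters):
    """Hypothetical helper: CSV text for the molecules in the selected clusters."""
    fieldnames = ["cluster", "smiles", "mol_wt", "logp", "h_donors", "h_acceptors"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for smiles, label in zip(smiles_list, labels):
        if label not in selected_clusters:
            continue
        props = compute_properties(smiles)
        if props is None:  # skip entries RDKit cannot parse
            continue
        writer.writerow({"cluster": label, **props})
    return buffer.getvalue()
```

For example, `clusters_to_csv(["CCO", "c1ccccc1", "CC(=O)O"], [0, 1, 0], {0})` returns CSV text containing only the two molecules assigned to cluster 0.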




