A comprehensive database combining:
- NORMAN compounds (NORMAN-SusDat)
- CCS values (experimental and predicted)
- Retention times (experimental and predicted)
- MS2 spectra from MassBank

1_NORMAN_and_CCSbase.ipynb:
- Combines NORMAN database with CCS values
- Standardizes SMILES structures
- Merges experimental and predicted CCS values

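As a rough illustration of the SMILES standardization and CCS merging steps in this notebook, here is a minimal sketch. It assumes that "standardizing" means RDKit canonicalization and that experimental values take priority over predicted ones; the notebook may do more (salt stripping, adduct handling). The column names (`smiles`, `ccs`), the function names, and the CCS numbers are hypothetical placeholders, not taken from the notebook.

```python
import pandas as pd
from rdkit import Chem


def canonical_smiles(smiles):
    """Return RDKit's canonical SMILES, or None if the input cannot be parsed."""
    if not isinstance(smiles, str):
        return None
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None


def merge_ccs(compounds, experimental, predicted):
    """Attach CCS values to compounds, preferring experimental over predicted."""
    frames = []
    for df in (compounds, experimental, predicted):
        df = df.copy()
        # Canonical SMILES make differently written structures comparable.
        df["smiles_std"] = df["smiles"].map(canonical_smiles)
        frames.append(df.dropna(subset=["smiles_std"]))
    compounds, experimental, predicted = frames

    exp = experimental[["smiles_std", "ccs"]].rename(columns={"ccs": "ccs_exp"})
    pred = predicted[["smiles_std", "ccs"]].rename(columns={"ccs": "ccs_pred"})
    merged = compounds.merge(exp, on="smiles_std", how="left")
    merged = merged.merge(pred, on="smiles_std", how="left")

    # Use the experimental value when present, otherwise fall back to the prediction.
    merged["ccs"] = merged["ccs_exp"].fillna(merged["ccs_pred"])
    merged["ccs_source"] = "predicted"
    merged.loc[merged["ccs_exp"].notna(), "ccs_source"] = "experimental"
    merged.loc[merged["ccs"].isna(), "ccs_source"] = None
    return merged


# Toy inputs: the CCS numbers are placeholders, not measured values.
compounds = pd.DataFrame(
    {"name": ["A", "B", "C"], "smiles": ["OCC", "C1=CC=CC=C1", "CC(=O)O"]}
)
experimental = pd.DataFrame({"smiles": ["CCO"], "ccs": [100.0]})
predicted = pd.DataFrame({"smiles": ["CCO", "c1ccccc1"], "ccs": [110.0, 120.0]})

result = merge_ccs(compounds, experimental, predicted)
print(result[["name", "smiles_std", "ccs", "ccs_source"]])
```

Keeping a `ccs_source` column alongside the merged value makes it possible to tell later which entries rest on a measurement and which on a prediction.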
2_add_RT.ipynb:
- Adds predicted retention times
- Uses QSRR model
- Data cleaning and validation

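The cleaning and validation step could look something like the sketch below. The actual checks in the notebook are not documented here, so the rules shown (numeric coercion, positive values only, an optional upper bound, one row per compound) are assumptions, as are the `inchikey` and `rt_pred` column names. The QSRR prediction itself is not shown, because the `QSRR_predictor` interface is not described in this README.

```python
import pandas as pd


def clean_retention_times(df, rt_col="rt_pred", key_col="inchikey", max_rt=None):
    """Drop rows whose retention time is missing, non-numeric, or out of range."""
    out = df.copy()
    # Non-numeric entries become NaN so they can be filtered out below.
    out[rt_col] = pd.to_numeric(out[rt_col], errors="coerce")
    valid = out[rt_col].notna() & (out[rt_col] > 0)
    if max_rt is not None:
        valid &= out[rt_col] <= max_rt
    # Keep the first valid row for each compound key.
    return out[valid].drop_duplicates(subset=[key_col]).reset_index(drop=True)


# Toy input with placeholder keys and values.
raw = pd.DataFrame(
    {
        "inchikey": ["KEY-A", "KEY-B", "KEY-C", "KEY-D", "KEY-A", "KEY-E"],
        "rt_pred": ["5.2", "n/a", -1.0, 40.0, 6.0, 12.5],
    }
)
cleaned = clean_retention_times(raw, max_rt=30.0)
print(cleaned)
```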
3_add_MS2.ipynb:
- Integrates MS2 spectra from MassBank
- Filters for ESI-QTOF data
- Matches spectra with compounds

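To make the filtering and matching steps concrete, here is a small standard-library sketch that reads records in the MassBank text format, keeps those whose `AC$INSTRUMENT_TYPE` mentions both ESI and QTOF, and groups them by compound. Matching on InChIKey is an assumption; the notebook may match on another identifier. The function names, accessions, keys, and peaks below are hypothetical toy values.

```python
def parse_massbank_record(text):
    """Extract a few fields and the peak list from one MassBank record."""
    record = {"accession": None, "inchikey": None, "instrument_type": None, "peaks": []}
    in_peaks = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_peaks and line.startswith("  "):
            # Peak lines are indented: m/z, intensity, relative intensity.
            mz, intensity = line.split()[:2]
            record["peaks"].append((float(mz), float(intensity)))
            continue
        in_peaks = line.startswith("PK$PEAK:")
        if line.startswith("ACCESSION:"):
            record["accession"] = line.split(":", 1)[1].strip()
        elif line.startswith("AC$INSTRUMENT_TYPE:"):
            record["instrument_type"] = line.split(":", 1)[1].strip()
        elif line.startswith("CH$LINK: INCHIKEY"):
            record["inchikey"] = line.split()[-1]
    return record


def is_esi_qtof(record):
    """True if the instrument type mentions both ESI and QTOF."""
    instrument = (record["instrument_type"] or "").upper()
    return "ESI" in instrument and "QTOF" in instrument


def match_spectra(records, compound_keys):
    """Group ESI-QTOF records by InChIKey, keeping only known compounds."""
    matched = {}
    for record in records:
        if is_esi_qtof(record) and record["inchikey"] in compound_keys:
            matched.setdefault(record["inchikey"], []).append(record)
    return matched


def toy_record(accession, instrument, inchikey):
    """Build a tiny record with placeholder values for demonstration."""
    return "\n".join([
        f"ACCESSION: {accession}",
        f"CH$LINK: INCHIKEY {inchikey}",
        f"AC$INSTRUMENT_TYPE: {instrument}",
        "PK$NUM_PEAK: 2",
        "PK$PEAK: m/z int. rel.int.",
        "  50.0 100.0 500",
        "  75.0 200.0 999",
        "//",
    ])


records = [
    parse_massbank_record(toy_record("TOY-0001", "LC-ESI-QTOF", "KEY-A")),
    parse_massbank_record(toy_record("TOY-0002", "LC-ESI-ITFT", "KEY-A")),
    parse_massbank_record(toy_record("TOY-0003", "LC-ESI-QTOF", "KEY-Z")),
]
matched = match_spectra(records, {"KEY-A", "KEY-B"})
print({key: [r["accession"] for r in recs] for key, recs in matched.items()})
```

In this toy run only `TOY-0001` survives: `TOY-0002` is dropped by the instrument filter and `TOY-0003` has no matching compound.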
4_database_info.ipynb:
- Statistical analysis
- Data visualization
- Coverage assessment

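A coverage assessment can be as simple as counting, for each data type, how many compounds have a value. The sketch below assumes one row per compound and hypothetical column names (`ccs_exp`, `ccs_pred`, `rt_pred`, `ms2`); the numbers are placeholders, not statistics from the actual database.

```python
import pandas as pd


def coverage(df, columns):
    """Count and percentage of rows with a non-missing value in each column."""
    summary = pd.DataFrame({"n_available": df[columns].notna().sum()})
    summary["percent"] = 100 * summary["n_available"] / len(df)
    return summary


# Toy table: None marks a compound with no value for that data type.
database = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "ccs_exp": [100.0, None, None, None],
        "ccs_pred": [110.0, 120.0, 130.0, 140.0],
        "rt_pred": [5.2, 6.1, None, 12.5],
        "ms2": ["TOY-0001", None, "TOY-0003", None],
    }
)
report = coverage(database, ["ccs_exp", "ccs_pred", "rt_pred", "ms2"])
print(report)
```

The resulting table can be passed directly to a plotting call such as `report["percent"].plot.bar()` if a chart is preferred.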
- Python 3.8+
- Required packages:
  - pandas
  - rdkit
  - QSRR_predictor
  - matplotlib
  - seaborn

- Clone this repository: `git clone https://github.com/narvall018/NORMAN_CCS_RT_MS2_database.git`