SynDRA: Synonym Mapping for Alignment of Repurposing Therapeutics

SynDRA is a unified drug synonym mapping system designed to harmonize identifiers across major biomedical resources.
It bridges gaps between external drug sources and transcriptomic perturbation datasets such as LINCS/CMap, improving drug repurposing workflows by resolving inconsistent naming.

🌐 Web App

Try SynDRA interactively online: https://tolgacorbaci.shinyapps.io/syndra/

📌 Overview

- Integrates synonyms from KatDB, TTD, PRISM, and LINCS2020
- Normalizes and deduplicates synonyms into a single mapping
- Links identifiers across BRD_IDs, TTD_IDs, and PubChem CIDs
- Increases match rates for drug repurposing pipelines (e.g., +8.5% for FO5A benchmark set)

Result:
193,113 unique synonyms mapped to 33,858 BRD_IDs, 2,775 TTD_IDs, and 950 PubChem CIDs.

🔬 Methods

Data Sources

KatDB Synonyms
- Source: Kat Koler
- File: L1000_BRD_name_translated_drug_list.csv
- Purpose: Expand recognition of BRD compounds in L1000 assays
- Example: BRD-K52256627 → “chlorhexidine,” “N-[4-methylpiperazinyl]-…”
Therapeutic Targets Database (TTD)
- URL: TTD Download
- File: P1-04-Drug_synonyms.txt
- Purpose: Drug–target linked synonyms
- Example: D00AAN → “d00aan,” chemical descriptors
PRISM Drug Synonyms
- URL: PRISM GitHub
- File: PRISM_drug_synonyms.csv
- Purpose: Supports MOA enrichment
- Example: PubChem_CID 11314340 → “a-674563”
LINCS 2020 Compound Metadata
- URL: Clue.io Data Dashboard
- File: compoundinfo_beta.txt
- Purpose: Standard compound identifiers for L1000 perturbation studies

Data Characteristics

Initial entry counts
- katdb_df: 13,176
- ttd_df: 299,047
- prism_df: 112,784
Initial unique identifiers
- BROAD_drug_IDs: 5,539
- TTD_drug_IDs: 30,713
- PubChem_CIDs: 1,351
LINCS2020 coverage
- Unique BROAD_drug_IDs: 33,613
- Synonyms: 34,234
LINCS2020 + KatDB combined
- Unique BROAD_drug_IDs: 33,858
- Synonyms: 45,617

Data Preparation

- Convert all synonyms to lowercase
- Split multi-synonym strings into separate rows
- Strip whitespace and formatting artifacts
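
The preparation steps above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names (`BROAD_drug_ID`, `synonyms`), the `;` delimiter, the choice of quote characters as "formatting artifacts", and the helper `prepare_synonyms` are all assumptions made for the example.

```python
import pandas as pd


def prepare_synonyms(df: pd.DataFrame, col: str = "synonyms", sep: str = ";") -> pd.DataFrame:
    """Lowercase, split, and strip a synonym column (hypothetical helper)."""
    out = df.copy()
    # Lowercase first so later matching is case-insensitive, then split
    # multi-synonym strings into lists (the delimiter is an assumption).
    out[col] = out[col].str.lower().str.split(sep)
    # One synonym per row.
    out = out.explode(col)
    # Strip surrounding whitespace and stray quote characters.
    out[col] = out[col].str.strip().str.strip("\"'")
    # Drop empty or missing synonyms, then exact duplicate rows.
    out = out[out[col].notna() & (out[col] != "")]
    return out.drop_duplicates().reset_index(drop=True)


raw = pd.DataFrame(
    {
        "BROAD_drug_ID": ["BRD-K52256627"],
        "synonyms": [" Chlorhexidine ; CHLORHEXIDINE; chlorhexidine, combinations "],
    }
)
prepared = prepare_synonyms(raw)
```

Splitting on `;` rather than `,` matters here because some synonyms, such as "chlorhexidine, combinations", contain commas themselves; the real source files may use a different separator.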

Merging Strategy

- Synonym explosion → one synonym per row
- Outer join on synonyms → maximize matches
- ID propagation → forward/backward fill with shared synonyms
- Grouping & aggregation → unique sets of IDs
- Filter → drop rows without BROAD_drug_ID
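
The merging steps can be illustrated on toy data as follows. This is a sketch under stated assumptions rather than the pipeline itself: the frames, IDs, and column names are invented for the example, and "ID propagation" is read here as copying an ID reached through one shared synonym to the other synonyms of the same BROAD_drug_ID. The notebook may implement this step differently.

```python
import pandas as pd

# Toy inputs with one synonym per row (illustrative values, not real data).
lincs = pd.DataFrame({"synonyms": ["drug a", "drug-a hcl"],
                      "BROAD_drug_ID": ["BRD-0001", "BRD-0001"]})
ttd = pd.DataFrame({"synonyms": ["drug a", "orphan"],
                    "TTD_drug_ID": ["D0001", "D0002"]})
prism = pd.DataFrame({"synonyms": ["drug a"], "PubChem_CID": ["111"]})

# Outer joins on the synonym column keep every synonym from every source.
merged = lincs.merge(ttd, on="synonyms", how="outer").merge(prism, on="synonyms", how="outer")

# ID propagation (assumed reading): within each BROAD_drug_ID, fill missing
# TTD and PubChem IDs forward and backward from rows that have them.
other_ids = ["TTD_drug_ID", "PubChem_CID"]
has_brd = merged["BROAD_drug_ID"].notna()
merged.loc[has_brd, other_ids] = (
    merged[has_brd]
    .groupby("BROAD_drug_ID")[other_ids]
    .transform(lambda s: s.ffill().bfill())
)


def unique_join(values: pd.Series):
    """Collapse a group's IDs into a sorted, '|'-joined string (None if empty)."""
    ids = sorted(set(values.dropna()))
    return "|".join(ids) if ids else None


# Grouping and aggregation: one row per (synonym, BROAD_drug_ID) pair.
grouped = merged.groupby(["synonyms", "BROAD_drug_ID"], dropna=False, as_index=False)[
    other_ids
].agg(unique_join)

# Filter: drop synonyms that never resolved to a BROAD_drug_ID.
final = grouped.dropna(subset=["BROAD_drug_ID"]).reset_index(drop=True)
```

In this toy run, "drug-a hcl" has no TTD or PubChem entry of its own but inherits both IDs through "drug a", while "orphan" is dropped because it never links to a BROAD_drug_ID.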

📊 Results

Final Dataset

Rows: 193,221
Unique synonyms: 193,113
Identifiers:
- BROAD_drug_IDs: 33,858
- TTD_drug_IDs: 2,775
- PubChem_CIDs: 950
Missing values (NaNs):
- BROAD_drug_ID: 0
- synonyms: 0
- TTD_drug_ID: 45,737
- PubChem_CID: 88,237
Duplicate rows: 0
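
Summary figures of this kind can be recomputed from any mapping table with a few pandas calls. The sketch below assumes a DataFrame with the columns `BROAD_drug_ID`, `synonyms`, `TTD_drug_ID`, and `PubChem_CID`; the helper `summarize_mapping` and the third toy row are hypothetical.

```python
import pandas as pd


def summarize_mapping(df: pd.DataFrame) -> dict:
    """Return row, uniqueness, missing-value, and duplicate counts (hypothetical helper)."""
    id_cols = ["BROAD_drug_ID", "TTD_drug_ID", "PubChem_CID"]
    return {
        "rows": len(df),
        "unique_synonyms": df["synonyms"].nunique(),
        # nunique() ignores missing values by default.
        "unique_ids": {c: df[c].nunique() for c in id_cols},
        "missing": {c: int(n) for c, n in df.isna().sum().items()},
        "duplicate_rows": int(df.duplicated().sum()),
    }


toy = pd.DataFrame(
    {
        "BROAD_drug_ID": ["BRD-K52256627", "BRD-K52256627", "BRD-0002"],
        "synonyms": ["chlorhexidine", "chlorhexidine, combinations", "toy compound"],
        "TTD_drug_ID": ["D0V4GY", "D0V4GY", None],
        "PubChem_CID": ["9552079", "9552079", None],
    }
)
summary = summarize_mapping(toy)
```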

Sample Entries

BROAD_drug_ID	Synonym	TTD_drug_ID	PubChem_CID
BRD-K52256627	chlorhexidine	D0V4GY	9552079
BRD-K52256627	chlorhexidine, combinations	D0V4GY	9552079
BRD-K52256627	1,1'-hexamethylenebis[...]	D0V4GY	9552079
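
Rows like these can back a simple name lookup. The following sketch builds an in-memory index from the first two sample rows and resolves a free-text drug name to its identifiers; the function `resolve` and the normalization it applies (trim plus lowercase) are assumptions, not part of SynDRA's published interface.

```python
from collections import defaultdict

# (BROAD_drug_ID, synonym, TTD_drug_ID, PubChem_CID), taken from the sample entries.
rows = [
    ("BRD-K52256627", "chlorhexidine", "D0V4GY", "9552079"),
    ("BRD-K52256627", "chlorhexidine, combinations", "D0V4GY", "9552079"),
]

# Map each normalized synonym to the set of identifier triples it points to.
index = defaultdict(set)
for brd_id, synonym, ttd_id, cid in rows:
    index[synonym.strip().lower()].add((brd_id, ttd_id, cid))


def resolve(name: str) -> list:
    """Return all identifier triples for a drug name, or an empty list."""
    return sorted(index.get(name.strip().lower(), set()))


hits = resolve("  Chlorhexidine ")
misses = resolve("unknown compound")
```

Returning a list rather than a single triple leaves room for synonyms that map to more than one compound.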

Match Statistics

Category	Initial Entries	Final Entries
BROAD_drug_ID	45,617	33,858
TTD_drug_ID	299,047	2,775
PubChem_CID	112,784	950
Total merged rows	—	435,530
Final unique synonym groups	—	193,221
Matched identifier instances	—	1,145

🚀 Usage

Clone and run the pipeline:

git clone https://github.com/hidelab/SynDRA.git
cd SynDRA
jupyter notebook SynDRA_pipeline.ipynb
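
In a downstream repurposing workflow, the effect of the mapping can be checked by comparing how many query names resolve with and without the extra synonyms. The sketch below is a generic illustration on toy names; it does not reproduce the benchmark mentioned in the overview, and `match_rate` is a hypothetical helper.

```python
def match_rate(queries, known_names):
    """Fraction of query names found in a set of known (lowercase) names."""
    normalized = [q.strip().lower() for q in queries]
    if not normalized:
        return 0.0
    hits = sum(1 for q in normalized if q in known_names)
    return hits / len(normalized)


# Baseline: primary names only. With mapping: primary names plus synonyms.
primary_names = {"chlorhexidine"}
synonym_names = primary_names | {"chlorhexidine, combinations"}

queries = ["Chlorhexidine", "chlorhexidine, combinations", "not-a-drug", "another unknown"]
baseline = match_rate(queries, primary_names)
with_synonyms = match_rate(queries, synonym_names)
gain = with_synonyms - baseline
```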

📄 License

This project is licensed under the MIT License.

📚 Citation

If you use SynDRA in your work, please cite:

T. Corbaci et al., SynDRA: Synonym Mapping for Alignment of Repurposing Therapeutics (Brazilian Symposium on Bioinformatics, BSB, 2025).
