Decoding Functional Specialization in GPCRs through Evolution-Guided Residue Profiling - Supplementary Information

This directory contains supplementary information for the manuscript "Decoding Functional Specialization in GPCRs through Evolution-Guided Residue Profiling".

Directory Structure

Supplementary Information/
├── README.md                           # This file
├── Supplementary_Table.xlsx            # Comprehensive data table with all annotations
├── Figures/                            # Manuscript figures
│   ├── Figure1.png                    # Automated ortholog identification workflow
│   ├── Figure2.png                    # Class A, olfactory, and Class T analysis
│   ├── Figure3.png                    # Class B1 and B2 analysis
│   ├── Figure4.png                    # Class F analysis
│   ├── Figure5.png                    # Class C analysis
│   └── Figure6.png                    # Subtype-specific selective residues in ligand and transducer selectivity
└── code_and_data/                     # Computational analysis files
    ├── benchmark/                     # Interactive benchmark visualizations
    ├── ortholog_pipeline/             # Ortholog identification pipeline
    ├── CRs_and_SRs_on_structure/     # Structural visualizations
    └── scatter_plots_and_CR_SR_calculation/  # Scatter plot generation

Supplementary Table

Supplementary_Table.xlsx contains comprehensive data from the GPCR conservation analysis, including:

Data Content

Scatter Plot Information: Conservation percentages and entropy values for all residues
Initial Annotations: Detailed functional annotations assigned during analysis
Simplified Annotations: Streamlined functional categories for visualization
CR/SR Information: Classification of residues as Common (CR) or Selective (SR)
Class-Specific Data: Separate sheets for each GPCR class (A, B1, B2, C, F, T, Olfactory)

Simplified Annotation Categories

Ligand: Residues involved in ligand binding
Transducer: Residues involved in G protein/transducer binding
Known Motifs: Conserved motifs (CWxP, NPxxY, PIF, DRY, Na+ Pocket)
Disulfide Bridge: Cysteine residues forming disulfide bonds
Cholesterol: Cholesterol binding site residues
ICL2: Intracellular loop 2 residues (Class T specific)
Tethered Agonist: Residues involved in tethered agonist binding (Class B2)
VFT Ligand: Venus Flytrap domain ligand binding residues (Class C)
Allosteric Modulator: Residues involved in allosteric modulation
WNT Binding: Residues involved in WNT protein binding (Class F)
Other: Residues without specific functional annotation

Data Organization

Each GPCR class has its own worksheet containing:

Residue numbers (using class-specific reference proteins)
GPCRdb numbers for the given reference receptor
Family-wide conservation percentages across family members
Family-wide entropy values for sequence variability
Initial detailed annotations
Simplified functional categories
CR/SR classification with thresholds

Figures

The Figures/ directory contains all manuscript figures:

Figure 1: Automated ortholog identification and conservation analysis workflow
Figure 2: Family-wide conservation and structural mapping for Class A, olfactory, and Class T receptors
Figure 3: Conservation and structural organization for Class B1 and adhesion (B2) GPCRs
Figure 4: Evolutionary conservation patterns in Class F GPCRs across distinct domains
Figure 5: Evolutionary conservation patterns in Class C GPCRs across different domains
Figure 6: Subtype-specific selective residues in ligand and transducer selectivity

Code and Data

The code_and_data/ directory contains all computational analysis files:

benchmark/

Interactive HTML visualizations for benchmarking evolutionary predictions against experimental data
4 visualization files:
- FigS1_DMS_benchmark.html - Deep mutational scanning data (ADRB2, MC4R, GPR68, V2R)
- FigS2_correlation_plots.html - Correlation analyses across experimental datasets
- FigS3_evolutiıon_benchmark.html - Evolution-based functional predictions (OPSD, Sanders)
- Fig2cde.html - Combined view (ClinVar, mutational intolerance, activation network)
Consolidated mapping: Single unified GPCR residue mapping file for mapping class A receptors to HRH2
Data folder: Experimental datasets including DMS studies, clinical variants, and evolutionary data
Analysis scripts: Python scripts for calculating combined mutational intolerance scores
See benchmark/README.md for detailed documentation

ortholog_pipeline/

Complete pipeline for identifying GPCR orthologs
Python scripts for sequence processing and phylogenetic analysis
Human protein sequences organized by GPCR class
SLURM scripts for cluster execution

CRs_and_SRs_on_structure/

Structural visualizations of CR/SR mapping
PyMOL session files for interactive 3D analysis
AlphaFold2 models and experimental structures
PNG images used as manuscript figures

scatter_plots_and_CR_SR_calculation/

Scripts for generating scatter plots
CR/SR calculation algorithms
Label files with simplified annotations
Special analysis for Class F GPCRs
Class alignments: Multiple sequence alignments used for receptor mapping and analysis

Usage

Accessing Data

Open Supplementary_Table.xlsx for comprehensive analysis data
Navigate to specific class worksheets for detailed information
Use conservation and entropy values for further analysis

Viewing Figures

Open PNG files in any image viewer
Figures correspond to manuscript figures with same numbering

Interactive Benchmark Visualizations

Navigate to code_and_data/benchmark/
Open any HTML file in a modern web browser (Chrome, Firefox, Edge)
Explore interactive plots:
- FigS1: Deep mutational scanning benchmarks (10 panels)
- FigS2: Correlation analyses across datasets (20 panels)
- FigS3: Evolution-based predictions (2 panels)
- Fig2cde: Combined horizontal view (3 panels)
Features: hover for details, click legend to toggle, pan and zoom
All data loads locally - no internet connection required

Running Analysis

Navigate to code_and_data/ for computational scripts
Follow README files in each subdirectory for specific instructions
Use provided SLURM scripts for cluster execution
Run benchmark/data/calculate_combined_significance.py to calculate combined mutational intolerance scores

Alignment Strategy

The analysis uses representative sequences ("reps") and multiple sequence alignments to determine receptor sets and mapping:

Representative Sequence Selection

CD-HIT clustering identifies representative sequences from ortholog sets
Top 5 largest clusters are selected as representatives
Representatives provide evolutionary diversity while reducing computational complexity

Alignment Process

Initial Alignment: Representative sequences + human sequences aligned using MAFFT
Human Extraction: Human sequences extracted from the alignment
Final Analysis: Family-wide analysis performed on human-only alignments

Alignment Types

Full alignments: Complete receptor sequences with representatives
Human-only alignments: Extracted human sequences for family-wide analysis
Domain-specific alignments: Specific regions (VFT domain, CRD domain, etc.)

Impact of Alignment Choice

Different alignments produce different outcomes:

Receptor Set: Different representative sequences change receptor composition
Mapping: Alignment positions determine residue numbering and conservation calculations
Analysis Scope: Domain-specific alignments focus on particular functional regions

Alignment Files

The class_alignments/ directory contains:

Standard alignments with representatives
Human-only extracted alignments
Domain-specific alignments for particular regions
Scripts for extracting human sequences from representative alignments

Data Availability

All data and code are available at: https://github.com/CompGenomeLab/GPCR_Family_Divergence

Contact

For questions or issues with the data or code, please contact:

Berkay Selcuk: selcuk.1@buckeyemail.osu.edu
Ogün Adebali: oadebali@sabanciuniv.edu

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
Figures		Figures
code_and_data		code_and_data
LICENSE		LICENSE
README.md		README.md
Supplementary Figures.docx		Supplementary Figures.docx
Supplementary_Table_1.xlsx		Supplementary_Table_1.xlsx
Supplementary_Table_2.xlsx		Supplementary_Table_2.xlsx
Supplementary_Tables_All_Families.xlsx		Supplementary_Tables_All_Families.xlsx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Decoding Functional Specialization in GPCRs through Evolution-Guided Residue Profiling - Supplementary Information

Directory Structure

Supplementary Table

Data Content

Simplified Annotation Categories

Data Organization

Figures

Code and Data

benchmark/

ortholog_pipeline/

CRs_and_SRs_on_structure/

scatter_plots_and_CR_SR_calculation/

Usage

Accessing Data

Viewing Figures

Interactive Benchmark Visualizations

Running Analysis

Alignment Strategy

Representative Sequence Selection

Alignment Process

Alignment Types

Impact of Alignment Choice

Alignment Files

Data Availability

Contact

About

Uh oh!

Releases

Packages

Languages

License

CompGenomeLab/GPCR_Family_Divergence

Folders and files

Latest commit

History

Repository files navigation

Decoding Functional Specialization in GPCRs through Evolution-Guided Residue Profiling - Supplementary Information

Directory Structure

Supplementary Table

Data Content

Simplified Annotation Categories

Data Organization

Figures

Code and Data

benchmark/

ortholog_pipeline/

CRs_and_SRs_on_structure/

scatter_plots_and_CR_SR_calculation/

Usage

Accessing Data

Viewing Figures

Interactive Benchmark Visualizations

Running Analysis

Alignment Strategy

Representative Sequence Selection

Alignment Process

Alignment Types

Impact of Alignment Choice

Alignment Files

Data Availability

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages