Hi!
I am using the Tpocket module to validate how well fpocket identifies ligandable pockets in a dataset of crystallized ligands. I have a few questions regarding Tpocket’s output and interpretation.
1. Interpretation of stats_p.txt (POS6 Column)
Tpocket generates two types of output files for each of its six ranking criteria:
- stats_g.txt: General statistics across the dataset.
- stats_p.txt: Per-protein statistics, which includes a "POS6" column.
My first question is:
Does the "POS6" column in stats_p.txt indicate the rank of the actual ligand-binding pocket?
If so, aggregating the per-protein statistics should reproduce the results in stats_g.txt. However, when I visually inspect the pockets ranked first by POS6 or by fpocket’s default ranking, I frequently find that they are not the actual ligand-binding pocket.
2. Discrepancies in Pocket Ranking
I tried to look into this by implementing the Mutual Overlap Criterion (MOC) as described in the fpocket paper. If I understand correctly, this criterion should be identical to the POS6 criterion, so both should give the same results. In practice, my implementation agrees with my visual inspection much better than Tpocket’s built-in rankings do. After visual inspection, I also find that Tpocket often assigns the ligand-binding pocket incorrectly.
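For reference, an overlap check of this kind can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not Tpocket's code: the function name `passes_moc` is hypothetical, distances are measured to alpha sphere centers, and the default thresholds (at least 50% of ligand atoms within 3 Å of a sphere, at least 20% of spheres within 3 Å of the ligand) reflect my reading of the paper and should be checked against it.

```python
import math

def passes_moc(ligand_atoms, sphere_centers, dist=3.0,
               min_lig_frac=0.5, min_pocket_frac=0.2):
    """Return True if a pocket and a ligand overlap mutually.

    ligand_atoms and sphere_centers are sequences of (x, y, z) tuples.
    The thresholds are assumptions, not values taken from Tpocket.
    """
    if not ligand_atoms or not sphere_centers:
        return False
    # Ligand atoms that have at least one alpha sphere center nearby.
    lig_near = sum(
        1 for a in ligand_atoms
        if any(math.dist(a, s) <= dist for s in sphere_centers)
    )
    # Alpha sphere centers that have at least one ligand atom nearby.
    sph_near = sum(
        1 for s in sphere_centers
        if any(math.dist(s, a) <= dist for a in ligand_atoms)
    )
    return (lig_near / len(ligand_atoms) >= min_lig_frac
            and sph_near / len(sphere_centers) >= min_pocket_frac)
```

Both conditions must hold at once, so a very large pocket that merely touches the ligand fails the second test. Whether distances are taken to sphere centers or to sphere surfaces could change the outcome, which is why that choice is listed as an assumption.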
Some numbers:
- According to stats_g.txt, the ratio of good predictions under the consensus overlap criterion is 0.86 for the top-1 pockets.
- When I aggregate stats_p.txt on POS6, the true positive rate (TPR) is indeed around 0.86 for the top-1 pockets.
- However, when I run my own implementation of the MOC, the TPR drops to 0.28 for the top-1 pockets.
This aligns with my visual inspections, where the ligand’s actual binding pocket is often incorrectly assigned.
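The aggregation on POS6 mentioned above can be sketched as follows. This is a minimal sketch, assuming the per-protein file is whitespace-delimited with a single header row (as in the example data below) and that a rank of 0 means no pocket matched the ligand; the function name `top_k_rate` is hypothetical.

```python
def top_k_rate(text, column="POS6", k=1):
    """Fraction of proteins whose rank in `column` lies between 1 and k."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return 0.0
    header, rows = lines[0], lines[1:]
    idx = header.index(column)
    ranks = [int(row[idx]) for row in rows]
    # Assumption: rank 0 means the ligand pocket was not found at all,
    # so it never counts as a hit but still counts in the denominator.
    hits = sum(1 for r in ranks if 1 <= r <= k)
    return hits / len(ranks)
```

Calling `top_k_rate(open("stats_p.txt").read(), "POS6", 1)` would then give a top-1 ratio that can be compared with the `Rank <= 1` line of the general statistics.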
So in sum, my questions are:
- Am I misinterpreting the POS6 column?
- Are there additional factors influencing how Tpocket ranks pockets that I should consider?
- Is there perhaps a known issue with Tpocket's ranking approach?
Any insights into the workings of Tpocket would be greatly appreciated!
Thanks a lot!
Example Data
Below is an excerpt from stats_g.txt, showing the overall performance of the consensus overlap criterion:
--
- _ Concensus overlap criteria (alpha sphere overlap) _ -
--
Ratio of good predictions (dist = 3A)
Rank <= 1 : 0.86
Rank <= 2 : 0.90
Rank <= 3 : 0.91
Rank <= 4 : 0.91
Rank <= 5 : 0.91
Rank <= 6 : 0.91
Rank <= 7 : 0.92
Rank <= 8 : 0.92
Rank <= 9 : 0.92
Rank <= 10 : 0.92
Rank <= 15 : 0.92
Rank <= 20 : 0.92
Rank <= 50 : 0.92
Rank <= 100 : 0.92
Rank <= 200 : 0.92
-
Mean relative overlap : 74.55
Mean pocket volume (estimation) : 1061.59
Mean number of pocket atom : 62
This suggests that in 86% of cases, the top-ranked pocket is within 3Å of the ligand.
Then, an excerpt from stats_p.txt, which provides per-protein rankings:
LIG COMPLEXE APO NB_PCK CRIT1 CRIT2 CRIT3 CRIT4 CRIT5 CRIT6 POS1 POS2 POS3 POS4 POS5 POS6 REL_OVLP1 REL_OVLP2 REL_OVLP3 REL_OVLP4 REL_OVLP5 REL_OVLP6 LIGMASS LIGVOL PVOL3 NATM3 PVOL6 NATM6
UNL 4m8x_protein_ligand_combined.pdb 4m8x_protein_ligand_combined.pdb 17 62.50 85.92 3.98 0.71 0.80 1.00 1 1 1 1 1 1 800.00 90.14 0.00 71.43 79.66 79.66 679.40 579.49 1313.84 64 1313.84 64
UNL 4mr3_protein_ligand_combined.pdb 4mr3_protein_ligand_combined.pdb 6 100.00 89.19 3.10 0.88 0.59 1.00 1 1 1 1 1 1 2550.00 137.84 0.00 87.50 59.21 59.21 308.20 233.80 990.88 51 990.88 51
UNL 5mrb_protein_ligand_combined.pdb 5mrb_protein_ligand_combined.pdb 20 100.00 66.15 3.70 0.74 0.82 1.00 1 1 1 1 1 1 2600.00 80.00 0.00 74.42 81.94 81.94 540.41 523.31 1149.40 52 1149.40 52
UNL 3nyx_protein_ligand_combined.pdb 3nyx_protein_ligand_combined.pdb 24 100.00 100.00 2.75 1.00 0.61 1.00 1 1 1 1 1 1 2275.00 165.45 0.00 100.00 60.65 60.65 395.85 278.62 1460.10 91 1460.10 91
UNL 4w97_protein_ligand_combined.pdb 4w97_protein_ligand_combined.pdb 29 57.14 74.55 0.78 0.62 0.37 1.00 2 1 2 1 1 1 457.14 189.09 189.09 62.50 37.23 37.23 935.39 664.84 655.70 32 3973.62 208
UNL 2bmk_protein_ligand_combined.pdb 2bmk_protein_ligand_combined.pdb 23 57.14 87.50 2.77 0.81 0.64 1.00 1 1 1 1 1 1 571.43 125.00 0.00 80.95 63.93 63.93 303.10 199.65 790.61 40 790.61 40
UNL 6c2r_protein_ligand_combined.pdb 6c2r_protein_ligand_combined.pdb 47 100.00 95.12 2.41 1.00 0.71 1.00 1 1 1 1 1 1 1966.67 143.90 0.00 100.00 70.53 70.53 463.74 432.10 1131.84 59 1131.84 59
UNL 4ayu_protein_ligand_combined.pdb 4ayu_protein_ligand_combined.pdb 62 0.00 0.00 0.00 0.00 0.00 0.00 0 0 0 0 0 0 0.00 0.00 0.00 0.00 0.00 0.00 146.08 120.91 0.00 -1 0.00 -1
UNL 5oht_protein_ligand_combined.pdb 5oht_protein_ligand_combined.pdb 36 100.00 100.00 0.00 1.00 0.31 1.00 1 1 0 1 1 1 1620.00 324.00 0.00 100.00 30.66 30.66 198.13 121.79 0.00 -1 1216.02 81
Despite POS6 suggesting that these pockets rank first, the actual ligand-binding pocket is frequently assigned incorrectly.