Fix bugs and improve robustness of ROCS scoring #19

Open
fulopjoz wants to merge 4 commits into CDDLeiden:dev from fulopjoz:fix/rocs-scoring-followup
@fulopjoz
Summary

Follow-up fixes for the ROCS scoring module merged in #15. Addresses several bugs found during production use of the OpenEye ROCS scorer and improves robustness of conformer generation.

Changes

OpenEye ROCS scorer (rocs_openeye.py):

  • Fix -shapeonly flag always passing "false" instead of "true" when enabled
  • Wire color_optimize parameter to the -optchem CLI flag (was silently ignored)
  • Add SMILES deduplication before conformer generation to avoid redundant work
  • Fix score remapping so duplicate SMILES receive identical scores
  • Make subprocess timeout configurable instead of hardcoded 300s
  • Remove OMP_NUM_THREADS="1" environment override from subprocess call
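The flag and deduplication fixes above can be illustrated with a minimal sketch. This is not the code in rocs_openeye.py: `build_rocs_flags` and `score_with_dedup` are hypothetical helper names, the `"true"`/`"false"` value for `-optchem` is assumed to follow the same convention the description gives for `-shapeonly`, and duplicates are detected by exact string match rather than by canonicalized SMILES.

```python
from typing import Callable, List, Sequence


def build_rocs_flags(shape_only: bool, color_optimize: bool) -> List[str]:
    """Render boolean options as explicit "true"/"false" CLI values.

    Hypothetical helper: deriving the string from the boolean avoids the
    class of bug where a flag is always emitted with a fixed value.
    """
    def as_cli(value: bool) -> str:
        return "true" if value else "false"

    return ["-shapeonly", as_cli(shape_only), "-optchem", as_cli(color_optimize)]


def score_with_dedup(
    smiles: Sequence[str],
    score_unique: Callable[[List[str]], List[float]],
) -> List[float]:
    """Score each distinct SMILES once, then map scores back to the input order.

    `score_unique` stands in for the expensive step (conformer generation
    plus overlay); it must return one score per unique SMILES, in order.
    """
    # dict.fromkeys keeps the first occurrence of each key in insertion order.
    unique = list(dict.fromkeys(smiles))
    scores = score_unique(unique)
    if len(scores) != len(unique):
        raise ValueError("scorer returned a different number of scores than inputs")

    by_smiles = dict(zip(unique, scores))
    # Duplicates look up the same entry, so they receive identical scores.
    return [by_smiles[s] for s in smiles]
```

Keying the remapping on the SMILES string itself, rather than on positional indices into the deduplicated list, is what guarantees that repeated inputs get the same score regardless of where they appear.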

RDKit conformer generator (conformer_generators.py):

  • Add timeout parameter for ETKDGv3 embedding to prevent pathological molecules from blocking the pipeline

Tutorial scripts (tutorial/advanced/rocs/):

  • Resolve paths relative to script location (__file__) instead of current working directory so scripts work when invoked from any directory
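A minimal sketch of the path-resolution pattern, assuming a hypothetical helper named `resolve_from_script` (the tutorial scripts may inline the expression instead):

```python
from pathlib import Path


def resolve_from_script(script_file: str, *parts: str) -> Path:
    """Build an absolute path relative to the directory containing a script.

    Unlike Path.cwd(), the result does not depend on the directory the
    script was launched from.
    """
    return Path(script_file).resolve().parent.joinpath(*parts)
```

Inside a script this would be called as `resolve_from_script(__file__, "data")`, where `"data"` is a placeholder directory name.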

RDKit ROCS scorer (rocs_rdkit.py):

  • Document alignment behavior of _score_single_reference (Gaussian shape overlay vs graph-based ROCS)

@martin-sicho

Tutorial scripts used Path.cwd() to build paths, causing failures when
invoked from a different directory. Now resolve paths relative to the
script file location via Path(__file__).resolve().parent.

Add docstring to _score_single_reference explaining Gaussian shape
overlay, the opt_param choices, and benchmarking results showing
that optimization objective has no effect on final TanimotoCombo
scores when sufficient conformer pairs are sampled.

Add configurable per-isomer timeout (default 120s) for ETKDGv3
embedding to prevent pathological molecules from blocking the
conformer generation pipeline indefinitely.