Drop docopt, Werkzeug, and obsolete deps #779
Merged
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Check dative bond types directly instead of comparing SMILES strings, which vary across RDKit versions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… config Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace deprecated setuptools.distutils.strtobool with string comparison
- Migrate all CLI scripts from unmaintained docopt to argparse
- Remove obsolete glob2 dependency (Python 3.10+ has this built-in)
- Remove tensorflow from examples extras (restored per user request)
- Update ReadTheDocs from EOL ubuntu-20.04 to ubuntu-22.04
- Update Node.js from EOL 16 to 22 in CI
- Bump actions/setup-python v4->v5, setup-node v3->v4, codecov v1->v4, checkout v3->v4
- Update copyright year in docs to 2020-2026
- Remove unused .style.yapf (project uses black)
- Remove accidental platform-specific binary artifact (.tar.gz)
- Deduplicate _COMPOUND_STRUCTURAL_IDENTIFIERS into ord_schema.__init__
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
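For the strtobool replacement in the list above, a plain string comparison might look like the following minimal sketch. The helper name env_flag, the variable name ORD_EXAMPLE_FLAG, and the set of accepted truthy strings are assumptions made for illustration; they are not taken from rdkit_mappers.py.

```python
import os

# Hypothetical set of strings treated as "true"; the check used in the PR may differ.
_TRUTHY = frozenset({"1", "true", "yes", "on"})


def env_flag(name: str, default: bool = False) -> bool:
    """Returns True if the named environment variable holds a truthy string."""
    value = os.environ.get(name)
    if value is None:
        # Variable is unset: fall back to the caller-supplied default.
        return default
    return value.strip().lower() in _TRUTHY


if __name__ == "__main__":
    os.environ["ORD_EXAMPLE_FLAG"] = "True"  # hypothetical variable name
    print(env_flag("ORD_EXAMPLE_FLAG"))
```

One design difference worth noting: distutils' strtobool raised ValueError on unrecognized input, whereas this sketch simply treats any unrecognized value as false.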
Use ord_schema.COMPOUND_STRUCTURAL_IDENTIFIERS directly at the call site in resolvers.py. The constant was never used in updates.py at all, so remove the unused import ord_schema there too. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolve docopt vs argparse conflicts by keeping argparse CLI; bump version to 0.3.100; drop duplicate test_dataset_missing_name; black-format merged scripts. Co-authored-by: Cursor <noreply@cursor.com>
…pendencies
# Conflicts:
# .github/workflows/publish.yml
# .github/workflows/run_tests.yml
# ord_schema/orm/rdkit_mappers.py
# ord_schema/orm/scripts/add_datasets.py
# ord_schema/resolvers.py
# ord_schema/scripts/check_pb.py
# ord_schema/scripts/parse_uspto.py
# ord_schema/scripts/process_dataset.py
# ord_schema/scripts/validate_dataset.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #779 +/- ##
==========================================
+ Coverage 71.31% 71.78% +0.47%
==========================================
Files 23 23
Lines 2426 2492 +66
Branches 565 567 +2
==========================================
+ Hits 1730 1789 +59
- Misses 583 590 +7
Partials 113 113
Only one caller remains after updates.py dropped its copy, so the public constant in ord_schema.__init__ is no longer justified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- process_dataset: drop redundant dest= on --no-validate; _run_updates now takes explicit kwargs instead of the whole argparse Namespace
- parse_uspto: move parse_args above main for consistency
- resolvers: restore original constant order
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
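The two process_dataset points above can be illustrated with a small argparse sketch. Only the --no-validate flag comes from the commit message; the --input and --update flags, the function names, and the returned dictionary are hypothetical stand-ins, not the script's actual interface.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch.")
    parser.add_argument("--input", required=True, help="Input file pattern (hypothetical flag).")
    # No dest= needed: argparse derives the attribute name `no_validate` from the flag itself.
    parser.add_argument("--no-validate", action="store_true", help="Skip validation.")
    parser.add_argument("--update", action="store_true", help="Apply updates (hypothetical flag).")
    return parser.parse_args(argv)


def run_updates(*, input_pattern: str, validate: bool, update: bool) -> dict:
    """Takes explicit keyword arguments, so it can be called without argparse."""
    return {"input_pattern": input_pattern, "validate": validate, "update": update}


def main(argv=None) -> dict:
    args = parse_args(argv)
    # Unpack the Namespace at the boundary instead of passing it through.
    return run_updates(
        input_pattern=args.input,
        validate=not args.no_validate,
        update=args.update,
    )


if __name__ == "__main__":
    print(main(["--input", "data/*.pb", "--no-validate"]))
```

Passing explicit keyword arguments keeps the worker function independent of the CLI layer, which makes it straightforward to call from tests.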
pylint was removed when .pylintrc was deleted; these disable comments have been dead weight since. Also simplify one no-op comprehension to list() while touching the file. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The only use was werkzeug.security.safe_join in id_filename() to build
a shard path like data/{shard}/{basename}. Replaced with posixpath.join
plus an isalnum() check on the shard, which is stricter than safe_join
and matches our actual (hex-like) dataset ID shape.
Werkzeug remains available transitively via tensorboard for the
examples extra, but ord-schema no longer pulls in a web framework as
a required runtime dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
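As a rough illustration of the replacement described in this commit, the sketch below builds a data/{shard}/{basename} path with posixpath.join and rejects shards that are not alphanumeric. The function name sketch_id_filename, the way the prefix and suffix are split, the two-character shard, and the example filename are assumptions; the real id_filename() may differ in these details.

```python
import os
import posixpath


def sketch_id_filename(filename: str) -> str:
    """Hypothetical stand-in for id_filename(): returns data/{shard}/{basename}."""
    basename = os.path.basename(filename)
    # Assumed layout: "<prefix>-<suffix>", where the prefix starts with "ord".
    prefix, _, suffix = basename.partition("-")
    if not prefix.startswith("ord"):
        raise ValueError(f"basename does not start with 'ord': {basename!r}")
    # Assumed shard: the first two characters of the suffix.
    shard = suffix[:2]
    if len(shard) != 2 or not shard.isalnum():
        raise ValueError(f"shard is not alphanumeric: {shard!r}")
    return posixpath.join("data", shard, basename)


if __name__ == "__main__":
    print(sketch_id_filename("/tmp/ord_dataset-0123abcd.pb"))
    # → data/01/ord_dataset-0123abcd.pb
```

Because "." and "/" are not alphanumeric, a shard such as ".." can never reach posixpath.join in this sketch.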
…cies
Fold the Werkzeug removal into the dependency cleanup PR; it's a tiny change and in the same spirit as dropping docopt + glob2.
# Conflicts:
# pyproject.toml
Mirrors what werkzeug.security.safe_join used to enforce: verify that the constructed path is normalized and still inside the data/ root. Defense-in-depth — today the shard isalnum check is the only way a traversal string could slip past os.path.basename + the "ord" prefix rule, but this guards against future regressions if either of those rules gets loosened. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
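A containment check of the kind this commit describes could be sketched as follows. The helper name check_inside_root and its exact conditions are assumptions for illustration; this is not the code added in the PR, only one way to verify that a path is already normalized and sits under data/.

```python
import posixpath


def check_inside_root(path: str, root: str = "data") -> str:
    """Hypothetical helper: raises unless `path` is normalized and under `root`."""
    normalized = posixpath.normpath(path)
    # A path that changes under normalization contained "..", ".", or "//" segments.
    if normalized != path:
        raise ValueError(f"path is not normalized: {path!r}")
    # Compare against "root/" so that e.g. "database/x" does not pass as "data".
    if not normalized.startswith(root + "/"):
        raise ValueError(f"path escapes {root!r}: {path!r}")
    return normalized


if __name__ == "__main__":
    print(check_inside_root("data/01/ord_dataset-0123abcd.pb"))
    # → data/01/ord_dataset-0123abcd.pb
```

The check is redundant as long as the shard and prefix rules hold, which is the point of defense-in-depth: it still fails closed if one of those rules is later relaxed.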
After replacing werkzeug.security.safe_join with an explicit prefix + isalnum shard guard, lock in the rejection behavior for the shapes the guard is meant to catch:
- "notord-...": wrong "ord" prefix.
- "ord-..foo": shard becomes "..", the only known traversal vector that survives os.path.basename.
- "ord-.foo": shard becomes ".f", non-alphanumeric.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bdeadman approved these changes on Apr 14, 2026
Summary
- docopt (unmaintained since 2019): migrate all CLI scripts + tests to argparse
- Werkzeug: its only use was security.safe_join in id_filename(); replaced with posixpath.join plus an isalnum() shard check
- Deprecated setuptools.distutils.strtobool in rdkit_mappers.py (replaced with a plain env-var check)
- Remove glob2 (built-in since Python 3.10) and .style.yapf (project uses ruff)
- _COMPOUND_STRUCTURAL_IDENTIFIERS in resolvers.py; drop the duplicate from updates.py (unused)
- _run_updates in process_dataset.py takes explicit kwargs instead of the whole argparse Namespace
- Remove dead # pylint: directives in parse_uspto.py (pylint was removed when .pylintrc was deleted)
- Update copyright year in docs/conf.py to 2020–2026 and bump .readthedocs.yml to ubuntu-22.04
Test plan
- pytest -vv --cov=ord_schema passes (especially *_test.py for migrated scripts)
- python -m ord_schema.scripts.validate_dataset --help works
🤖 Generated with Claude Code