Pull request overview
This PR prepares the 0.8.1 release of the AOP MCP server, expanding the “scientific review” surface area (assay ranking, gene resolution, orphan discovery, draft review workflows) and updating the runtime contracts, docs, smoke tooling, and CI to match.
Changes:
- Adds HGNC-backed gene-symbol resolution, specificity-aware assay ranking, and caching improvements for CompTox/HGNC.
- Introduces new review/discovery tools (orphan stressor discovery, draft review bundle/artifact export/save/list, Linear handoff planning, chemical trace overlays) and registers them in the MCP tool registry.
- Updates JSON Schemas + schema-validation tests, docs/quickstarts, smoke script, CI, and bumps the version to 0.8.1.
Reviewed changes
Copilot reviewed 52 out of 54 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit/test_write_tools.py | Updates draft write-tool unit test to include/verify event_role. |
| tests/unit/test_settings.py | Adds Settings parsing test for HGNC configuration. |
| tests/unit/test_schema_validation.py | Adds schema validation regression payloads for new tools/responses. |
| tests/unit/test_publish_planners.py | Adds unit test for the new Linear document planner plan payload. |
| tests/unit/test_mcp_smoke.py | Extends tool registry smoke expectations for new tools and schema fields. |
| tests/unit/test_hgnc.py | Adds unit tests for HGNC client caching and identifier handling. |
| tests/unit/test_comp_tox.py | Adds cache-usage tests and specificity/rank-score assertions for CompTox client. |
| tests/unit/test_applicability_normalizer.py | Adds tests for lowest-common-taxon inference + blocked overly-broad LCAs. |
| tests/unit/test_aop_oecd_tools.py | Expands OECD tool tests for citation concordance, LCA inference, quantitative ordering, and validator checks. |
| src/tools/write/__init__.py | Extends KE write payload to accept/persist event_role. |
| src/tools/semantic/__init__.py | Exposes lowest_common_taxon from semantic tools. |
| src/services/publish/linear.py | Adds LinearDocumentPlanner/Plan for connector-ready document payloads. |
| src/services/publish/__init__.py | Exports Linear publish planner symbols from the publish package. |
| src/server/version.py | Bumps fallback app version string to 0.8.1. |
| src/server/tools/registry.py | Registers new MCP tools and wires schemas/input models/handlers. |
| src/server/dependencies.py | Injects HGNC client into AOPDB adapter; adds cached HGNC dependency. |
| src/server/config/settings.py | Adds HGNC + artifact output configuration to Settings. |
| src/semantic/applicability.py | Adds conservative taxonomic LCA inference with a default parent map + blocked taxa. |
| src/adapters/hgnc.py | Introduces HGNC REST client with caching and error handling. |
| src/adapters/comp_tox.py | Adds caching and specificity-aware ranking helpers (specificity_score, rank_score). |
| src/adapters/aop_db.py | Integrates HGNC resolution into KE assay search; adds orphan-stressor discovery and specificity-weighted ranking. |
| src/adapters/__init__.py | Re-exports HGNC client/error from adapters package. |
| scripts/test_mcp_endpoints.sh | Modernizes live smoke script; adds end-to-end draft review workflow checks and artifact/Linear steps. |
| README.md | Updates “What’s new” for v0.8.1; documents new tools and validated examples. |
| pyproject.toml | Version bump to 0.8.1. |
| docs/quickstarts/README.md | Links new “live scientific examples” quickstart. |
| docs/quickstarts/publish.md | Mentions Linear handoff planning in publish quickstart. |
| docs/quickstarts/oecd-draft-authoring.md | Expands draft workflow for roles/topology/directionality + new draft review tools. |
| docs/quickstarts/live-scientific-examples.md | Adds validated live-server example calls for key flows. |
| docs/mcp-fix-summary.md | Updates smoke/test summary to match current tool catalog and workflow. |
| docs/contracts/tool-catalog.md | Updates tool catalog documentation for new tools and expanded semantics. |
| docs/contracts/schemas/write/save_draft_review_artifact.response.schema.json | Adds schema for artifact save response (write path). |
| docs/contracts/schemas/read/trace_chemical_on_draft.response.schema.json | Adds schema for chemical-trace overlay response. |
| docs/contracts/schemas/read/search_assays_for_key_event.response.schema.json | Extends schema for structured gene identifiers and ranking fields. |
| docs/contracts/schemas/read/review_draft_evidence_gaps.response.schema.json | Adds schema for evidence-gap review response. |
| docs/contracts/schemas/read/review_draft_bundle.response.schema.json | Adds schema for unified draft review bundle response. |
| docs/contracts/schemas/read/review_draft_assay_cutoff_ordering.response.schema.json | Adds schema for draft assay-cutoff ordering review response. |
| docs/contracts/schemas/read/plan_linear_draft_review_document.response.schema.json | Adds schema for Linear document planning response. |
| docs/contracts/schemas/read/list_saved_draft_review_artifacts.response.schema.json | Adds schema for listing saved draft review artifacts. |
| docs/contracts/schemas/read/list_assays_for_query.response.schema.json | Extends query assay schema with specificity_score. |
| docs/contracts/schemas/read/list_assays_for_aops.response.schema.json | Extends multi-AOP assay schema with specificity_score. |
| docs/contracts/schemas/read/list_assays_for_aop.response.schema.json | Extends single-AOP assay schema with specificity_score. |
| docs/contracts/schemas/read/get_ker.response.schema.json | Extends KER schema with citation concordance + assay-cutoff ordering blocks. |
| docs/contracts/schemas/read/export_draft_review_artifact.response.schema.json | Adds schema for draft review artifact export response. |
| docs/contracts/schemas/read/discover_orphan_stressors_for_query.response.schema.json | Adds schema for query-driven orphan discovery response. |
| docs/contracts/schemas/read/discover_orphan_stressors_for_aops.response.schema.json | Adds schema for multi-AOP orphan discovery response. |
| docs/contracts/schemas/read/discover_orphan_stressors_for_aop.response.schema.json | Adds schema for single-AOP orphan discovery response. |
| docs/contracts/endpoint-matrix.md | Documents local artifact persistence/inventory integration. |
| .gitignore | Ignores generated artifact output directory. |
| .github/workflows/ci.yml | Adds lint + runtime-contract jobs to CI workflow. |
| .env.example | Adds HGNC configuration example environment variables. |
Comments suppressed due to low confidence (2)
docs/contracts/schemas/read/get_ker.response.schema.json:15
- The get_ker response schema defines `citation_concordance`, and the tests expect it to always be present, but it is not listed in the top-level `required` array. This weakens runtime contract validation (a regression could omit `citation_concordance` and still validate). Add `citation_concordance` to `required`, or make it explicitly optional in both schema and tool output.
"required": [
"id",
"iri",
"upstream",
"downstream",
"applicability",
"assay_cutoff_ordering",
"evidence_blocks",
"references",
"provenance"
],
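To illustrate why the missing entry matters, here is a minimal sketch of a required-key check, using only the standard library. The helper `missing_required` and the sample payload are hypothetical and stand in for the project's real schema validation, which is not shown in this review.

```python
from typing import Any

# Required keys as quoted in the review comment above.
CURRENT_REQUIRED = [
    "id", "iri", "upstream", "downstream", "applicability",
    "assay_cutoff_ordering", "evidence_blocks", "references", "provenance",
]


def missing_required(schema: dict[str, Any], payload: dict[str, Any]) -> list[str]:
    """Return the required keys that are absent from the payload (hypothetical helper)."""
    return [key for key in schema.get("required", []) if key not in payload]


# A response that accidentally dropped citation_concordance (illustrative values only).
regressed_payload = {key: None for key in CURRENT_REQUIRED}

current_schema = {"type": "object", "required": CURRENT_REQUIRED}
stricter_schema = {"type": "object", "required": CURRENT_REQUIRED + ["citation_concordance"]}

# With the current list the regression goes unnoticed; with the stricter list it is reported.
unnoticed = missing_required(current_schema, regressed_payload)
reported = missing_required(stricter_schema, regressed_payload)
```

With the current list the omission passes silently; adding `citation_concordance` to `required` is what lets validation flag it.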
.env.example:20
- Artifact persistence is now configurable via Settings (`artifact_output_dir`) and referenced in README as `AOP_MCP_ARTIFACT_OUTPUT_DIR`, but .env.example doesn’t include that variable. Adding it would make it easier for users to discover/override the new artifact output location.
# CompTox API configuration
AOP_MCP_COMPTOX_BASE_URL=https://comptox.epa.gov/dashboard/api/
AOP_MCP_COMPTOX_BIOACTIVITY_URL=https://comptox.epa.gov/ctx-api/
AOP_MCP_COMPTOX_API_KEY=replace-with-your-comptox-api-key
# HGNC gene-symbol resolution
AOP_MCP_HGNC_BASE_URL=https://rest.genenames.org/
AOP_MCP_HGNC_TIMEOUT=5.0
# Offline fallback for local development and tests
AOP_MCP_ENABLE_FIXTURE_FALLBACK=0
def compute_specificity_score(
    *,
    multi_active: Any,
    multi_total: Any,
    single_active: Any,
    single_total: Any,
) -> float | None:
    active, total = _normalize_activity_counts(multi_active, multi_total)
    if total is None:
        active, total = _normalize_activity_counts(single_active, single_total)
    if total is None or total <= 0:
        return None
    active = min(max(active or 0, 0), total)
    return 1.0 - (active / total)
compute_specificity_score() treats a missing active count as 0 when a total is present (via active or 0), which can incorrectly yield a specificity_score of 1.0 for assays where *_total is populated but *_active is absent/None. It would be safer to return None (or fall back to single-conc counts) unless both active and total are known, so unknown activity doesn’t inflate ranking.
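One way to act on this comment could look like the sketch below. It is not the PR's code: `_to_count`, `_known_counts`, and `compute_specificity_score_strict` are hypothetical names, and the sketch assumes the fallback to single-concentration counts should happen whenever the multi-concentration pair is incomplete.

```python
from typing import Any, Optional, Tuple


def _to_count(value: Any) -> Optional[int]:
    """Coerce a raw count to a non-negative int, or None when it is unknown."""
    if value is None or isinstance(value, bool):
        return None
    try:
        count = int(value)
    except (TypeError, ValueError, OverflowError):
        return None
    return count if count >= 0 else None


def _known_counts(active: Any, total: Any) -> Optional[Tuple[int, int]]:
    """Return (active, total) only when both are known and total is positive."""
    known_active, known_total = _to_count(active), _to_count(total)
    if known_active is None or known_total is None or known_total <= 0:
        return None
    # Clamp active so the score stays within [0, 1].
    return min(known_active, known_total), known_total


def compute_specificity_score_strict(
    *,
    multi_active: Any,
    multi_total: Any,
    single_active: Any,
    single_total: Any,
) -> Optional[float]:
    """Score 1 - active/total, or None when no complete count pair exists."""
    counts = _known_counts(multi_active, multi_total) or _known_counts(
        single_active, single_total
    )
    if counts is None:
        return None
    active, total = counts
    return 1.0 - (active / total)
```

Under this variant a populated total with a missing active count no longer yields 1.0; the score is either derived from a complete pair or left as `None`.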
Pull request overview
Copilot reviewed 52 out of 54 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
docs/contracts/schemas/read/get_ker.response.schema.json:15
- The schema defines a `citation_concordance` object and the server code appears to always include it in `get_ker` responses, but it is not listed in `required`. If `citation_concordance` is part of the guaranteed contract (as implied by docs/tests), add it to the `required` list so schema validation will catch accidental omissions.
"type": "object",
"required": [
"id",
"iri",
"upstream",
"downstream",
"applicability",
"assay_cutoff_ordering",
"evidence_blocks",
"references",
"provenance"
],
async def _fetch_orphan_assay_chemicals(self, aeid: int | str) -> list[dict[str, Any]] | list[Any]:
    return await asyncio.wait_for(
        self._call_comptox("get_chemicals_in_assay", str(aeid)),
        timeout=self.orphan_assay_chemical_timeout_seconds,
    )
asyncio.wait_for() around _call_comptox(... to_thread ...) will time out the coroutine but will not stop the underlying worker thread. If CompTox calls hang or frequently exceed the timeout, this can leave orphaned threads doing work in the background and degrade server throughput. Prefer enforcing the timeout at the HTTP client level (httpx timeout) and/or use bounded concurrency + per-request timeouts instead of cancelling to_thread work.
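The behavior described in this comment can be reproduced with the standard library alone. The sketch below is illustrative only: `blocking_call` is a hypothetical stand-in for a hung CompTox request, and the events exist purely to make the demonstration deterministic.

```python
import asyncio
import threading


async def demo_timeout_does_not_stop_thread() -> tuple[bool, bool, bool]:
    """Show that wait_for() abandons a to_thread() call without stopping it."""
    release = threading.Event()
    finished = threading.Event()

    def blocking_call() -> None:
        # Stand-in for a hung HTTP request: blocks until released.
        release.wait(timeout=5.0)
        finished.set()

    timed_out = False
    try:
        await asyncio.wait_for(asyncio.to_thread(blocking_call), timeout=0.05)
    except asyncio.TimeoutError:
        timed_out = True

    # The awaiting coroutine has given up, but the worker thread is still blocked.
    finished_at_timeout = finished.is_set()

    # Let the worker proceed and confirm it runs to completion in the background.
    release.set()
    finished_later = await asyncio.to_thread(finished.wait, 5.0)
    return timed_out, finished_at_timeout, finished_later


if __name__ == "__main__":
    print(asyncio.run(demo_timeout_does_not_stop_thread()))
```

The caller sees a timeout, yet the worker keeps occupying a thread until the blocking call returns on its own, which is why a timeout enforced inside the HTTP client is the more reliable bound.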
search_values: list[str] = []
for stressor in stressors:
    casrn = stressor.get("casrn")
    if casrn:
        index["casrns"].add(str(casrn))
        if str(casrn) not in search_values:
            search_values.append(str(casrn))
    label = stressor.get("label")
    normalized_label = _normalize_chemical_name(label)
    if normalized_label:
        index["names"].add(normalized_label)
        if str(label) not in search_values:
            search_values.append(str(label))

warnings: list[str] = []
if not search_values:
    warnings.append(
        "Linked stressors lacked searchable CAS RN and label values, so curated-chemical exclusion relies only on exact identifiers already present in assay results."
    )
    return index, 0, warnings

search_tasks = [
    self._call_comptox("search_equal", search_value)
    for search_value in search_values
]
search_results = await asyncio.gather(*search_tasks, return_exceptions=True)
resolved_dtxsids: set[str] = set()
_build_curated_chemical_index() builds one _call_comptox('search_equal', ...) task per unique CAS/label and then gather()s them all at once. Because _call_comptox uses asyncio.to_thread, a large number of stressors can queue a very large number of threads/tasks and cause latency spikes or resource exhaustion. Consider capping the number of lookup values (e.g., only CASRNs, or a configurable max), and/or running these lookups through a semaphore/batched gather to bound concurrency.
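A semaphore-bounded gather along the lines suggested here might look like the following sketch. The helper name `gather_bounded` and its signature are assumptions made for illustration; they are not taken from the PR.

```python
import asyncio
from typing import Any, Coroutine, Iterable


async def gather_bounded(
    coroutines: Iterable[Coroutine[Any, Any, Any]],
    *,
    limit: int,
    return_exceptions: bool = False,
) -> list[Any]:
    """Await coroutines with at most `limit` running at once, preserving order.

    Pass un-awaited coroutine objects, not Tasks: a coroutine does no work
    until it is awaited, so the semaphore really does gate when each starts.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    semaphore = asyncio.Semaphore(limit)

    async def run_one(coroutine: Coroutine[Any, Any, Any]) -> Any:
        async with semaphore:
            return await coroutine

    return await asyncio.gather(
        *(run_one(coroutine) for coroutine in coroutines),
        return_exceptions=return_exceptions,
    )
```

Replacing the plain `asyncio.gather(*search_tasks, return_exceptions=True)` with a call of this shape would keep result ordering and exception handling while capping how many lookups occupy worker threads at the same time.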
Pull request overview
Copilot reviewed 52 out of 54 changed files in this pull request and generated 3 comments.
@@ -1,11 +1,14 @@
"""Publish planners for MediaWiki and OWL dry-run outputs."""
The module docstring is now inaccurate: this package also exports LinearDocumentPlanner/LinearDocumentPlan, but the docstring still says planners are for MediaWiki and OWL only. Update the docstring to include Linear (or make it planner-agnostic) so readers and generated docs don’t get misled.
- Added a coherent draft review workflow: `review_draft_bundle`, `review_draft_evidence_gaps`, `export_draft_review_artifact`, `save_draft_review_artifact`, `list_saved_draft_review_artifacts`, and `plan_linear_draft_review_document`.
- Added mechanistic discovery tooling for orphan stressor discovery across one AOP, multiple AOPs, and phenotype or mechanism queries, plus chemical trace overlays on draft graphs.
- Hardened live operations with a refreshed MCP smoke script, stronger CompTox caching and bounded concurrency, and real-server validation for KE assay search, orphan discovery, confidence review, and draft review/export flows.
- Added release-facing documentation for validated scientific examples in [docs/quickstarts/live-scientific-examples.md](/Volumes/Storage/topotox_space_relief_20260220/AOP_MCP/docs/quickstarts/live-scientific-examples.md).
This link points to an absolute local filesystem path (/Volumes/...) and will be broken for anyone reading the README in the repo. Replace it with a repository-relative link (e.g., docs/quickstarts/live-scientific-examples.md).
- Added release-facing documentation for validated scientific examples in [docs/quickstarts/live-scientific-examples.md](/Volumes/Storage/topotox_space_relief_20260220/AOP_MCP/docs/quickstarts/live-scientific-examples.md).
- Added release-facing documentation for validated scientific examples in [docs/quickstarts/live-scientific-examples.md](docs/quickstarts/live-scientific-examples.md).
metadata_tasks = [
    self._call_comptox("assay_by_aeid", candidate["aeid"])
    for candidate in assay_candidates.values()
]
metadata_results = await asyncio.gather(*metadata_tasks)
for candidate, assay in zip(assay_candidates.values(), metadata_results):
assay_by_aeid metadata is fetched for all assay_candidates (up to 10 stressors × ~15 hits each) using an unbounded asyncio.gather. This significantly increases CompTox requests and can saturate the threadpool since _call_comptox uses asyncio.to_thread. Consider ranking first and only enriching the top N candidates, and/or run these calls through _gather_bounded(..., limit=self.comptox_concurrency_limit) to cap concurrency.
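The "rank first, enrich the top N" idea could be sketched as below. This assumes each candidate dict already carries a preliminary numeric `rank_score` before metadata is fetched; the helper name `select_candidates_for_enrichment` and the `top_n` parameter are hypothetical.

```python
import heapq
from typing import Any


def select_candidates_for_enrichment(
    assay_candidates: dict[str, dict[str, Any]],
    *,
    top_n: int,
) -> list[dict[str, Any]]:
    """Pick the top_n candidates by preliminary rank_score (assumed field).

    Candidates without a score are treated as 0.0 so they sort last.
    """
    if top_n <= 0:
        return []
    return heapq.nlargest(
        top_n,
        assay_candidates.values(),
        key=lambda candidate: candidate.get("rank_score") or 0.0,
    )
```

Only the selected candidates would then need an `assay_by_aeid` lookup, which bounds the number of CompTox requests regardless of how many hits the stressors return.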
Summary
- 0.8.1 release across the expanded scientific review surface
Included in this release
- 0.8.1 version bump, release hygiene updates, and a new live scientific examples quickstart
Verification
- ./.venv/bin/python -m pytest -q
- BASE_URL=http://127.0.0.1:8011 ./scripts/test_mcp_endpoints.sh