ddharmon

Python client for the BioMapper2 API — map biological entity names to standardized knowledge-graph identifiers (CHEBI, HMDB, PubChem, RefMet, and more).

from ddharmon import map_entity

result = map_entity("L-Histidine")
print(result.primary_curie)     # RM:0129894
print(result.confidence_tier)   # high
print(result.ids_for("CHEBI"))  # ['15971']

Installation

# Core (async HTTP client + Pydantic models)
pip install ddharmon

Getting an API key

The BioMapper2 API requires an API key. To request access, email trent.leslie@phenomehealth.org.

Once you have a key, set it in your environment:

export BIOMAPPER_API_KEY=your-key-here

Or add it to a .env file in your project root:

BIOMAPPER_API_KEY=your-key-here

ddharmon will pick it up automatically from either location.
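
Before making the first request, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch using only the standard library. The helper name `api_key_configured` is hypothetical and not part of ddharmon, and it inspects only the process environment, so a key defined solely in a .env file is not detected by this check.

```python
import os
from typing import Mapping, Optional


def api_key_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if BIOMAPPER_API_KEY is set to a non-empty value.

    Hypothetical helper (not part of ddharmon). Checks only the given
    mapping, or os.environ by default; it does not read .env files.
    """
    env = os.environ if env is None else env
    return bool(env.get("BIOMAPPER_API_KEY", "").strip())


if __name__ == "__main__":
    print("API key found" if api_key_configured() else "BIOMAPPER_API_KEY is not set")
```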

Quick start

Single lookup (synchronous)

from ddharmon import map_entity

result = map_entity("L-Histidine")

print(result.resolved)          # True
print(result.primary_curie)     # RM:0129894
print(result.chosen_kg_id)      # CHEBI:15971
print(result.confidence_score)  # 2.489
print(result.confidence_tier)   # high  (≥2.0)
print(result.ids_for("CHEBI"))  # ['15971']
print(result.ids_for("refmet_id"))  # ['RM0129894']

Batch mapping (synchronous)

from ddharmon import map_entities, summarize

records = [
    {"name": "L-Histidine"},
    {"name": "Glucose", "identifiers": {"HMDB": "HMDB00122"}},
    {"name": "Sphinganine"},
]

results = map_entities(records, progress=True)  # progress=True shows a tqdm bar (needs the [notebook] extra)
summary = summarize(results)

print(f"{summary.resolved}/{summary.total_queried} resolved")
print(f"Resolution rate: {summary.resolution_rate:.1%}")
print(summary.vocabulary_coverage)
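
When the input comes from a spreadsheet or CSV, the records list above can be built programmatically. This sketch assumes each row is a (name, optional HMDB accession) pair. The helper `build_record` is hypothetical and only reproduces the record shape shown in the example: a "name" key, plus an "identifiers" mapping when an HMDB hint is available.

```python
from typing import Any, Optional


def build_record(name: str, hmdb_id: Optional[str] = None) -> dict[str, Any]:
    """Build one record in the shape used by the batch example.

    Hypothetical helper (not part of ddharmon).
    """
    record: dict[str, Any] = {"name": name}
    if hmdb_id:
        # Only attach identifiers when a hint is actually present.
        record["identifiers"] = {"HMDB": hmdb_id}
    return record


rows = [("L-Histidine", None), ("Glucose", "HMDB00122"), ("Sphinganine", None)]
records = [build_record(name, hmdb_id) for name, hmdb_id in rows]
```

The resulting list has the same contents as the hand-written records in the batch example.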

Async usage

import asyncio
from ddharmon import BioMapperClient

async def main() -> None:
    async with BioMapperClient() as client:
        # Verify connectivity
        health = await client.health_check()
        print(health)  # {'status': 'healthy', ...}

        # Single
        result = await client.map_entity(
            "L-Histidine",
            identifiers={"HMDB": "HMDB00177"},
        )

        # Batch with rate limiting
        results = await client.map_entities(
            [{"name": "L-Histidine"}, {"name": "Glucose"}],
            rate_limit_delay=0.3,
            progress=True,
        )

asyncio.run(main())

Jupyter notebooks

Apply nest_asyncio before using sync helpers inside a running event loop:

import nest_asyncio
nest_asyncio.apply()  # required in Jupyter

from ddharmon import map_entities
results = map_entities([{"name": "L-Histidine"}], progress=True)

Preprocessing functions

from ddharmon.extras.metabolon import clean_compound_name, extract_hmdb_id

# Strip quotes and collision-energy suffixes
clean_compound_name('"1,3-Diphenylguanidine_CE45"')  # '1,3-Diphenylguanidine'
clean_compound_name('"4,6-DIOXOHEPTANOIC ACID"')     # '4,6-DIOXOHEPTANOIC ACID'
clean_compound_name('L-Histidine')                   # 'L-Histidine'  (unchanged)

# Extract HMDB accessions from ms1_compound_name format
extract_hmdb_id('HMDB:HMDB03349-2257 L-Dihydroorotic acid')  # 'HMDB03349'
extract_hmdb_id('HMDB00177')                                  # 'HMDB00177'
extract_hmdb_id(None)                                         # None
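
To make the cleaning rules concrete, here is a rough standard-library approximation of the behavior shown in the examples above. It is not the library's implementation: the function names `clean_name_sketch` and `extract_hmdb_sketch` and both regular expressions are assumptions inferred only from those examples, so real inputs may be handled differently by ddharmon.

```python
import re
from typing import Optional

# Assumed patterns, inferred from the documented examples only.
_CE_SUFFIX = re.compile(r"_CE\d+$")        # trailing collision-energy tag, e.g. "_CE45"
_HMDB_ACCESSION = re.compile(r"HMDB\d+")   # "HMDB" followed by digits


def clean_name_sketch(raw: str) -> str:
    """Strip surrounding whitespace and double quotes, then drop a trailing _CE<digits>."""
    name = raw.strip().strip('"')
    return _CE_SUFFIX.sub("", name)


def extract_hmdb_sketch(raw: Optional[str]) -> Optional[str]:
    """Return the first HMDB accession found in the string, or None."""
    if not raw:
        return None
    match = _HMDB_ACCESSION.search(raw)
    return match.group(0) if match else None
```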

API reference

`MappingResult`

| Attribute | Type | Description |
| --- | --- | --- |
| `query_name` | `str` | Name submitted to the API |
| `resolved` | `bool` | Whether any identifier was returned |
| `primary_curie` | `str \| None` | First CURIE in the response |
| `chosen_kg_id` | `str \| None` | Resolver-selected knowledge graph ID |
| `confidence_score` | `float \| None` | Highest score across annotators |
| `confidence_tier` | `str` | `"high"` (≥ 2.0) / `"medium"` (≥ 1.0 and < 2.0) / `"low"` (< 1.0) / `"unknown"` (no score) |
| `identifiers` | `dict[str, list[str]]` | Vocabulary → IDs, e.g. `{"CHEBI": ["15971"]}` |
| `hmdb_hint` | `str \| None` | HMDB hint passed in the request |
| `error` | `str \| None` | Error message if mapping failed |

result.ids_for("CHEBI")        # ['15971']
result.ids_for("refmet_id")    # ['RM0129894']
result.ids_for("PUBCHEM.COMPOUND")  # []

Confidence tiers

| Score | Tier | Recommended action |
| --- | --- | --- |
| ≥ 2.0 | `high` | Accept without review |
| ≥ 1.0 and < 2.0 | `medium` | Quick sanity check |
| < 1.0 | `low` | Manual review recommended |
| `None` | `unknown` | No score returned (e.g. HMDB-hint resolved) |
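
The thresholds in this table can be expressed as a small function, which is handy when re-tiering scores stored outside a MappingResult (for example in a CSV export). This is a sketch of the documented cutoffs only. The name `tier_for` is hypothetical, and the library computes `confidence_tier` itself.

```python
from typing import Optional


def tier_for(score: Optional[float]) -> str:
    """Map a confidence score to the tier names used in the table above.

    Hypothetical helper (not part of ddharmon).
    """
    if score is None:
        return "unknown"   # no score returned
    if score >= 2.0:
        return "high"
    if score >= 1.0:
        return "medium"
    return "low"


print(tier_for(2.489))  # high
print(tier_for(None))   # unknown
```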

Error handling

from ddharmon import (
    map_entity,
    BioMapperError,           # base class
    BioMapperAuthError,       # 401/403: bad API key
    BioMapperRateLimitError,  # 429: throttled
    BioMapperServerError,     # 5xx
    BioMapperTimeoutError,    # request timeout
    BioMapperConfigError,     # missing API key / bad config
)

try:
    result = map_entity("Glucose")
except BioMapperRateLimitError as e:
    print(f"Throttled. Retry after: {e.retry_after}s")
except BioMapperAuthError:
    print("Check your BIOMAPPER_API_KEY")

In batch mode (map_entities), per-record errors are caught and returned as MappingResult(error=...) rather than aborting the batch.
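
For single lookups, a throttled request can be retried after the delay reported by the exception. The helper below is a generic sketch rather than a ddharmon feature. `call_with_retry` and its parameters are hypothetical; it relies only on the `retry_after` attribute shown in the example above and falls back to an assumed default delay when that attribute is missing or None.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Type[Exception],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying when it raises retry_on.

    Hypothetical helper (not part of ddharmon). Waits for the exception's
    retry_after value if present, otherwise default_delay seconds. The
    final failure is re-raised unchanged.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == max_attempts:
                raise
            delay = getattr(exc, "retry_after", None)
            sleep(default_delay if delay is None else delay)
    raise AssertionError("unreachable")
```

With ddharmon installed, a call might look like `call_with_retry(lambda: map_entity("Glucose"), BioMapperRateLimitError)`. The `sleep` parameter exists so the waiting behavior can be replaced in tests.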

Development

git clone https://github.com/trentleslie/ddharmon
cd ddharmon
poetry install --with dev --extras all

make check          # format → lint → type-check → test
make test           # tests only
make coverage       # HTML coverage report

License

MIT — see LICENSE.
