Feat/standardize benchmarks by miagarvey · Pull Request #29 · DaneshjouLab/AutoGKB

miagarvey · 2025-11-13T20:13:05Z

Standardize fa, pheno, drug, study params benchmarks

⚙️ Release Notes

Create shared_utils.py with shared evaluation functions (exact_match, semantic_similarity, category_equal, compute_weighted_score); refactor FA, Drug, and Study Parameters benchmarks to use them.
Add optional field_weights, dependency validation (Drug: direction/association checks; Study Parameters: statistical consistency), and standardized allele handling (semantic_similarity) across all benchmarks.
Standardize Pheno benchmark: return detailed dict (0-1 scale) instead of single float (0-100), use field-specific evaluators from shared_utils, add field_weights, and update scaffold to accept ground truth directly (backward compatible with PMCID fallback).
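As a rough illustration of how per-field evaluators and optional `field_weights` could combine into a single 0-1 score, here is a minimal sketch. The function names `exact_match` and `compute_weighted_score` come from the release notes, but the signatures, the normalization rules, and the default weight of 1.0 are assumptions and not the actual contents of shared_utils.py.

```python
from typing import Any, Dict, Optional


def exact_match(pred: Any, truth: Any) -> float:
    """Return 1.0 if both values match after trimming and case-folding, else 0.0.

    Assumption: None is treated as an empty string.
    """
    def norm(value: Any) -> str:
        return "" if value is None else str(value).strip().lower()

    return 1.0 if norm(pred) == norm(truth) else 0.0


def compute_weighted_score(
    field_scores: Dict[str, float],
    field_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of per-field scores, each on a 0-1 scale.

    Assumption: fields missing from field_weights get a weight of 1.0.
    """
    if not field_scores:
        return 0.0
    weights = field_weights or {}
    weighted_sum = 0.0
    weight_total = 0.0
    for field, score in field_scores.items():
        weight = weights.get(field, 1.0)
        weighted_sum += weight * score
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# Hypothetical field names, used only for illustration.
scores = {
    "drug": exact_match(" Warfarin", "warfarin"),
    "direction": exact_match("increased", "decreased"),
}
unweighted = compute_weighted_score(scores)
weighted = compute_weighted_score(scores, {"drug": 3.0})
```

With equal weights the two fields average out; raising the weight on one field shifts the overall score toward that field's result, which is the behavior an optional `field_weights` parameter suggests.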

By submitting this pull request, you agree to follow our Coding Guidelines:

I agree to follow the Coding Guidelines.

- Create shared_utils.py with common evaluation functions
- Add configurable field_weights parameter to all benchmarks
- Add dependency validation to Drug and Study Parameters benchmarks
- Standardize allele handling to use semantic similarity
- Add statistical consistency validation to Study Parameters
- Refactor all benchmarks to use shared utilities
- Maintain backward compatibility with existing code


miagarvey added 4 commits November 6, 2025 09:08

add study parameters benchmark with variant ann ID alignment

4c38504

feat: Standardize pheno benchmark to match FA/Drug/StudyParams: detailed results dict, 0-1 scale, shared_utils evaluators, configurable field_weights, direct ground truth parameter

6584e28

miagarvey closed this Nov 13, 2025

miagarvey reopened this Nov 13, 2025

miagarvey added 3 commits November 17, 2025 11:33

adding benchmark comparison script

2014773

updated alignment system to best match rather than using variant IDs

42705a6

displays detailed results with error explanations, added benchmark analysis summary of first 5 articles, fully removed variant id and study params id from study params score, modified pheno alignment

c35b749
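The commit "updated alignment system to best match rather than using variant IDs" suggests that predicted and ground-truth annotations are paired by similarity instead of by identifier. The sketch below shows one way this could work: a greedy one-to-one alignment over pairwise record similarity. The names `record_similarity` and `align_best_match`, the use of `difflib.SequenceMatcher`, and the greedy strategy are all assumptions for illustration, not the PR's actual implementation.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def record_similarity(pred: Dict[str, str], truth: Dict[str, str]) -> float:
    """Mean string similarity across the union of fields (0-1 scale)."""
    fields = set(pred) | set(truth)
    if not fields:
        return 0.0
    total = 0.0
    for field in fields:
        a = str(pred.get(field, "")).lower()
        b = str(truth.get(field, "")).lower()
        total += SequenceMatcher(None, a, b).ratio()
    return total / len(fields)


def align_best_match(
    preds: List[Dict[str, str]],
    truths: List[Dict[str, str]],
) -> List[Tuple[int, int, float]]:
    """Greedily pair each prediction with its most similar unused ground truth.

    Returns (pred_index, truth_index, similarity) tuples; no ID fields are used.
    """
    candidates = sorted(
        (
            (record_similarity(p, t), i, j)
            for i, p in enumerate(preds)
            for j, t in enumerate(truths)
        ),
        reverse=True,
    )
    used_preds, used_truths = set(), set()
    pairs: List[Tuple[int, int, float]] = []
    for score, i, j in candidates:
        if i in used_preds or j in used_truths:
            continue
        used_preds.add(i)
        used_truths.add(j)
        pairs.append((i, j, score))
    return pairs


# Hypothetical records in a different order on each side.
preds = [{"drug": "warfarin"}, {"drug": "clopidogrel"}]
truths = [{"drug": "clopidogrel"}, {"drug": "warfarin"}]
pairs = align_best_match(preds, truths)
```

Greedy pairing is simple but not guaranteed to be globally optimal; an assignment solver would be an alternative if the two can disagree on real data.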

miagarvey wants to merge 7 commits into main from feat/standardize-benchmarks
