A comprehensive benchmarking system for evaluating the quality of genomic knowledge base annotations.
This repository contains two versions of the AutoGKB benchmark system:
- Benchmark V1: The original, comprehensive benchmark that evaluates four types of annotations.
- Benchmark V2: A newer, more modular benchmark that is currently focused on variant matching and sentence-level validation.
The original benchmark system evaluates four types of annotations:
- Drug Annotations (var_drug_ann): Drug-gene-variant associations
- Phenotype Annotations (var_pheno_ann): Phenotype-gene-variant associations
- Functional Analysis (var_fa_ann): Functional effects of variants
- Study Parameters (study_parameters): Study design and statistical parameters
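For orientation, a drug annotation record might look something like the following. Note that the field names here are hypothetical and chosen for illustration; consult the actual annotation files for the real schema.

```python
# Hypothetical example of a drug annotation (var_drug_ann) record;
# the actual field names in the benchmark data may differ.
drug_annotation = {
    "pmcid": "PMC5508045",
    "variant": "rs4244285",
    "gene": "CYP2C19",
    "drug": "clopidogrel",
    "association": "decreased response",
}
```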
```bash
# Run the full benchmark
PYTHONPATH=src pixi run python src/benchmark_v1/run_benchmark.py

# Run the benchmark on a single file
PYTHONPATH=src pixi run python src/benchmark_v1/run_benchmark.py --single_file PMC5508045

# Show mismatches for a single file
PYTHONPATH=src pixi run python src/benchmark_v1/run_benchmark.py \
    --single_file PMC5508045 \
    --show_mismatches
```

The overall score is a weighted average of the individual benchmark scores. Each field is scored with an appropriate comparison metric: exact match, semantic similarity, or numeric tolerance. The system also performs dependency validation to penalize logical inconsistencies.
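As an illustrative sketch of weighted aggregation (not the benchmark's actual implementation; the weights and score values below are made up):

```python
# Sketch of V1-style overall scoring: a weighted average over
# per-benchmark scores, each assumed to lie in [0, 1].
def overall_score(benchmark_scores: dict[str, float],
                  weights: dict[str, float]) -> float:
    """Weighted average of individual benchmark scores."""
    total_weight = sum(weights[name] for name in benchmark_scores)
    return sum(score * weights[name]
               for name, score in benchmark_scores.items()) / total_weight

scores = {"var_drug_ann": 0.9, "var_pheno_ann": 0.7}
weights = {"var_drug_ann": 2.0, "var_pheno_ann": 1.0}
result = overall_score(scores, weights)  # (0.9*2 + 0.7*1) / 3
```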
For more details on the V1 benchmark, please refer to the original README_BENCHMARK.md.
The V2 benchmark is a newer, more modular system designed for focused evaluations. It currently includes benchmarks for variant matching, sentence validation, and field extraction.
The variant benchmark (variant_bench.py) compares a list of proposed variants against a ground truth set, calculating match rates, misses, and extras.
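The core comparison can be sketched as simple set arithmetic; this is an illustrative stand-in, not the code in variant_bench.py:

```python
# Sketch of variant matching: proposed variants are compared against a
# ground-truth set to produce matches, misses, and extras.
def score_variants(proposed: list[str], truth: list[str]) -> dict:
    proposed_set, truth_set = set(proposed), set(truth)
    matches = sorted(proposed_set & truth_set)
    misses = sorted(truth_set - proposed_set)   # in ground truth, not proposed
    extras = sorted(proposed_set - truth_set)   # proposed, not in ground truth
    match_rate = len(matches) / len(truth_set) if truth_set else 0.0
    return {"match_rate": match_rate, "matches": matches,
            "misses": misses, "extras": extras}

result = score_variants(["rs4244285", "rs1045642"],
                        ["rs4244285", "rs9923231"])
# match_rate 0.5, one miss (rs9923231), one extra (rs1045642)
```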
```bash
# Score a single annotation file
PYTHONPATH=src pixi run python src/benchmark_v2/variant_bench.py score_annotation <path_to_annotation_file>

# Score all annotation files in a directory
PYTHONPATH=src pixi run python src/benchmark_v2/variant_bench.py score_all_annotations --annotations_dir <path_to_annotations_dir>

# Score a file of generated variants
PYTHONPATH=src pixi run python src/benchmark_v2/variant_bench.py score_generated_variants <path_to_generated_variants_file>
```

The sentence benchmark (sentence_bench.py) evaluates the quality of generated sentences against ground truth sentences from the literature.
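Sentence scoring in the actual benchmark likely relies on embedding similarity (sentence-transformers is a listed dependency); the sketch below substitutes a simple string-overlap ratio to illustrate the matching idea:

```python
from difflib import SequenceMatcher

# Lightweight stand-in for sentence matching: find the ground-truth
# sentence most similar to a generated one, with a similarity in [0, 1].
def best_match(generated: str, ground_truth: list[str]) -> tuple[str, float]:
    scored = [(gt, SequenceMatcher(None, generated.lower(), gt.lower()).ratio())
              for gt in ground_truth]
    return max(scored, key=lambda pair: pair[1])

sentence, score = best_match(
    "CYP2C19 variants reduce clopidogrel efficacy",
    ["CYP2C19 variants reduce clopidogrel efficacy in carriers",
     "An unrelated sentence about study enrollment"],
)
```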
The field extractor (field_extractor.py) is a utility for extracting specific fields from annotation files.
The V2 variant benchmark provides a JSON output with the following structure:
```json
{
  "timestamp": "",
  "run_name": "",
  "total_match_rate": 0.0,
  "per_annotation_scores": [
    {
      "pmcid": "PMC5508045",
      "title": "",
      "match_rate": 0.0,
      "matches": [],
      "misses": [],
      "extras": []
    }
  ]
}
```

The src/modules directory contains modular, multi-stage pipelines for developing and evaluating new methods for knowledge extraction. Each stage is designed to be run independently, with outputs from one stage feeding into the next.
The primary experimental pipelines are:
- Variant Finding: Extracts genetic variants from full-text articles.
- Sentence Generation: Generates sentences describing the clinical significance of each variant.
- Citation Finding: Identifies the source sentence from the original article that supports each generated sentence.
- Summary Generation: Creates a final, concise summary of the key pharmacogenomic findings in the article.
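The staged design above can be sketched as a chain of functions. Every function here is a trivial stub for illustration; none of these names correspond to the repository's actual module APIs:

```python
import re

def find_variants(text: str) -> list[str]:
    # Stage 1: variant finding (stub: match rsIDs in the text)
    return sorted(set(re.findall(r"rs\d+", text)))

def generate_sentences(variants: list[str]) -> list[str]:
    # Stage 2: sentence generation (stub: one template per variant)
    return [f"{v} is associated with altered drug response." for v in variants]

def find_citations(text: str, sentences: list[str]) -> list[str]:
    # Stage 3: citation finding (stub: source sentence naming the variant)
    source = text.split(". ")
    return [next((s for s in source if sent.split()[0] in s), "")
            for sent in sentences]

def summarize(sentences: list[str]) -> str:
    # Stage 4: summary generation (stub: join the findings)
    return " ".join(sentences)

article = "Patients carrying rs4244285 showed reduced clopidogrel response."
variants = find_variants(article)
sentences = generate_sentences(variants)
citations = find_citations(article, sentences)
summary = summarize(sentences)
```

The point of the staged structure is that each intermediate output can be inspected and benchmarked on its own before feeding the next stage.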
Each experiment directory contains a detailed README.md with instructions on how to run the specific pipeline, including example commands and descriptions of available methods. Please refer to these files for more information on each stage of the experimental process.
Required packages are managed with pixi and are listed in the pixi.toml file. Key dependencies include:
- sentence-transformers
- scikit-learn
- numpy
- pydantic
To install all dependencies, run:
```bash
pixi install
```

When adding new features:
- Add new benchmark modules to the src/benchmark_v2 directory.
- Ensure that new features are tested.
- Document new metrics and functionalities in this README.