feat: scaffold for all annotations with reasonable structure #13

Merged
shloknatarajan merged 47 commits into DaneshjouLab:main from shloknatarajan:main on Jul 1, 2025

@shloknatarajan shloknatarajan commented Jul 1, 2025

Annotation pipeline works and saves outputs

devin-ai-integration bot and others added 30 commits June 28, 2025 17:05
…ations

- Implement module-level caching for get_true_variants() to avoid repeated JSON file loading
- Fix type annotations across multiple files to use Optional[T] instead of T = None
- Add comprehensive efficiency analysis report documenting all identified issues
- Add test script to verify caching functionality works correctly

This addresses the critical efficiency issue where JSON files were loaded on every
function call, causing unnecessary disk I/O operations. The caching implementation
uses lazy loading with proper error handling for missing files.
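
A minimal sketch of the lazy, module-level caching described above. The PR does not show the actual code, so the file path, the cache variable name, the return type, and the empty-dict fallback for a missing file are all assumptions made for illustration. The signature also shows the `Optional[T]` annotation style mentioned in the commit.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical location; the real file path is not shown in this PR.
DEFAULT_VARIANTS_PATH = Path("data/true_variants.json")

# Module-level cache: None means "not loaded yet".
_TRUE_VARIANTS_CACHE: Optional[Dict[str, Any]] = None


def get_true_variants(path: Optional[Path] = None) -> Dict[str, Any]:
    """Return the variants mapping, reading the JSON file at most once."""
    global _TRUE_VARIANTS_CACHE
    if _TRUE_VARIANTS_CACHE is None:
        source = path if path is not None else DEFAULT_VARIANTS_PATH
        try:
            with source.open(encoding="utf-8") as handle:
                _TRUE_VARIANTS_CACHE = json.load(handle)
        except FileNotFoundError:
            # Assumed fallback: treat a missing file as "no known variants".
            _TRUE_VARIANTS_CACHE = {}
    return _TRUE_VARIANTS_CACHE
```

After the first call, later calls return the cached object without touching the disk, which is the behavior the commit's caching test is meant to verify.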

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Add DrugAnnotation and DrugAnnotationList models to src/variants.py
- Create new drug_annotation_extraction.py component with detailed field extraction
- Integrate drug annotation extraction into variant association pipeline
- Add comprehensive test script for verification
- Follow existing LLM infrastructure patterns (Generator/Parser)
- Extract detailed pharmacogenomic fields matching provided schema

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Modified extract_drug_annotations to loop through variants one at a time
- Each variant now gets individual LLM processing for better control
- Added SingleDrugAnnotation model for individual variant processing
- Updated logging to show individual variant processing progress
- Maintains same output quality while providing cleaner extraction per variant
- Updated test script to reflect individual processing approach
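
The per-variant loop described in this commit could look roughly like the sketch below. Only the names `extract_drug_annotations` and `SingleDrugAnnotation` come from the commit message; the fields, the `annotate_one` callable standing in for the LLM call, and the function signature are assumptions, and a plain dataclass is used here in place of whatever model base class the repository actually uses.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List

logger = logging.getLogger(__name__)


@dataclass
class SingleDrugAnnotation:
    # Illustrative fields only; the PR's schema has more detailed fields.
    variant: str
    drug: str
    summary: str


def extract_drug_annotations(
    variants: List[str],
    article_text: str,
    annotate_one: Callable[[str, str], SingleDrugAnnotation],
) -> List[SingleDrugAnnotation]:
    """Run one extraction call per variant instead of one call for all."""
    annotations: List[SingleDrugAnnotation] = []
    for index, variant in enumerate(variants, start=1):
        logger.info("Processing variant %d/%d: %s", index, len(variants), variant)
        annotations.append(annotate_one(variant, article_text))
    return annotations
```

Passing the extraction step in as a callable keeps the loop testable without a live model: a test can supply a stub that returns a fixed `SingleDrugAnnotation`.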

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
…ation-extraction

Add drug annotation extraction component for variants with drug associations
…-improvements

Efficiency improvements: Cache JSON loading and fix type annotations
…ents

- Add PhenotypeAnnotation and FunctionalAnnotation data models to variants.py
- Create phenotype_annotation_extraction.py with detailed extraction logic
- Create functional_annotation_extraction.py with mechanistic annotation logic
- Update variant_association_pipeline.py to integrate new extraction components
- Follow existing drug annotation extraction patterns
- Use detailed prompt templates from annotation_prompts.md
- Process variants individually for better control and cleaner extraction
- Include proper error handling and logging throughout
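
One way to combine the individual processing and error handling listed above is to isolate each extractor call, so a failure in one annotation type does not discard the others. This is a hedged sketch, not the pipeline's actual code: `annotate_variant`, the `Extractor` alias, and the choice to record `None` on failure are all hypothetical.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)

# Hypothetical extractor shape: (variant, article_text) -> annotation object.
Extractor = Callable[[str, str], Any]


def annotate_variant(
    variant: str,
    article_text: str,
    extractors: Dict[str, Extractor],
) -> Dict[str, Optional[Any]]:
    """Run each annotation extractor for one variant, isolating failures."""
    results: Dict[str, Optional[Any]] = {}
    for kind, extractor in extractors.items():
        try:
            results[kind] = extractor(variant, article_text)
        except Exception:
            # Log with traceback and continue, so one failure does not
            # discard the other annotation types for this variant.
            logger.exception("%s extraction failed for %s", kind, variant)
            results[kind] = None
    return results
```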

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Move test_imports.py and test_new_annotations.py to tests/ directory
- Clean up temporary converted notebook file
- Follow proper repository structure conventions

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Fix Python path resolution for tests running from tests/ subdirectory
- Ensure tests can properly import src modules from new location
- Verify all tests pass after organizational changes
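
A common way to make `src` importable from a `tests/` subdirectory is to put the repository root on `sys.path` before importing. The sketch below assumes the layout implied by the commit (`tests/` and `src/` as siblings under the repository root); the helper names are hypothetical and the PR may resolve the path differently.

```python
import sys
from pathlib import Path


def repo_root_for(test_file: str) -> Path:
    """Return the parent of the tests/ directory containing test_file."""
    return Path(test_file).resolve().parents[1]


def add_to_sys_path(root: Path) -> None:
    """Put root at the front of sys.path once, so `import src...` resolves."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)


# Typical use at the top of tests/test_imports.py:
# add_to_sys_path(repo_root_for(__file__))
```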

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
…functional-extraction

feat: implement phenotype and functional annotation extraction components
@shloknatarajan shloknatarajan self-assigned this Jul 1, 2025
@shloknatarajan shloknatarajan added the WIP Work In Progress label Jul 1, 2025
@shloknatarajan shloknatarajan removed the WIP Work In Progress label Jul 1, 2025
@shloknatarajan shloknatarajan merged commit 627619f into DaneshjouLab:main Jul 1, 2025
3 of 7 checks passed