@alinakbase
Collaborator

This refactor introduces a cleaner, modular, and more maintainable architecture for UniProt data parsing within the cdm_data_loader_utils package. The new design separates concerns across multiple parser components, centralizes shared identifier extraction, enhances XML utilities, and adds a comprehensive test suite to ensure long-term stability.

Key improvements include:
• Modular parser structure under cdm_data_loader_utils/parsers/
• Unified shared identifier extraction (shared_identifiers.py)
• Robust XML parsing utilities (xml_utils.py)
• Refactored UniProt parser (uniprot.py) with clearer logic paths
• Complete tests for UniProt refactor, including:
    • shared identifiers
    • XML utilities
    • UniProt entry parsing
• Cleaner directory layout aligned with CDM conventions
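
To illustrate the separation of concerns listed above, here is a minimal sketch of how a small XML helper and a shared identifier extractor might feed an entry parser. The function names (get_text, extract_identifiers, parse_entry), the namespace URI, and the sample entry are all assumptions made for illustration; they are not taken from xml_utils.py, shared_identifiers.py, or uniprot.py in this PR.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; adjust it to match the files being parsed.
NS = {"up": "http://uniprot.org/uniprot"}

# Invented sample entry, used only to exercise the sketch.
SAMPLE_ENTRY = """\
<entry xmlns="http://uniprot.org/uniprot" dataset="Swiss-Prot">
  <accession>P12345</accession>
  <accession>Q99999</accession>
  <name>EXAMPLE_HUMAN</name>
  <dbReference type="RefSeq" id="NP_000001.1"/>
  <dbReference type="PDB" id="1ABC"/>
</entry>
"""


def get_text(elem, path, default=None):
    """Hypothetical XML helper: stripped text of the first match, or default."""
    found = elem.find(path, NS)
    if found is None or found.text is None:
        return default
    return found.text.strip()


def extract_identifiers(entry):
    """Hypothetical shared extractor: (source, identifier) pairs for one entry."""
    identifiers = []
    # Accessions come first, in document order.
    for acc in entry.findall("up:accession", NS):
        if acc.text:
            identifiers.append(("UniProt", acc.text.strip()))
    # Cross-references carry their source database in the "type" attribute.
    for ref in entry.findall("up:dbReference", NS):
        db, ref_id = ref.get("type"), ref.get("id")
        if db and ref_id:
            identifiers.append((db, ref_id))
    return identifiers


def parse_entry(xml_text):
    """Hypothetical entry parser that delegates to the two helpers above."""
    entry = ET.fromstring(xml_text)
    return {
        "name": get_text(entry, "up:name"),
        "identifiers": extract_identifiers(entry),
    }


if __name__ == "__main__":
    print(parse_entry(SAMPLE_ENTRY))
```

Keeping the identifier logic in one function means other parsers can reuse it rather than duplicating the accession and cross-reference handling, and each helper can be unit-tested in isolation.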

This refactor provides a foundation for future expansion (features, evidence, associations, and publications) while improving maintainability and reducing duplicated logic.

@ialarmedalien ialarmedalien changed the base branch from main to develop December 4, 2025 16:27
@ialarmedalien ialarmedalien changed the base branch from develop to main December 4, 2025 16:30
@ialarmedalien ialarmedalien force-pushed the uniprot-refactor-v2 branch 2 times, most recently from 3b89f65 to 2e45b47 Compare December 4, 2025 17:03
@ialarmedalien ialarmedalien changed the base branch from main to develop December 4, 2025 17:03
@ialarmedalien ialarmedalien force-pushed the uniprot-refactor-v2 branch 3 times, most recently from 2a781b3 to bba5e5a Compare December 10, 2025 22:17
@alinakbase alinakbase force-pushed the uniprot-refactor-v2 branch 2 times, most recently from ec65f68 to bfbf335 Compare December 22, 2025 23:37
if os.path.exists(tmp_path):
    try:
        os.remove(tmp_path)
    except Exception:
        ...  # handler body not shown in this excerpt
3 participants