Checkpoint support #9

Closed
gtsitsiridis wants to merge 30 commits into main from checkpoint_support

@gtsitsiridis
Contributor

This pull request adds model checkpointing, more flexible configuration, training-stability options, and logging to the PROTRIDER codebase. The most significant changes are summarized below.

Model Checkpointing and Configuration Enhancements:

  • Added support for model checkpointing via a new checkpoint_path config option, allowing users to save and reuse trained models, with documentation and config file updates (README.md, config.yaml, src/protrider/config.py).
  • Updated configuration options to include early stopping parameters (patience, min_delta), common degrees of freedom for statistical testing, and more flexible latent dimension selection (find_q_method now accepts "bs", i.e. binary search).
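
For readers unfamiliar with the patience / min_delta convention, here is a minimal sketch of how such early stopping is commonly implemented. It is not the code from this PR: the `EarlyStopping` class, its attribute names other than `patience` and `min_delta`, and the `epochs_run` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopping:
    """Stop once the loss has failed to improve by min_delta for `patience` epochs.

    Hypothetical helper for illustration; not PROTRIDER's implementation.
    """

    patience: int = 5
    min_delta: float = 0.0
    best_loss: float = float("inf")
    epochs_without_improvement: int = 0

    def step(self, loss: float) -> bool:
        """Record one epoch's loss and return True if training should stop."""
        if loss < self.best_loss - self.min_delta:
            # Improvement large enough to count: remember it and reset the counter.
            self.best_loss = loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


def epochs_run(losses, patience, min_delta):
    """Return how many epochs would run before early stopping triggers."""
    stopper = EarlyStopping(patience=patience, min_delta=min_delta)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(losses)
```

With a larger min_delta, small improvements no longer reset the counter, so training stops sooner; with min_delta set to 0, any decrease in loss counts as progress.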

Training Stability and Logging:

  • Added optional Weights & Biases (wandb) logging support, including config options, dependency management, and logging hooks in training loops.
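
A common way to keep wandb optional is to import it only when logging is requested and to fall back to a no-op logger otherwise. The sketch below illustrates that pattern under stated assumptions: `make_logger` and its `use_wandb` / `project` parameters are hypothetical names, not the config options or hooks added in this PR. Only `wandb.init` and `wandb.log` are real wandb calls.

```python
from typing import Callable, Dict, Optional

# A logger takes a dict of metrics and the current step.
LogFn = Callable[[Dict[str, float], int], None]


def make_logger(use_wandb: bool, project: Optional[str] = None) -> LogFn:
    """Return a logging function; wandb is imported only if it is requested.

    Hypothetical helper for illustration; not PROTRIDER's implementation.
    """
    if not use_wandb:
        def log_nothing(metrics: Dict[str, float], step: int) -> None:
            # Logging disabled: accept the same arguments and do nothing.
            return None

        return log_nothing

    try:
        import wandb  # optional dependency, only needed on this branch
    except ImportError as err:
        raise RuntimeError(
            "wandb logging was requested but the wandb package is not installed"
        ) from err

    wandb.init(project=project)

    def log_to_wandb(metrics: Dict[str, float], step: int) -> None:
        wandb.log(metrics, step=step)

    return log_to_wandb
```

A training loop can then call `log({"loss": loss}, epoch)` unconditionally, and users without the wandb extra installed are unaffected as long as logging stays disabled.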

File Handling and Output Improvements:

  • Improved handling of Parquet files by specifying the fastparquet engine and disabling index inference for consistency.
  • Enhanced CLI output: now saves fit parameters and grid search results when available.

Internal Refactoring and API Additions:

  • Refactored model initialization and PCA-based weight initialization for improved clarity and flexibility with multiple layers.
  • Added a GridSearchResult dataclass for structured grid search output and saving results to CSV.
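
To illustrate the idea of a structured grid search result that can be written to CSV, here is a small standard-library sketch. The actual fields of GridSearchResult are not shown in this conversation, so the names `GridSearchRow`, `GridSearchResultSketch`, `q`, `score`, `best_q`, and `to_csv` are all assumptions made for this example.

```python
import csv
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class GridSearchRow:
    """One evaluated candidate (hypothetical fields)."""

    q: int        # candidate latent dimension
    score: float  # evaluation score for this candidate; higher is better here


@dataclass
class GridSearchResultSketch:
    """Illustrative container; not the GridSearchResult class from this PR."""

    rows: List[GridSearchRow]

    @property
    def best_q(self) -> int:
        """Latent dimension with the highest score."""
        return max(self.rows, key=lambda row: row.score).q

    def to_csv(self, path: str) -> None:
        """Write one CSV line per evaluated candidate, with a header row."""
        column_names = [f.name for f in fields(GridSearchRow)]
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=column_names)
            writer.writeheader()
            for row in self.rows:
                writer.writerow(asdict(row))
```

Returning a dataclass rather than a bare tuple lets the CLI persist the full search trace without knowing how the search was performed.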

These changes collectively improve user experience, reproducibility, and model training robustness.

George Tsitsiridis and others added 28 commits January 22, 2026 11:44
… class and update related logic in the pipeline"

This reverts commit 72717d9.
…rning rate scheduler in training functions
Copilot AI review requested due to automatic review settings April 8, 2026 15:05

Copilot AI left a comment

Pull request overview

This PR adds model checkpointing and optional Weights & Biases logging, refactors residual-fitting/p-value computation to surface fitted parameters, and extends latent-dimension selection (incl. binary search) while updating CLI outputs and configuration.

Changes:

  • Add checkpoint save/load support and optional wandb logging during training.
  • Refactor stats fitting to return structured fit parameters; add grid-search results export.
  • Update config/CLI/docs/tests and dependency metadata to support new options (wandb extra, new config fields).
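
One plausible reading of "save and reuse" is a load-or-train pattern keyed on checkpoint_path. The sketch below shows that pattern only; whether the PR behaves exactly this way is an assumption. `load_or_train` is a hypothetical function, and `pickle` is used purely as a standard-library stand-in for whatever serialization the project uses (the ignored .pt files suggest PyTorch, but that is not confirmed here).

```python
import pickle
from pathlib import Path
from typing import Any, Callable, Optional


def load_or_train(checkpoint_path: Optional[str], train: Callable[[], Any]) -> Any:
    """Reuse a saved model if the checkpoint exists; otherwise train and save.

    Hypothetical helper for illustration; not PROTRIDER's implementation.
    """
    if checkpoint_path is not None and Path(checkpoint_path).exists():
        with open(checkpoint_path, "rb") as handle:
            return pickle.load(handle)  # stand-in for the real checkpoint loader

    model = train()

    if checkpoint_path is not None:
        # Create the target directory so a fresh path can be used directly.
        Path(checkpoint_path).parent.mkdir(parents=True, exist_ok=True)
        with open(checkpoint_path, "wb") as handle:
            pickle.dump(model, handle)  # stand-in for the real checkpoint writer
    return model
```

Leaving checkpoint_path unset keeps the previous behavior in this sketch: the model is trained every time and nothing is written to disk.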

Reviewed changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| `uv.lock` | Adds wandb extra and its transitive deps; updates resolution markers. |
| `tests/test_pipeline_standard.py` | Adds assertions for a degrees-of-freedom output file. |
| `tests/test_pipeline_features.py` | Adds a test for non-common degrees of freedom using t-distribution. |
| `tests/test_pipeline_cv.py` | Removes cross-validation pipeline tests. |
| `tests/test_model_save_load.py` | Introduces checkpoint save/load and reuse tests. |
| `tests/test_config.py` | Removes CV-related fields from “all fields” config test. |
| `src/protrider/stats.py` | Introduces FitParameters dataclass; changes residual fitting / p-value API. |
| `src/protrider/pipeline.py` | Adds checkpoint IO + wandb integration; changes run() return signature; removes CV runner code. |
| `src/protrider/model/model.py` | Refines multi-layer module construction, PCA init for multi-layer, and early stopping/scheduler/wandb hooks in training. |
| `src/protrider/model/model_helper.py` | Adds GridSearchResult, returns it from find_latent_dim, and adds “bs” search method. |
| `src/protrider/model/__init__.py` | Re-exports GridSearchResult. |
| `src/protrider/datasets/protein_intensities.py` | Changes Parquet read settings (engine specified). |
| `src/protrider/config.py` | Adds checkpoint, wandb, early stopping, and common DF options; removes CV options; adds “bs” to q-method validation. |
| `src/protrider/cli.py` | Updates CLI to handle and persist fit parameters + grid search results. |
| `README.md` | Documents checkpoint_path and checkpoint behavior. |
| `pyproject.toml` | Adds wandb as an optional dependency group. |
| `config.yaml` | Adds checkpoint + wandb config knobs; adds common DF flag; removes CV section. |
| `.gitignore` | Ignores wandb outputs and checkpoint .pt files. |

George Tsitsiridis added 2 commits April 8, 2026 18:38
@gtsitsiridis
Contributor Author

@copilot check updates
