
Conversation

@bejaeger (Contributor) commented Dec 31, 2025

Changes

Adds regressor fine-tuning, and a common base class shared by the classifier and regressor fine-tuning wrappers.

Follow-ups

  • Add RPS loss as an option
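The RPS follow-up could take the following shape. Assuming RPS here means the ranked probability score computed over the model's discretized (bar distribution) output, a minimal NumPy sketch might look like this (the function name, shapes, and nearest-bin target encoding are illustrative assumptions, not the actual TabPFN API):

```python
import numpy as np

def rps_loss(probs: np.ndarray, target_bin: np.ndarray) -> float:
    """Mean ranked probability score for binned predictions.

    probs: (n, k) predicted probabilities over k ordered bins.
    target_bin: (n,) index of the bin containing each true target.
    """
    n, k = probs.shape
    cdf_pred = np.cumsum(probs, axis=1)
    # The target CDF is a step function: 0 before the true bin, 1 from it on.
    cdf_true = (np.arange(k)[None, :] >= target_bin[:, None]).astype(float)
    return float(np.mean(np.sum((cdf_pred - cdf_true) ** 2, axis=1)))
```

Unlike a plain NLL, the RPS penalizes probability mass in proportion to its ordinal distance from the true bin, which is why it is a natural candidate for a binned regression target.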

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a significant and well-executed refactoring by adding a common base class FinetunedTabPFNBase for fine-tuning, and implementing FinetunedTabPFNRegressor alongside the refactored FinetunedTabPFNClassifier. The use of dataclasses for batch data greatly improves code clarity. The changes are well-structured and enhance maintainability. I have one suggestion regarding a previously existing safety check for classification tasks that seems to have been lost during the refactoring.

@bejaeger bejaeger marked this pull request as ready for review January 2, 2026 10:51
@bejaeger bejaeger requested a review from a team as a code owner January 2, 2026 10:51
@bejaeger bejaeger requested review from brendan-priorlabs and Copilot and removed request for a team January 2, 2026 10:51
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. Credits must be used to enable repository-wide code reviews.

Copilot AI left a comment


Pull request overview

This PR adds regressor fine-tuning functionality to TabPFN and introduces a common base class to share training logic between classifier and regressor fine-tuning.

Key changes:

  • Created FinetunedTabPFNBase abstract base class containing shared fine-tuning logic
  • Added FinetunedTabPFNRegressor with MSE-based evaluation and bar distribution loss
  • Refactored FinetunedTabPFNClassifier to inherit from the base class
  • Introduced ClassifierBatch and RegressorBatch dataclasses for clearer batch structure
  • Updated tests to use new batch dataclasses and added regressor-specific tests
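The shared-base-class pattern in the key changes above can be sketched as follows. The class and method names are hypothetical stand-ins, not the real TabPFN implementation: the point is that the base class owns the training loop while each subclass supplies only its task-specific loss.

```python
from abc import ABC, abstractmethod

class FinetunedBaseSketch(ABC):
    """Illustrative shared fine-tuning loop; subclasses supply the loss."""

    def fit_loop(self, batches, epochs: int = 1) -> list[float]:
        history = []
        for _ in range(epochs):
            epoch_loss = sum(self.training_loss(b) for b in batches)
            history.append(epoch_loss / len(batches))
        return history

    @abstractmethod
    def training_loss(self, batch) -> float:
        """Task-specific loss, e.g. cross-entropy or bar-distribution NLL."""

class RegressorSketch(FinetunedBaseSketch):
    def training_loss(self, batch) -> float:
        pred, target = batch  # placeholder for a real forward pass
        return (pred - target) ** 2
```

In the real code the loop would also cover optimizer steps, LR scheduling, and early stopping, per the base-class description below; the sketch keeps only the inversion of control that makes the refactoring work.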

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/tabpfn/finetuning/finetuned_base.py: New abstract base class containing the shared fine-tuning loop, optimizer setup, learning-rate scheduling, and early-stopping logic
  • src/tabpfn/finetuning/finetuned_regressor.py: New regressor fine-tuning wrapper implementing bar-distribution loss with an optional MSE auxiliary term
  • src/tabpfn/finetuning/finetuned_classifier.py: Refactored to extend the base class; removed duplicate training logic and unified attribute naming to finetuned_estimator_
  • src/tabpfn/finetuning/data_util.py: Added ClassifierBatch and RegressorBatch dataclasses; updated collation logic to work with structured batches
  • src/tabpfn/finetuning/__init__.py: Added exports for the new public API classes
  • tests/test_finetuning_regressor.py: Streamlined tests focusing on regressor-specific functionality with mocked forward passes
  • tests/test_finetuning_classifier.py: Updated tests to use the new ClassifierBatch dataclass structure
  • examples/finetune_regressor.py: Added an example demonstrating regressor fine-tuning on the California Housing dataset
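The regressor loss described for finetuned_regressor.py, a bar-distribution loss with an optional MSE auxiliary term, plausibly combines a negative log-likelihood over bins with a squared error on the distribution mean. A hedged NumPy sketch (the function name and the nearest-center bin assignment are simplifications; the real implementation works with proper bin edges):

```python
import numpy as np

def regressor_loss_sketch(probs, bin_centers, y, mse_weight=0.1):
    """Illustrative combined loss: NLL of the target's bin, plus a weighted
    MSE between the predicted distribution mean and the target."""
    # Simplification: assign each target to the nearest bin center.
    idx = np.abs(y[:, None] - bin_centers[None, :]).argmin(axis=1)
    nll = -np.log(probs[np.arange(len(y)), idx] + 1e-12).mean()
    # Distribution mean as the probability-weighted average of bin centers.
    pred_mean = probs @ bin_centers
    mse = ((pred_mean - y) ** 2).mean()
    return nll + mse_weight * mse
```

The auxiliary MSE term nudges the distribution's point estimate toward the target even when the NLL gradient is weak, which matches the "optional MSE auxiliary term" wording in the file summary.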


@bejaeger (Contributor, Author) commented Jan 2, 2026

/gemini review

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a fine-tuning wrapper for regressors, FinetunedTabPFNRegressor, and refactors the existing classifier fine-tuning logic into a common base class, FinetunedTabPFNBase. This is an excellent architectural improvement that significantly reduces code duplication and enhances maintainability. The introduction of ClassifierBatch and RegressorBatch dataclasses for handling data batches is another great change that improves code clarity. The implementation is solid, and the new regressor fine-tuning capability is a valuable addition. I have one suggestion to improve test coverage for the new regressor data pipeline.

@bejaeger bejaeger requested a review from noahho January 5, 2026 17:19
@noahho (Collaborator) left a comment


LGTM after taking a look at my comments and deciding whether they are helpful

@bejaeger (Contributor, Author) commented Jan 7, 2026

Thanks a lot for the thorough reviews and useful feedback! I addressed all comments.

@bejaeger bejaeger merged commit 8118b48 into main Jan 7, 2026
11 checks passed