Add regressor finetuning wrapper #705
Merged
Commits (41)
86d4e0d  add regressor finetuning wrapper (bejaeger)
c62f32d  add abstract base class for finetuning wrappers (bejaeger)
adf2abd  cleaner handling of batches (bejaeger)
11cf7cf  tweaks (bejaeger)
b2b5fe1  fix tests (bejaeger)
ab77196  add finetuning regressor tests (bejaeger)
e4877e1  Merge branch 'main' into ben/add-regressor-finetuning-wrapper (bejaeger)
7db6717  make finetuning deterministic (bejaeger)
b1af62f  better regressor training settings (bejaeger)
6c5e159  refactor loss calc (bejaeger)
9f9ab18  Merge branch 'ben/add-regressor-finetuning-wrapper' of github.com:Pri… (bejaeger)
4170cde  cleanup (bejaeger)
3adf058  comment update (bejaeger)
0d8e651  add test (bejaeger)
ac003b6  revision (bejaeger)
1c9fc95  revision 2 (bejaeger)
4d19626  add comment (bejaeger)
3304175  remove configurations for finetuning examples (bejaeger)
6289d08  add back one parameter for better performance on example (bejaeger)
55fcc15  minor tweaks (bejaeger)
382e41a  add regressor finetuning wrapper (bejaeger)
8c0e186  add abstract base class for finetuning wrappers (bejaeger)
e5b72cc  cleaner handling of batches (bejaeger)
aab7f15  tweaks (bejaeger)
6712c1a  fix tests (bejaeger)
c3b644c  add finetuning regressor tests (bejaeger)
a2f80c8  refactor loss calc (bejaeger)
6403ae9  make finetuning deterministic (bejaeger)
81ec221  better regressor training settings (bejaeger)
b473d58  cleanup (bejaeger)
0da205b  comment update (bejaeger)
5b4a9f1  add test (bejaeger)
4cd088b  revision (bejaeger)
6ae3420  revision 2 (bejaeger)
6991cc0  add comment (bejaeger)
cb18bc7  remove configurations for finetuning examples (bejaeger)
1cbfb6f  add back one parameter for better performance on example (bejaeger)
c699c3c  minor tweaks (bejaeger)
c798691  Merge branch 'main' into ben/add-regressor-finetuning-wrapper (bejaeger)
266b8b7  Merge branch 'ben/add-regressor-finetuning-wrapper' of github.com:Pri… (bejaeger)
6027e00  remove old file that sneaked in (bejaeger)
@@ -0,0 +1,118 @@
"""Example of fine-tuning a TabPFN regressor using the FinetunedTabPFNRegressor wrapper.

Note: We recommend running the fine-tuning scripts on a CUDA-enabled GPU, as full
support for the Apple Silicon (MPS) backend is still under development.
"""

import logging
import warnings

import sklearn.datasets
import torch
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNRegressor
from tabpfn.finetuning.finetuned_regressor import FinetunedTabPFNRegressor

warnings.filterwarnings(
    "ignore",
    category=FutureWarning,
    module=r"google\.api_core\._python_version_support",
)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

# =============================================================================
# Fine-tuning Configuration
# For details and more options see FinetunedTabPFNRegressor
#
# These settings work well for the California Housing dataset.
# For other datasets, you may need to adjust these settings to get good results.
# =============================================================================

# Training hyperparameters
NUM_EPOCHS = 30
LEARNING_RATE = 1e-5

# We can fine-tune using almost the entire housing dataset
# in the context of the train batches.
N_FINETUNE_CTX_PLUS_QUERY_SAMPLES = 20_000

# Ensemble configuration
# number of estimators to use during finetuning
NUM_ESTIMATORS_FINETUNE = 8
# number of estimators to use during train time validation
NUM_ESTIMATORS_VALIDATION = 8
# number of estimators to use during final inference
NUM_ESTIMATORS_FINAL_INFERENCE = 8

# Reproducibility
RANDOM_STATE = 0


def main() -> None:
    data = sklearn.datasets.fetch_california_housing(as_frame=True)
    X_all = data.data
    y_all = data.target

    X_train, X_test, y_train, y_test = train_test_split(
        X_all, y_all, test_size=0.1, random_state=RANDOM_STATE
    )

    print(
        f"Loaded {len(X_train):,} samples for training and "
        f"{len(X_test):,} samples for testing."
    )

    # 2. Initial model evaluation on test set
    base_reg = TabPFNRegressor(
        device=[f"cuda:{i}" for i in range(torch.cuda.device_count())],
        n_estimators=NUM_ESTIMATORS_FINAL_INFERENCE,
        ignore_pretraining_limits=True,
        inference_config={"SUBSAMPLE_SAMPLES": 50_000},
    )
    base_reg.fit(X_train, y_train)

    base_pred = base_reg.predict(X_test)
    mse = mean_squared_error(y_test, base_pred)
    r2 = r2_score(y_test, base_pred)

    print(f"📊 Default TabPFN Test MSE: {mse:.4f}")
    print(f"📊 Default TabPFN Test R²: {r2:.4f}\n")

    # 3. Initialize and run fine-tuning
    print("--- 2. Initializing and Fitting Model ---\n")

    # Instantiate the wrapper with your desired hyperparameters
    finetuned_reg = FinetunedTabPFNRegressor(
        device="cuda" if torch.cuda.is_available() else "cpu",
        epochs=NUM_EPOCHS,
        learning_rate=LEARNING_RATE,
        random_state=RANDOM_STATE,
        n_finetune_ctx_plus_query_samples=N_FINETUNE_CTX_PLUS_QUERY_SAMPLES,
        n_estimators_finetune=NUM_ESTIMATORS_FINETUNE,
        n_estimators_validation=NUM_ESTIMATORS_VALIDATION,
        n_estimators_final_inference=NUM_ESTIMATORS_FINAL_INFERENCE,
    )

    # 4. Call .fit() to start the fine-tuning process on the training data
    finetuned_reg.fit(X_train.values, y_train.values)
    print("\n")

    # 5. Evaluate the fine-tuned model
    print("--- 3. Evaluating Model on Held-out Test Set ---\n")
    y_pred = finetuned_reg.predict(X_test.values)

    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    print(f"📊 Finetuned TabPFN Test MSE: {mse:.4f}")
    print(f"📊 Finetuned TabPFN Test R²: {r2:.4f}")


if __name__ == "__main__":
    main()
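For orientation, here is a condensed sketch of the same fit/predict workflow on scikit-learn's much smaller diabetes dataset. The FinetunedTabPFNRegressor arguments are the ones used in the example above; the specific values chosen here (epochs, estimator counts, context size) are placeholder assumptions for a tiny dataset rather than tuned settings, in line with the example's note that other datasets may need different configuration.

import torch
from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

from tabpfn.finetuning.finetuned_regressor import FinetunedTabPFNRegressor

# Small regression dataset (~440 rows) instead of California Housing.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

reg = FinetunedTabPFNRegressor(
    device="cuda" if torch.cuda.is_available() else "cpu",
    epochs=5,  # placeholder; the example above uses 30
    learning_rate=1e-5,
    random_state=0,
    n_finetune_ctx_plus_query_samples=300,  # placeholder; must fit within the ~350 training rows
    n_estimators_finetune=4,  # placeholders; the example above uses 8 for all three
    n_estimators_validation=4,
    n_estimators_final_inference=4,
)
reg.fit(X_train, y_train)  # runs the fine-tuning loop on the training data
print(f"Finetuned R²: {r2_score(y_test, reg.predict(X_test)):.4f}")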
@@ -0,0 +1,15 @@
"""Single-dataset fine-tuning wrappers for TabPFN models."""

from tabpfn.finetuning.data_util import ClassifierBatch, RegressorBatch
from tabpfn.finetuning.finetuned_base import EvalResult, FinetunedTabPFNBase
from tabpfn.finetuning.finetuned_classifier import FinetunedTabPFNClassifier
from tabpfn.finetuning.finetuned_regressor import FinetunedTabPFNRegressor

__all__ = [
    "ClassifierBatch",
    "EvalResult",
    "FinetunedTabPFNBase",
    "FinetunedTabPFNClassifier",
    "FinetunedTabPFNRegressor",
    "RegressorBatch",
]
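The package also exports FinetunedTabPFNClassifier. The sketch below illustrates how it could be used on a classification task, assuming its interface mirrors the regressor wrapper's scikit-learn-style fit/predict API with analogous constructor arguments; the classifier's exact signature is not shown in this excerpt, so treat the parameters as hypothetical.

import torch
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from tabpfn.finetuning import FinetunedTabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Constructor arguments assumed by analogy with FinetunedTabPFNRegressor;
# verify against the classifier wrapper's actual signature before use.
clf = FinetunedTabPFNClassifier(
    device="cuda" if torch.cuda.is_available() else "cpu",
    epochs=5,
    learning_rate=1e-5,
    random_state=0,
)
clf.fit(X_train, y_train)
print(f"Accuracy: {accuracy_score(y_test, clf.predict(X_test)):.4f}")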