
Conversation

@jonaslandsgesell

Motivation and Context

In this PR, we (Jonas Landsgesell and Pascal Knoll) implement the Ranked Probability Score (RPS) for fine-tuning TabPFN.

The Ranked Probability Score (RPS) is a distance-sensitive scoring rule for distributional regression, unlike the Cross Entropy (CE) loss, which is a local scoring rule: RPS penalizes probability mass placed near the true target bin less than mass placed far from it, while CE only looks at the probability assigned to the true bin itself. This makes RPS a natural fit for the ordinal structure of the regression setting.
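To make this concrete, here is a minimal sketch of an RPS loss over discretized regression targets, assuming `outputs` are probabilities over ordered bins (e.g., after a softmax) and `targets` are the indices of the true bins; the actual implementation in this PR may differ in details such as normalization:

```python
import torch

def rps_loss(outputs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Ranked Probability Score for a batch of discretized predictions.

    outputs: (batch, num_bins) probabilities over ordered bins.
    targets: (batch,) integer index of the true bin.
    """
    targets = targets.long()
    pred_cdf = torch.cumsum(outputs, dim=1)  # predicted CDF over the bins
    target_one_hot = torch.nn.functional.one_hot(
        targets, num_classes=outputs.shape[1]
    ).to(outputs.dtype)
    target_cdf = torch.cumsum(target_one_hot, dim=1)  # step-function CDF of the label
    # Squared CDF differences, summed over bins, averaged over the batch.
    return ((pred_cdf - target_cdf) ** 2).sum(dim=1).mean()
```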

We would love to see how a TabPFN model pretrained with the RPS loss performs compared to a TabPFN model pretrained with the Cross Entropy loss. Preliminary evaluation on the regression fine-tuning example, run with multiple seeds, shows benefits of RPS over CE.

Below you can see our RPS vs. CE evaluation on a custom NN architecture (not TabPFN) on holdout test sets for different datasets:

Better MAE with RPS-trained NN than CE-trained NN: [figure]

Better R2 with RPS-trained NN than CE-trained NN: [figure]

Better RPS with RPS-trained NN than CE-trained NN: [figure]

General Literature:

Public API Changes

  • [x] No Public API changes
  • [ ] Yes, Public API changes (Details below)

How Has This Been Tested?

Yes, via the example fine-tuning script for regression (the classification path is unchanged).

Checklist

  • [x] The changes have been tested locally.
  • Documentation has been updated (if the public API or usage changes).
  • An entry has been added to CHANGELOG.md (if relevant for users).
  • The code follows the project's style guidelines.
  • I have considered the impact of these changes on the public API.

🎅🎄

Co-authored-by: Jonas Landsgesell <jonaslandsgesell@gmail.com>
Co-authored-by: Pascal Knoll <knollpascal00@gmail.com>
@jonaslandsgesell requested a review from a team as a code owner on December 19, 2025 16:00
@jonaslandsgesell requested review from noahho and removed the request for a team on December 19, 2025 16:00
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository-wide code reviews.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Jonas Landsgesell does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@gemini-code-assist (bot), Contributor, left a comment


Code Review

This pull request introduces the Ranked Probability Score (RPS) for fine-tuning, which is a valuable enhancement for distributional regression. The implementation of the rps_loss function is clear, well-documented, and correctly integrated into the finetune_regressor.py example. The associated refactoring to use regressor.model_ improves the code structure. I have one minor suggestion to make the one-hot encoding in the loss function more idiomatic. Overall, this is a high-quality contribution.

targets = targets.long()  # bin indices as integers for indexing

pred_cdf = torch.cumsum(outputs, dim=1)  # predicted CDF over the ordered bins
target_one_hot = torch.zeros_like(outputs).scatter_(1, targets.unsqueeze(1), 1.)  # one-hot of the true bin

Severity: medium

For creating the one-hot encoded target tensor, consider using torch.nn.functional.one_hot. It's more idiomatic for this operation and can make the code's intent clearer. This function is specifically designed for one-hot encoding and might offer better performance.

Suggested change:
- target_one_hot = torch.zeros_like(outputs).scatter_(1, targets.unsqueeze(1), 1.)
+ target_one_hot = torch.nn.functional.one_hot(targets, num_classes=outputs.shape[1]).to(outputs.dtype)
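For reference, a quick standalone check (not part of the PR) that the two constructions produce identical tensors:

```python
import torch

outputs = torch.rand(4, 10)               # dummy per-bin scores
targets = torch.randint(0, 10, (4,))      # true bin indices

via_scatter = torch.zeros_like(outputs).scatter_(1, targets.unsqueeze(1), 1.0)
via_one_hot = torch.nn.functional.one_hot(
    targets, num_classes=outputs.shape[1]
).to(outputs.dtype)

assert torch.equal(via_scatter, via_one_hot)  # both yield the same one-hot matrix
```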

@bejaeger (Contributor) commented on Jan 7, 2026

Hi @jonaslandsgesell and Pascal!
Thanks again for the great work here. We took your suggestions and implemented them in a new PR, #711, to be compatible with our refactored fine-tuning code. We also added the ranked logarithmic score as an option.

We’d love to get your eyes on the new version to make sure we’ve captured your intent correctly. Closing this PR now so we can centralize the discussion in #711!

@bejaeger closed this on Jan 7, 2026