Add TabArena single-table OpenML datasets #369
pc0618 wants to merge 8 commits into snap-stanford:main from
Conversation
CI is green and the PR should be ready for review. Pinging @rishabh-ranjan and @matthiasf.
@JustinGu32 please check that everything looks alright. We should support downloading from the server by uploading the zip files. I don't understand what's going on with the folds; I will get Vignesh's opinion on it.
@rishabh-ranjan TabArena is built on OpenML tasks, which already have predefined CV resampling splits (folds) instead of timestamp splits. The fold-N suffix names which OpenML train/test split we use. The underlying target is the same across folds; only the partitioning changes. The timestamps in the task tables are synthetic and only exist to fit the RelBench interface.
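The fold semantics described in this comment can be sketched with scikit-learn (this is an illustrative stand-in, not the actual OpenML API; OpenML ships its split indices with the task, while `StratifiedKFold` here just generates analogous partitions):

```python
# Sketch: predefined CV folds partition the same rows differently,
# while the target stays fixed across folds.
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Stand-in for OpenML's predefined 10-fold CV resampling
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
folds = list(skf.split(X, y))

# "fold-N" selects the N-th predefined train/test partition
train_idx_0, test_idx_0 = folds[0]
train_idx_1, test_idx_1 = folds[1]

# Each fold covers all rows exactly once between train and test,
# and different folds hold out disjoint test partitions.
assert len(train_idx_0) + len(test_idx_0) == len(y)
assert set(test_idx_0).isdisjoint(set(test_idx_1))
```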
Implemented the requested refactor on this PR:
On the AutoCompleteTask point: I kept a custom task class because AutoCompleteTask is time-window based, while TabArena uses predefined OpenML split indices; this keeps the split semantics exact. Branch is updated.
Follow-up update pushed to
for more information, see https://pre-commit.ci
TabArena datasets are generated locally (from OpenML) and cached under `~/.cache/relbench/tabarena-*/`. Passing `download=True` will skip the RelBench server download step for these datasets/tasks.
For an end-to-end PluRel-16B TabArena inference runbook (including `split-*` task naming, random sampling behavior, and `seq_len=2048/4096` commands), see:
please remove both these files from examples. you can make a PR to the internal RT repo for these files.
Done. I removed the internal RT/PluRel runbook from examples and removed the README reference to it. The PR now only contains public-facing example scripts.
**Using TabArena datasets**
please rewrite this section to strictly follow the other integration sections (e.g. add citation) in the readme and remove any extra material (e.g. the line about download=True).
Updated. The TabArena README section now follows the same style as the other integrations: install line, short description, links to public example scripts, and a citation block. I also removed the extra `download=True` note.
- task names are `split-N` (not `fold-N`)
- task tables are edge-free (`fkey_col_to_pkey_table = {}`)
- no synthetic task timestamps (`time_col=None`)
these seem to be internal notes that can be removed
This seems to be an internal exploration file with inference experiment details. It can be removed from the public code.
For an end-to-end PluRel-16B TabArena inference runbook (including `split-*` task naming, random sampling behavior, and `seq_len=2048/4096` commands), see:
[`examples/tabarena_plurel16b_inference.md`](examples/tabarena_plurel16b_inference.md)
The example file needs to be rewritten based only on publicly available data and code.
I asked above to simply remove these example files entirely.
We should include examples of how to use these tabular datasets, especially since splits are involved.
Reproducing the results with TabPFN on our dataset will help ensure people can trust the data.
Done. The old internal example was replaced with public examples built only around OpenML data and standard public Python packages. examples/translate_tabarena_to_relbench.py now compares the original OpenML task with the relbenchified records and split-* tables.
Added. examples/translate_tabarena_to_relbench.py now shows how the source rows map to records, how the RelBench test split matches the OpenML test split, and how RelBench train+val partition the OpenML train side. I also added examples/validate_tabarena_baseline.py, which trains a public baseline on the original OpenML rows and verifies that the relbenchified view produces identical features, labels, predictions, and metrics. I did not add an RT example, in line with the later feedback that this does not belong in RelBench.
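The equality check that `examples/validate_tabarena_baseline.py` is described as performing can be sketched with pandas (toy frames below stand in for the real OpenML/TabArena data; the actual script's names and structure may differ):

```python
# Sketch: verify that a "relbenchified" view reproduces the original
# rows exactly, up to row order. Toy data, not real TabArena rows.
import pandas as pd
import pandas.testing as pdt

# Original (OpenML-style) rows
original = pd.DataFrame(
    {"f1": [0.1, 0.2, 0.3], "f2": [1, 2, 3], "target": [0, 1, 0]}
)

# Relbenchified view: same content, possibly reordered/reindexed
relbenchified = original.sample(frac=1.0, random_state=0).reset_index(drop=True)


def canonical(df: pd.DataFrame) -> pd.DataFrame:
    # Sort rows into a canonical order so the comparison is order-free
    return df.sort_values(list(df.columns)).reset_index(drop=True)


# Raises AssertionError (with a diff) if any cell or dtype differs
pdt.assert_frame_equal(canonical(original), canonical(relbenchified))
```

`assert_frame_equal` also checks dtypes, which catches silent int/float coercions introduced by a round-trip through a storage format.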
It would be great to have the following:
The RT script does not belong here (in RelBench). I agree with the other two scripts.
Pushed an update to the branch. This addresses the review feedback as follows:
I did not add an RT example, in line with the feedback that this does not belong in RelBench. Validation was run locally before pushing:
This PR adds TabArena single-table translations to RelBench via a lightweight OpenML-backed integration.
- Datasets: `tabarena-<slug>` (single `records` table).
- Tasks: `fold-<N>` (OpenML CV folds; train/val/test tables).
- Install: `pip install relbench[tabarena]` (adds `openml`).
- No server download for `tabarena-*`; datasets/tasks are generated locally from OpenML and cached under `~/.cache/relbench/tabarena-*/`.
- Example script: `examples/translate_tabarena_to_relbench.py`.

Example usage:
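As a rough illustration of the split semantics above, here is a pandas sketch of how a `fold-<N>` task could slice a single `records` table into train/val/test tables. The names (`records`, `fold_test_idx`) and the val-carving step are hypothetical stand-ins; the PR's actual loaders use OpenML's predefined split indices:

```python
# Sketch: derive train/val/test task tables from one records table
# using a predefined test-fold index set (toy data).
import numpy as np
import pandas as pd

records = pd.DataFrame({
    "record_id": range(10),
    "feature": np.arange(10) * 0.5,
    "target": [0, 1] * 5,
})

# Suppose OpenML's fold-0 marks these rows as the test partition
fold_test_idx = np.array([0, 3, 7])

test = records[records["record_id"].isin(fold_test_idx)]
train_side = records[~records["record_id"].isin(fold_test_idx)]

# Carve a validation set out of the OpenML train side, so that
# RelBench train + val partition the OpenML train split exactly
val = train_side.sample(n=2, random_state=0)
train = train_side.drop(val.index)

# Sanity checks: the three tables partition the records table,
# and the test table matches OpenML's fold exactly
assert len(train) + len(val) + len(test) == len(records)
assert set(test["record_id"]) == set(fold_test_idx)
```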