This repository contains the code and models for our paper:
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg*, Dimitris Tsipras*, Percy Liang, Gregory Valiant
Paper: http://arxiv.org/abs/2208.01066
This repository adapts the Garg et al. setup for our CS 182 project, focusing on linear and quadratic in-context learning. The project code and experiments here were written by Nils Valseth Selte, Dagny Streit, Justin Lee, and Hanna Rod.
@InProceedings{garg2022what,
title={What Can Transformers Learn In-Context? A Case Study of Simple Function Classes},
author={Shivam Garg and Dimitris Tsipras and Percy Liang and Gregory Valiant},
year={2022},
booktitle={arXiv preprint}
}

You can start by cloning our repository and following the steps below.
- Install the dependencies for our code using Conda. You may need to adjust the environment YAML file depending on your setup.

      conda env create -f environment.yml
      conda activate in-context-learning
      conda install "mkl<2024"

- Download model checkpoints and extract them in the current directory.

      wget https://github.com/dtsip/in-context-learning/releases/download/initial/models.zip
      unzip models.zip

- [Optional] If you plan to train, populate `conf/wandb.yaml` with your wandb info.
That's it! You can now explore our pre-trained models or train your own. The key entry points
are as follows (starting from src):
- The `eval.ipynb` notebook contains code to load our own pre-trained models, plot the pre-computed metrics, and evaluate them on new data.
- `train.py` takes as argument a configuration yaml from `conf` and trains the corresponding model. You can try `uv run python train.py --config conf/toy.yaml` for a quick linear run, or `uv run python train.py --config conf/toy-quadratic.yaml` for a quadratic toy run.
- `run_sweep.sh` trains three GPT-2 sizes on linear regression (`conf/linear_sweep/*`).
- `run_sweep_curriculum.sh` trains linear+quadratic dual curricula (`conf/dual_sweep/*`).
- `run_dual_eval.py` sweeps A/B context lengths for the dual-task checkpoints and writes CSVs to `src/results/`.
If you prefer not to use uv, activate your environment and replace `uv run python ...` with `python ...`.
We added scripts to probe how transformers transfer between linear and quadratic functions under different curricula.
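To make the two function classes concrete, here is a minimal sketch of how a linear and a quadratic in-context prompt could be sampled. It uses only the standard library and is not the repository's sampler: the names `sample_task` and `sample_prompt` are hypothetical, and the quadratic form (weights applied to element-wise squared inputs, scaled by 1/sqrt(3) so both classes have comparable output variance) is an assumption.

```python
import math
import random


def sample_task(n_dims, kind, rng):
    """Return f(x) for a freshly sampled Gaussian weight vector w.

    kind="linear":    f(x) = sum_i w_i * x_i
    kind="quadratic": f(x) = sum_i w_i * x_i**2 / sqrt(3)   (assumed form)
    """
    w = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
    if kind == "linear":
        return lambda x: sum(wi * xi for wi, xi in zip(w, x))
    if kind == "quadratic":
        return lambda x: sum(wi * xi * xi for wi, xi in zip(w, x)) / math.sqrt(3)
    raise ValueError(f"unknown task kind: {kind}")


def sample_prompt(n_dims, n_points, kind, rng):
    """Sample (x, y) pairs that all share one hidden function."""
    f = sample_task(n_dims, kind, rng)
    xs = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_points)]
    return [(x, f(x)) for x in xs]


rng = random.Random(0)
linear_prompt = sample_prompt(n_dims=5, n_points=10, kind="linear", rng=rng)
quadratic_prompt = sample_prompt(n_dims=5, n_points=10, kind="quadratic", rng=rng)
```

Each prompt is a list of `(x, y)` pairs generated by a single hidden function; the model's job is to predict `y` for a new `x` from the preceding pairs.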
Training
- Single baseline dual run: from `src/`, `uv run python train.py --config conf/training_dual_task.yaml` (`training.problem_type: dual`, `curriculum_type: random`).
- Curriculum sweeps: `bash run_sweep_curriculum.sh` trains sequential, mixed, and random dual curricula (checkpoints in `models/dual_*`).
- Linear-only baselines: `bash run_sweep.sh` runs the three linear regression configs in `conf/linear_sweep/`.
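This README does not spell out what each curriculum does, so the sketch below shows only one plausible reading of the three names and should not be taken as the behavior of `train.py`. The function `task_for_step` and the specific schedules (a half-and-half split for sequential, strict alternation for mixed, a uniform coin flip for random) are assumptions made for illustration.

```python
import random


def task_for_step(step, total_steps, curriculum_type, rng):
    """Pick the task family for one training step (illustrative only)."""
    if curriculum_type == "sequential":
        # Assumed: first half of training is linear, second half quadratic.
        return "linear" if step < total_steps // 2 else "quadratic"
    if curriculum_type == "mixed":
        # Assumed: the two families alternate step by step.
        return "linear" if step % 2 == 0 else "quadratic"
    if curriculum_type == "random":
        # Assumed: each step draws a family uniformly at random.
        return rng.choice(["linear", "quadratic"])
    raise ValueError(f"unknown curriculum_type: {curriculum_type}")


rng = random.Random(0)
sequential_schedule = [task_for_step(s, 4, "sequential", rng) for s in range(4)]
mixed_schedule = [task_for_step(s, 4, "mixed", rng) for s in range(4)]
random_schedule = [task_for_step(s, 4, "random", rng) for s in range(4)]
```

Check the configs under `conf/dual_sweep/` for the definitions actually used in the experiments.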
Evaluation
- After training, set the run IDs in `src/run_dual_eval.py`'s `models` dict and run `uv run python run_dual_eval.py`.
- The script evaluates both orders (linear context → quadratic query, and vice versa) across A/B context lengths, saving mean/SEM CSVs under `src/results/`.
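As a rough illustration of the sweep described above, the sketch below loops over both orders and a grid of A/B context lengths, then summarizes per-prompt errors as a mean and a standard error of the mean (SEM). It is not the code in `run_dual_eval.py`: the names `sweep` and `placeholder_error`, the column names, and the grid values are all hypothetical, and the error function is a random stand-in for a real model evaluation.

```python
import csv
import io
import math
import random
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def sweep(orders, lengths_a, lengths_b, n_prompts, error_fn):
    """Collect one mean/SEM row per (order, A length, B length) cell."""
    rows = []
    for order in orders:
        for len_a in lengths_a:
            for len_b in lengths_b:
                errors = [error_fn(order, len_a, len_b) for _ in range(n_prompts)]
                mean, sem = mean_and_sem(errors)
                rows.append({"order": order, "len_a": len_a, "len_b": len_b,
                             "mean": mean, "sem": sem})
    return rows


rng = random.Random(0)


def placeholder_error(order, len_a, len_b):
    # Stand-in for a real per-prompt squared error from a trained model.
    return rng.random() ** 2


rows = sweep(["linear_to_quadratic", "quadratic_to_linear"],
             lengths_a=[5, 10], lengths_b=[5, 10],
             n_prompts=32, error_fn=placeholder_error)

# Write to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["order", "len_a", "len_b", "mean", "sem"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Passing the error function in as an argument keeps the aggregation logic separate from the model, so the same loop can be reused once a real evaluation call is available.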
