Model run configurations for our 4 tests
- Download and tokenize the OpenWebText dataset.
$ python data/openwebtext/prepare.py
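Assuming this script follows the usual nanoGPT layout (an assumption; adjust the paths if this repo differs), it writes tokenized train.bin and val.bin files into data/openwebtext/, which is worth verifying before launching the long pretraining runs below:
$ ls -lh data/openwebtext/train.bin data/openwebtext/val.bin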
The following configurations can be run in parallel:
$ torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py --kernel_config=0 --out_dir=out-baseline
$ torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py --kernel_config=1 --out_dir=out-polynomial
$ torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py --kernel_config=2 --out_dir=out-periodic
$ torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py --kernel_config=3 --out_dir=out-gaussian
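Each run occupies all 8 GPUs of one node (--nproc_per_node=8), so "in parallel" here means separate machines. A hypothetical launcher sketch, parameterized by the kernel index passed as the first argument on each node:
# launch_pretrain.sh (hypothetical helper): run as `bash launch_pretrain.sh <0-3>`
kernels=(baseline polynomial periodic gaussian)
k=$1
torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py \
  --kernel_config=$k --out_dir=out-${kernels[$k]}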
- Download the ARC dataset and tokenize the ARC Corpus.
$ python data/arc/prepare.py
(Remark: if the script does not work properly, download the ARC dataset from this link and unzip the file into the `data/arc` folder. Then rerun the script.)
- Rename the folder containing each `ckpt.pt` to "out-arc-baseline", "out-arc-polynomial", "out-arc-periodic", or "out-arc-gaussian", respectively.
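For example, assuming the folders in question are the pretraining output directories above (out-baseline through out-gaussian), the renaming can be done in one loop; cp keeps the originals intact in case they are needed later:
# sketch: copy each pretraining out_dir (which holds ckpt.pt) to its ARC name
for k in baseline polynomial periodic gaussian; do
  cp -r out-$k out-arc-$k
done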
- Evaluate the models for each kernel_config (replace `///` with 0, 1, 2, or 3, matching the pretraining runs above):
  - Fine-tune the GPT-2 model.
$ python train.py config/finetune_arc.py --init_from=resume --kernel_config=///
  - Run evaluation.
$ python eval_arc.py --kernel_config=///
  - Repeat for the next model.
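The same pair of commands can be looped over all four configurations; a minimal sketch, assuming no flags beyond those shown above are required:
# sketch: ARC fine-tune + evaluation for every kernel configuration
for k in 0 1 2 3; do
  python train.py config/finetune_arc.py --init_from=resume --kernel_config=$k
  python eval_arc.py --kernel_config=$k
done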
- Evaluate the models on English-to-French translation (replace `///` with the init_from value for each model):
  - Fine-tune the GPT-2 model.
$ python train.py config/finetune_en-fr.py --init_from=///
  - Run evaluation.
$ python eval_BLEU.py
  - Repeat for the next model.
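As with ARC, the pair can be looped; a sketch in which INIT_VALUES is a hypothetical stand-in for the `///` placeholder above (fill in one init_from value per model):
# sketch: en-fr fine-tune + BLEU evaluation for each of the four models
INIT_VALUES=(/// /// /// ///)   # placeholders: one init_from value per model
for init in "${INIT_VALUES[@]}"; do
  python train.py config/finetune_en-fr.py --init_from=$init
  python eval_BLEU.py
done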