When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization

Download the data

All the data is uploaded to a Google Drive folder.

sample_data.pk contains the perturbed first paragraphs of the Wikipedia biographies that were used for the experiments in the paper.
all_summaries.pk contains the generated summaries for the Wikipedia biographies using all the models that we experimented with.
data_for_plot.pk contains the data needed to generate the plots in our paper.

To load the data, please use the load_data.ipynb notebook. It walks through how to load the data as well as how to compute hallucination rates and create the heatmap in the paper.
If you'd like to generate the plots, please follow the plot.ipynb notebook.
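
For a quick look at the files outside the notebooks, the following is a minimal sketch, assuming the .pk files are standard Python pickles and that they were downloaded into a local folder named data (both are assumptions, as are the helper names load_pickle and describe; the notebooks remain the reference for how the data is actually structured). It only reports the type and size of each loaded object and makes no claim about the schema.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load a single pickled object from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Summarize an object's type and length without assuming its schema."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return f"{type(obj).__name__}, len={size}"


if __name__ == "__main__":
    # Assumption: the Google Drive files were saved into ./data
    data_dir = Path("data")
    for name in ("sample_data.pk", "all_summaries.pk", "data_for_plot.pk"):
        path = data_dir / name
        if path.exists():
            print(name, "->", describe(load_pickle(path)))
        else:
            print(name, "not found in", data_dir)
```

Unpickling can execute arbitrary code, so only load files obtained from the project's own folder. If the files store objects from third-party libraries, those libraries must be installed before loading.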
