In the data folder, there are two types of datasets: BugFixNoDup_ and BugFixTokenPairs_commits. Can you explain the difference between them? If I want to re-run the pre-training experiment on the BugFix data, which dataset should I use?