In this project we fine-tune XLM-RoBERTa with supervised learning to perform subjectivity detection in news articles across multiple languages, and we measure its performance relative to other models. The approach covers two settings: training on each language separately and training on all languages simultaneously.
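The fine-tuning step can be sketched as follows. This is a minimal illustration, not the repository's actual training code: to stay runnable offline it builds a tiny randomly initialised XLM-RoBERTa from a config and uses a toy batch of token IDs, where in practice one would load the pretrained weights and a tokenizer via `from_pretrained("xlm-roberta-base")`. The SUBJ=1 / OBJ=0 label convention is an assumption.

```python
import torch
from transformers import XLMRobertaConfig, XLMRobertaForSequenceClassification

# In practice: model = XLMRobertaForSequenceClassification.from_pretrained(
#     "xlm-roberta-base", num_labels=2)
# Here a tiny random config keeps the sketch self-contained and offline.
config = XLMRobertaConfig(
    vocab_size=250002, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = XLMRobertaForSequenceClassification(config)

# Toy batch standing in for tokenised sentences (labels: 1 = SUBJ, 0 = OBJ).
input_ids = torch.randint(0, config.vocab_size, (4, 16))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 1, 0, 1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(2):  # a couple of gradient steps for illustration
    out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
preds = logits.argmax(dim=-1)  # one SUBJ/OBJ prediction per sentence
```

The same loop serves both settings: the monolingual runs train on one language's dataframe at a time, while the multilingual run concatenates all languages before batching.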
The model training code can be found in Jupyter notebook files, each titled after its subtask. The baseline comparison is in the LANG_baseline files.
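For context, a typical lightweight baseline of the kind such comparison files contain is a TF-IDF plus logistic-regression classifier. The sketch below is a hypothetical stand-in, not the contents of the LANG_baseline notebooks, and the example sentences and SUBJ/OBJ labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy data: 1 = subjective sentence, 0 = objective sentence.
texts = [
    "I think this policy is a disaster.",
    "The bill passed with 300 votes in favour.",
    "What a wonderful day for the markets!",
    "The company reported its quarterly revenue.",
]
labels = [1, 0, 1, 0]

# TF-IDF features over word unigrams/bigrams feeding a linear classifier.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(texts, labels)
preds = baseline.predict(texts)
```

Such a model gives a per-language reference point against which the fine-tuned XLM-RoBERTa can be measured.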
Dependencies: pandas, scikit-learn, torch, transformers, SentencePiece, matplotlib