Evaluating Multilingual Large Language Models Using Linguistic Variations In Multilingual Learners’ Writing: A Teacher Study
🚩 Accepted at ISLS 2025!
Kaycie Barron1 | Nora Tseng1 | Shamya Karumbaiah1 | Cynthia Baeza1
1University of Wisconsin-Madison
This paper investigates teachers' perceptions of linguistic variations in bi/multilingual learners' (MLs') writing to evaluate the (in)effectiveness of Multilingual Large Language Models (MLLMs), artificial intelligence (AI) models that generate text in multiple languages. Due to their inherent linguistic biases, these models often struggle to interpret MLs' linguistic variations. To address this gap, we elicit teacher feedback on prevalent linguistic variations in MLs' writing and assess how Meta Llama 3.1, a state-of-the-art MLLM, responds to these variations. Using translanguaging as a lens (the fluid use of multiple languages to convey meaning across social contexts), we propose a new approach to evaluating MLLMs in multilingual learning contexts. With the increasing prevalence of AI in K-12 classrooms, this paper advocates for the inclusion of bi/multilingual educators to better align the use of AI with progressive pedagogies such as translanguaging.
Synthetically generated variations (n=200) available in eval_datasets.zip
Original datasets from previous work
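A minimal sketch for inspecting the released archive, assuming `eval_datasets.zip` has been downloaded to the working directory (the member file names inside the archive are not specified here, so none are assumed):

```python
import zipfile

def list_dataset_members(path="eval_datasets.zip"):
    """Return the file names contained in the evaluation dataset archive."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()

# Example (assumes the archive is present locally):
# for name in list_dataset_members():
#     print(name)
```

Listing the members first, rather than hard-coding file names, avoids guessing at the archive's internal layout.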