Text sources in general #280

ximenina · 2023-02-07T17:57:09Z

Please list here any suggestions for corpus sources that might be useful for the TeDDi Sample.

ximenina · 2023-02-07T18:09:23Z

I list here some sources suggested by colleagues:

From Steve: https://odin.linguistlist.org/ (although the download function doesn't work at the moment; I emailed them asking for a fix)

Common Voice: hundreds of languages, many low-resourced

https://www.openslr.org/79/ (Kannada, critically endangered)

https://www.openslr.org/126/ (Kannada, critically endangered)

From Chris: Check the 1000Langs corpus of parallel Bible texts for overlap with our sample of languages:
