Shown here is the language distribution of Teuken-7B-v0.4. In addition to code, the pretraining data of Teuken-7B-v0.4 consists of approximately 50% non-English text from 23 European countries and around 40% English text.