Shown here is the language distribution of the Teuken-7B-v0.4 pretraining data. In addition to code, the data consists of approximately 50% non-English text from 23 European countries and around 40% English text.