Shown here is the language distribution of Teuken-7B-v0.4. Besides code, the pretraining data of Teuken-7B-v0.4 consists of approximately 50% non-English text from 23 European countries and around 40% English text.

[Figure: SprachverteilungTEUKEN.png]