Multilingual Language Models
Course Description
Degree program | Module code | Credits |
---|---|---|
BA-2010 | AS-CL, AS-FL | 8 LP |
BA-2010 [100%/75%] | CS-CL | 6 LP |
BA-2010 [50%] | BS-CL | 6 LP |
BA-2010 [25%] | BS-AC, BS-FL | 4 LP |
Master | SS-CL, SS-TAC, SS-FAL | 8 LP |
Lecturer(s) | Frederick Riemenschneider |
Course type | |
Language | English |
First session | 19.10.2023 |
Time and place | Thursdays, 13:15-14:45, INF 325 / SR 24 |
Commitment deadline | tbd. |
Prerequisites
Assessment
- Active participation
- Presentation
- Implementation project
Content
Transformer-based language models have by now been pre-trained on a wide variety of languages, from high-resource to low-resource ones. Despite this growing inclusivity, monolingual pre-training offers little benefit for low-resource languages because it requires enormous amounts of text, leaving a gap in natural language processing progress for these languages. To address this problem, numerous multilingual language models have been proposed, such as mBERT, XLM, and XLM-R. These models are trained in the hope that they acquire generalizable knowledge from high-resource languages that can then be transferred to low-resource languages.
However, multilingual pre-training introduces a new complication known as the "curse of multilinguality": the model's per-language capacity is diluted as the number of languages grows. This seminar takes an in-depth look at various multilingual models and their pre-training objectives. We will also discuss the challenges posed by the "curse of multilinguality", presenting analyses and potential approaches to lifting this curse.
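To give a flavor of the kind of models the seminar covers, the following minimal sketch (not part of the course materials) loads the publicly released xlm-roberta-base checkpoint with the Hugging Face transformers library and lets the same multilingual encoder fill a masked token in English and German; the model choice and example sentences are illustrative assumptions, not a prescribed setup.

```python
# Illustrative sketch only: one multilingual encoder (here XLM-R, used as
# an assumed example) handles masked-token prediction in several languages.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
model.eval()

sentences = [
    "The capital of France is <mask>.",           # English
    "Die Hauptstadt von Frankreich ist <mask>.",  # German
]

for text in sentences:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the <mask> token and take the model's top prediction for it.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    predicted_id = logits[0, mask_pos].argmax(-1).item()
    print(text, "->", tokenizer.decode([predicted_id]))
```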
Literature
Will be announced at the beginning of the semester.