
The Mystery of In-Context Learning of Large Language Models
Module Description
| Course | Module Abbreviation | Credit Points |
| --- | --- | --- |
| Bachelor CL | AS-CL | 8 LP |
| Master CL | SS-CL-TAC | 8 LP |
| Seminar Informatik | BA + MA | 4 LP |
| Anwendungsgebiet Informatik | MA | 8 LP |
| Anwendungsgebiet SciComp | MA | 8 LP |

| Lecturer | Stefan Riezler |
| Module Type | Seminar |
| Language | English |
| First Session | 15.04.2025 |
| Time and Place | Tuesday, 11:15 - 12:45, Mathematikon SR10 |
| Commitment Period | tbd. |
Participants
Advanced Bachelor students and all Master students. Students from Computer Science or Scientific Computing, especially those with the application area Computational Linguistics, are welcome.
Prerequisite for Participation
Good knowledge of statistical machine learning and experience in experimental work.
Assessment
- 20%: Regular and active participation (discussion of presented papers during seminar sessions)
- 60%: Oral presentation (30 min presentation + 15 min discussion; commit to a presentation by April 22, 2025, by emailing 3 ranked preferences for presentation slots)
- 20%: Implementation project and written report (required for 8 LP) or written term paper (required for 4 LP) (5 pages, accompanied by a signed declaration of independent authorship; deadline: end of semester)
Content
Large language models (LLMs) have initiated a paradigm shift in machine learning: in contrast to the classic pretraining-then-finetuning paradigm, using an LLM for a downstream prediction task only requires providing a few demonstrations, known as in-context examples, without updating any model parameters. This in-context learning (ICL) capability of LLMs is intriguing and not yet fully understood.
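To make the setting concrete, here is a minimal sketch of few-shot prompting for a sentiment classification task: all task-specific information enters through the prompt, and no parameters are updated. It assumes the Hugging Face transformers library; the model choice (gpt2) and the task are illustrative only, and a model this small will exhibit only weak ICL behavior.

```python
# Minimal in-context learning sketch, assuming the Hugging Face
# "transformers" library; model and task are illustrative choices only.
from transformers import pipeline

# A few labeled demonstrations: the "in-context examples".
demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regretted buying a ticket.", "negative"),
    ("A gripping and well-acted drama.", "positive"),
]
query = "The plot dragged and the ending made no sense."

# The prompt is the only place where task information enters the model;
# no gradient update or parameter change is involved.
prompt = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in demonstrations)
prompt += f"\nReview: {query}\nSentiment:"

generator = pipeline("text-generation", model="gpt2")
print(generator(prompt, max_new_tokens=2)[0]["generated_text"])
```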
In this seminar, we will discuss several theoretical and empirical approaches to explaining this phenomenon. Depending on the required credit points, students will present and critically discuss the papers, and carry out implementation projects that investigate how prompting parameters such as the ordering, similarity, or structure of in-context examples influence prediction performance; a sketch of such a project follows below.
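As one hedged illustration of what such a project might look like, the skeleton below measures sensitivity to the ordering of in-context examples by evaluating every permutation of a small demonstration set. The names `run_model`, `demos`, and `eval_set` are hypothetical placeholders, not a prescribed API.

```python
# Hypothetical project skeleton: quantify how the ordering of in-context
# examples affects prediction accuracy. All names are placeholders.
import itertools

def build_prompt(demos, query):
    """Concatenate labeled demonstrations and the test query into one prompt."""
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

def accuracy(run_model, demos, eval_set):
    """Fraction of eval examples whose predicted label matches the gold label."""
    hits = sum(run_model(build_prompt(demos, x)).strip() == y
               for x, y in eval_set)
    return hits / len(eval_set)

def ordering_sensitivity(run_model, demos, eval_set):
    """Evaluate every permutation of the demonstrations; the gap between the
    best and worst ordering quantifies how much example order matters."""
    scores = [accuracy(run_model, list(perm), eval_set)
              for perm in itertools.permutations(demos)]
    return min(scores), max(scores)
```

With, say, four demonstrations there are already 24 orderings, so an exhaustive sweep like this is only feasible for small demonstration sets; sampling random permutations is a natural fallback for larger ones.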