Mechanistic Interpretability
Module Description
| Course | Module Abbreviation | Credit Points |
|---|---|---|
| BA-2010[100%\|75%] | CS-CL | 6 LP |
| BA-2010[50%] | BS-CL | 6 LP |
| BA-2010[25%] | BS-AC | 4 LP |
| BA-2010 | AS-CL | 8 LP |
| Master | SS-CL-TAC | 8 LP |
Lecturer | Frederick Riemenschneider |
Module Type | Proseminar / Hauptseminar |
Language | English |
First Session | 17.10.2024 |
Time and Place | Thursdays, 13:15 - 14:45, INF 329 / SR 26 |
Commitment Period | tbd. |
Prerequisites for Participation
- Completion of Programming I and Introduction to Computational Linguistics, or similar introductory courses
- Programming II, Mathematical Foundations of Computational Linguistics, and Statistics are strongly recommended
Assessment
- Active participation
- Presentation
- Implementation project
Content
The emergence of very large language models has rendered many conventional NLP approaches obsolete, such as devising tricks to gain a few percentage points on benchmarks. In many cases, it is no longer feasible to fine-tune these models with traditional methods. This shift has changed the focus of research in NLP.
One promising area of research is Mechanistic Interpretability, which aims to reverse-engineer the computational mechanisms and representations learned by neural networks into human-understandable algorithms and concepts. This approach provides a granular, causal understanding of how these models operate. By striving for a genuine understanding of the underlying mechanisms, Mechanistic Interpretability offers an exciting avenue for advancing our knowledge of language models.
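One causal technique in this spirit is activation patching: copy an internal activation from a "clean" run into a "corrupted" run and check whether the model's output is restored, which indicates that the patched component causally mediates the behavior. The following is a minimal sketch on a toy two-layer numpy network; the weights, inputs, and network are purely illustrative and not taken from any real model.

```python
import numpy as np

# Toy 2-layer network for illustrating activation patching.
# All weights and inputs are random/illustrative.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # input -> hidden weights
W2 = rng.normal(size=(8, 2))   # hidden -> output weights

def forward(x, patch_hidden=None):
    """Run the toy network; optionally overwrite the hidden activations."""
    h = np.tanh(x @ W1)
    if patch_hidden is not None:
        h = patch_hidden           # causal intervention on the hidden state
    return h @ W2, h

x_clean = np.ones(4)               # "clean" input
x_corrupt = -np.ones(4)            # "corrupted" input

out_clean, h_clean = forward(x_clean)
out_corrupt, _ = forward(x_corrupt)

# Patch the clean hidden activations into the corrupted run:
# if the output moves back toward the clean output, the hidden
# layer causally mediates the behavior under study.
out_patched, _ = forward(x_corrupt, patch_hidden=h_clean)

print("output restored:", np.allclose(out_patched, out_clean))
```

In this degenerate toy case, patching the entire hidden layer trivially restores the clean output; in practice, interventions target individual components (e.g. single attention heads or neurons) to localize which ones matter.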
This seminar will provide an overview of the key papers and ideas in the field of Mechanistic Interpretability, highlighting the progress made thus far and the potential future directions for research.