Mechanistic Interpretability

Module Description

Course	Module Abbreviation	Credit Points
BA-2010[100%|75%]	CS-CL	6 LP
BA-2010[50%]	BS-CL	6 LP
BA-2010[25%]	BS-AC	4 LP
BA-2010	AS-CL	8 LP
Master	SS-CL-TAC	8 LP

Lecturer	Frederick Riemenschneider
Module Type	Proseminar / Hauptseminar
Language	English
First Session	17.10.2024
Time and Place	Thursdays, 13:15 - 14:45, INF 329 / SR 26
Commitment Period	tbd.

Prerequisites for Participation

Completion of Programming I and Introduction to Computational Linguistics or similar introductory courses
Programming II, Mathematical Foundations of Computational Linguistics and Statistics are strongly recommended

Assessment

Active participation
Presentation
Implementation project

Content

The emergence of very large language models has rendered many conventional approaches in NLP largely obsolete, such as devising tricks to gain a few percentage points on benchmarks. In many cases, it is no longer feasible to fine-tune models using traditional methods. As a result, the focus of NLP research has shifted.

One promising area of research is Mechanistic Interpretability, which aims to reverse-engineer the computational mechanisms and representations learned by neural networks into human-understandable algorithms and concepts. This approach seeks a granular, causal understanding of how these models operate, and thereby offers a way to advance our knowledge of language models.

This seminar will provide an overview of the key papers and ideas in the field of Mechanistic Interpretability, highlighting the progress made thus far and the potential future directions for research.
