Methods for Learning without Annotated Data
Module Description
Course | Module Abbreviation | Credit Points |
---|---|---|
BA-2010[100%\|75%] | CS-CL | 6 LP |
BA-2010[50%] | BS-CL | 6 LP |
BA-2010 | AS-CL | 8 LP |
Master | SS-CL, SS-TAC | 8 LP |
Lecturer | Letitia Parcalabescu |
Module Type | |
Language | English |
First Session | 17.04.2024 |
Time and Place | Wednesday, 15:15–16:45, INF 326 / SR 27; Thursday, 15:15–16:45, INF 325 / SR 7 |
Commitment Period | tbd. |
Participants
All advanced Bachelor students and all Master students. Students from Computer Science, Mathematics, or Scientific Computing with Computational Linguistics as their application area (Anwendungsgebiet) are welcome.
Prerequisite for Participation
- good knowledge of statistical methods, incl. neural networks
- basic knowledge of linear algebra and calculus
Assessment
- Earning more than 70% of the exercise points (required for admission to the final exam)
- Passing the final exam
Description
Machine Learning algorithms (especially in Deep Learning) need large amounts of training data to perform well.
However, high-quality, manually annotated data is costly and sometimes impossible to collect.
This course presents a collection of methods for learning from data that lacks annotations.
The course will be organized as a 2h/week lecture and 2h/week tutorial session, where we will discuss general questions and homework assignments. Active participation in the exercises is mandatory for admission to the final exam.
Topics:
- Introduction: Tasks and Motivation
- Principal Component Analysis (PCA)
- Clustering (with outliers)
- Vanilla Autoencoders
- Variational Autoencoders (VAEs)
- Generative Adversarial Networks (GANs)
- (Autoregressive) Diffusion Models
- State Space Models (e.g., Mamba -- Selective SSM)
- Self-Supervised Learning (SSL) for text
- Interpretability in ML
- Graph Neural Networks (GNNs)
- Adversarial ML