Ruprecht-Karls-Universität Heidelberg

Word Sense Disambiguation

Course Description

Instructor: Simone Paolo Ponzetto
Course type: Advanced seminar (Hauptseminar)
Time and place: Tue, 09:15 - 10:45, INF 325 / SR 24 (SR)
Degree program: MA, Magister
Credits: MA: 8 LP (credit points)

Prerequisites

Participants must have passed the intermediate examination (Zwischenprüfung, Magister) and the programming exam (Programmierprüfung). Prior knowledge of statistical NLP or machine learning is an advantage.

Navigli (2009) must be read in full (see below).

Course Requirements

Active participation and regular submission of project work in small groups. Talk/presentation.

Content

Word Sense Disambiguation (WSD) is the problem of identifying the intended meaning (or sense) of a word, based on the context in which it occurs. Correctly identifying the senses of words in context is a central problem for Natural Language Processing (NLP), and robust performance on this task is accordingly expected to provide crucial lexical semantic information for many NLP applications such as machine translation, information retrieval, etc.

This seminar will provide a gentle introduction to state-of-the-art approaches in WSD. These include:

  • knowledge-based methods that make use of (a) dictionaries and thesauri and/or (b) manually crafted graph-like resources such as WordNet or GermaNet;
  • supervised machine learning methods that learn classifiers from sense-annotated data;
  • minimally supervised methods (aka bootstrapping) that, starting with a small amount of labeled data (seeds), iteratively harvest new sense annotations to improve sense disambiguation accuracy.
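
To make the first family of methods concrete, here is a minimal sketch of a dictionary-based disambiguator in the spirit of the simplified Lesk algorithm: it picks the sense whose gloss shares the most words with the context. The sense inventory, the stopword list, and the names `TOY_SENSES`, `tokenize` and `simplified_lesk` are assumptions made up for this illustration; they are not taken from WordNet, GermaNet, or the course materials.

```python
# Hypothetical toy sense inventory: glosses written for this sketch only,
# not copied from WordNet or any other real resource.
TOY_SENSES = {
    "bank": {
        "bank_financial": "an institution that accepts deposits of money and gives loans",
        "bank_river": "sloping land beside a river or lake",
    }
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to",
             "in", "on", "at", "by", "is", "was"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = {tok.strip(".,;:!?").lower() for tok in text.split()}
    return tokens - STOPWORDS - {""}


def simplified_lesk(word, context, inventory=TOY_SENSES):
    """Return the sense of `word` whose gloss overlaps most with `context`.

    Ties (including zero overlap everywhere) go to the sense listed first.
    """
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[word].items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "She sat on the bank of the river and watched the water."))
    print(simplified_lesk("bank", "He went to the bank to deposit money and ask about loans."))
```

With this toy inventory, the first call selects `bank_river` (the context shares "river" with that gloss) and the second selects `bank_financial` (it shares "money" and "loans"). Exact word matching is deliberately naive: "deposit" does not match "deposits", which hints at why real systems add lemmatization and richer resources.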

Students will present current work from the literature in short, seminar-format presentations (Referate). In addition, every 4-6 weeks they will form small groups of 3-4 people and work on a project, e.g. implementing and/or extending an existing state-of-the-art WSD approach. Each group is expected to submit a short report (2-4 pages) and to give a short project-overview presentation at the end of each round. Students are expected to *actively* participate in the class discussions during their fellow students' presentations, as well as in the seminar's projects. This means that you will have to read the papers before the class period in which they are presented and discussed, and be able to explain clearly to the audience what your own contribution to each project was.

Determination of final grade:

  33%: presentation
  33%: participation in the seminar's projects
  33%: participation in the class discussions

Course Overview

Seminar Schedule

Date | Session | Materials

» Further course materials

Literature

  • R. Navigli. Word Sense Disambiguation: A Survey. ACM Computing Surveys, 41(2), ACM Press, 2009, pp. 1-69. (IMPORTANT: must be read by the start of the semester!)
  • Eneko Agirre & Philip Edmonds (eds.). Word Sense Disambiguation: Algorithms and Applications. Springer, 2006. (used as a reference)
