HeiST – Heidelberg Sentiment Treebank
A German dataset for Compositional Sentiment Analysis
HeiST originated in the M.A. project of Michael Haas (Weakly Supervised Learning for Compositional Sentiment Recognition) as a German counterpart to
the Stanford Sentiment Treebank, and was constructed in a similar fashion. The textual
basis of HeiST is a set of Creative Commons-licensed reviews from the German
movie review site Filmrezensionen.de,
from which we extracted the evaluation summary ("Fazit") sentences.
HeiST comprises 1184 trees in which every node carries a sentiment label.
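
To illustrate what a tree with a sentiment label on every node can look like in practice, here is a minimal Python sketch. It assumes a bracketed, one-tree-per-line notation in the style of the Stanford Sentiment Treebank, e.g. "(3 (2 Ein) (4 (3 guter) (2 Film)))"; the actual HeiST file format and label inventory are not described here, so the example sentence, the label values, and the names Node, parse_tree, iter_nodes and phrase are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    label: int                      # sentiment label attached to this node
    word: Optional[str] = None      # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


def parse_tree(text: str) -> Node:
    """Parse one bracketed tree (assumed SST-style notation)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse() -> Node:
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        node = Node(label=int(tokens[pos]))  # first item is the label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(parse())  # nested subtree
            else:
                node.word = tokens[pos]        # leaf token
                pos += 1
        pos += 1  # consume ")"
        return node

    root = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return root


def iter_nodes(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def phrase(node: Node) -> str:
    """Return the text span covered by a node."""
    if node.word is not None:
        return node.word
    return " ".join(phrase(child) for child in node.children)


# Invented example; the label values are for illustration only.
tree = parse_tree("(3 (2 Ein) (4 (3 guter) (2 Film)))")
for n in iter_nodes(tree):
    print(n.label, phrase(n))
```

Because every node carries its own label, the loop lists one (label, phrase) pair per subtree rather than a single label per sentence, which is what makes such a treebank usable for compositional sentiment models.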
The crowdsourcing of HeiST has been supported in part by the Institute of Computational Linguistics and by Yannick Versley's private funds.
The code for the experiments can be found in Michael Haas' GitHub project.
For additional background, see the following material:
- Michael Haas and Yannick Versley (2015) Subsentential Sentiment on a Shoestring: A Crosslingual Analysis of Compositional Classification. In Proceedings of NAACL-HLT 2015.
- Michael Haas (2015) Weakly Supervised Learning for Compositional Sentiment Recognition. M.A. Thesis, University of Heidelberg.