Ruprecht-Karls-Universität Heidelberg

SR3de - Semantic Role Triple Dataset for German



SR3de is a portion of the German dataset of the CoNLL 2009 shared task 'Syntactic and Semantic Dependencies in Multiple Languages', annotated in parallel with the three major semantic role labeling frameworks:
  • PropBank-style (PB)
  • VerbNet-style (VN)
  • FrameNet-style (FN)


To produce the parallel SR3de corpus, you need:
  • the CoNLL 2009 ST corpus with its original PropBank annotation, from LDC
  • the corresponding SALSA 2.0 FrameNet-style annotations, from the homepage of the SALSA project
  • the VN-style annotations produced by the GNVN project, which we provide directly
We provide a script that takes your copies of the above resources as input and computes parallel files for the corresponding annotations in the SR3de corpus in CoNLL format, as an enrichment of the original LDC CoNLL representation.


Dataset part   Predicates   Predicate types   Argument role types (PB / VN / FN)
train          2,196        198               10 / 30 / 278
dev            250          121               6 / 23 / 145
test           520          152               8 / 25 / 165
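
Counts of this kind can be derived from a file in CoNLL 2009 format. The following is a minimal sketch, not the script distributed with SR3de. It assumes the standard tab-separated CoNLL 2009 column layout (PRED in the 14th column, APRED columns from the 15th onward, "_" for empty cells); whether the generated SR3de files follow exactly this layout is an assumption. The function names are hypothetical, and counting distinct PRED values is only one possible definition of "predicate types", so the results need not match the figures above.

```python
from collections import Counter

# 0-based column indices in the standard CoNLL 2009 layout (assumed here).
PRED_COL = 13          # PRED: predicate label of the token, or "_"
FIRST_APRED_COL = 14   # APRED1..APREDn: one argument column per predicate


def read_sentences(lines):
    """Yield sentences as lists of token rows (each row a list of columns)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def count_statistics(lines):
    """Count predicate instances, distinct PRED values, and distinct role labels."""
    predicate_counts = Counter()
    role_counts = Counter()
    for sentence in read_sentences(lines):
        for row in sentence:
            if len(row) > PRED_COL and row[PRED_COL] != "_":
                predicate_counts[row[PRED_COL]] += 1
            for label in row[FIRST_APRED_COL:]:
                if label != "_":
                    role_counts[label] += 1
    return {
        "predicates": sum(predicate_counts.values()),
        "predicate_types": len(predicate_counts),
        "role_types": len(role_counts),
    }
```

Because `count_statistics` accepts any iterable of lines, it can be given an open file handle for one of the generated files. Running it once per framework file would yield the three role-type counts separately.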

Information on the original corpora

Framework   Description of Annotation   Original Corpus
PropBank    CoNLL 2009 ST               LDC
VerbNet     GNVN project                GNVN data
FrameNet    SALSA project               SALSA 2.0 corpus
