SR3de - Semantic Role Triple Dataset for German
Annotation
Parallel portion of the CoNLL 2009 German data set for the shared task 'Syntactic and Semantic Dependencies in Multiple Languages', with parallel annotation for the three major semantic role labeling frameworks:
- PropBank-style (PB)
- VerbNet-style (VN)
- FrameNet-style (FN)
Format
To produce the parallel SR3de corpus, you need:
- from LDC, the CoNLL 2009 ST corpus and its original PropBank annotation
- from the homepage of the SALSA project, the corresponding SALSA 2.0 FrameNet-style annotations
We directly provide the VN-style annotations produced by the GNVN project:
- SR3de.zip, containing
  - the VerbNet-style semantic role annotations
  - the conversion script files
  - the readme.txt
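Once the pieces are combined, the predicate argument structures can be inspected with a few lines of Python. The following is only a minimal sketch: it assumes the files use the tab-separated column layout of the CoNLL 2009 shared task (FILLPRED in column 13, PRED in column 14, one APRED column per predicate from column 15 onward). Whether the output of the conversion scripts follows exactly this layout is an assumption, so check the readme.txt first. The function names and the toy sentence, including its labels, are invented for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# 0-based column indices of the CoNLL 2009 shared task format:
# ID FORM LEMMA PLEMMA POS PPOS FEAT PFEAT HEAD PHEAD DEPREL PDEPREL
# FILLPRED PRED APRED1 ... APREDn
FORM, FILLPRED, PRED, FIRST_APRED = 1, 12, 13, 14

Row = List[str]


def read_sentences(lines: Iterable[str]) -> Iterator[List[Row]]:
    """Yield each sentence as a list of token rows (blank lines separate sentences)."""
    sentence: List[Row] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def predicate_argument_structures(
    sentence: List[Row],
) -> List[Tuple[str, List[Tuple[str, str]]]]:
    """Return (predicate label, [(role, argument word form), ...]) per predicate."""
    predicates = [row for row in sentence if row[FILLPRED] == "Y"]
    structures = []
    for k, pred_row in enumerate(predicates):
        col = FIRST_APRED + k  # the k-th predicate owns the k-th APRED column
        args = [
            (row[col], row[FORM])
            for row in sentence
            if len(row) > col and row[col] != "_"
        ]
        structures.append((pred_row[PRED], args))
    return structures


# Toy sentence with invented labels, NOT taken from the corpus.
_rows = [
    ["1", "Peter", "Peter", "Peter", "NE", "NE", "_", "_", "2", "2", "SB", "SB", "_", "_", "A0"],
    ["2", "kauft", "kaufen", "kaufen", "VVFIN", "VVFIN", "_", "_", "0", "0", "ROOT", "ROOT", "Y", "kaufen.1", "_"],
    ["3", "Brot", "Brot", "Brot", "NN", "NN", "_", "_", "2", "2", "OA", "OA", "_", "_", "A1"],
]
SAMPLE = "\n".join("\t".join(r) for r in _rows) + "\n"

structures = [
    pas
    for sent in read_sentences(SAMPLE.splitlines())
    for pas in predicate_argument_structures(sent)
]
print(structures)
```

To read a real file, pass an open file handle to `read_sentences` instead of `SAMPLE.splitlines()`. The same loop would apply to the PB, VN, and FN versions, as long as each keeps the predicate and argument columns in these positions.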
Statistics
Dataset part | Predicate argument structures | Predicate types (lemma) | Role types (PB / VN / FN) |
---|---|---|---|
train | 2,196 | 198 | 10 / 30 / 278 |
dev | 250 | 121 | 6 / 23 / 145 |
test | 520 | 152 | 8 / 25 / 165 |
Information on the original corpora
Framework | Description of Annotation | Original Corpus |
---|---|---|
PropBank | CoNLL 2009 ST | LDC |
VerbNet | GNVN project | GNVN data |
FrameNet | SALSA project | SALSA 2.0 corpus |