The annotation of the modal particles in the GeWiss corpus: A syntactic and semantic-pragmatic analysis of the PTKMA annotation

Cited by: 0
Author
Storo, Sven Robert [1]
Affiliation
[1] Tech Nat Wissensch Univ Norwegens, NTNU, Inst Sprache & Literatur, N-7491 Trondheim, Norway
Source
DEUTSCHE SPRACHE | 2022 / Vol. 50 / Issue 02
Keywords
Annotation; POS-Tagging; Modalpartikeln; PTKMA; Gesprochene Wissenschaftssprache; GeWiss-Korpus; Prüfungsgespräche
DOI
Not available
Chinese Library Classification number
H [Language, Writing]
Subject classification code
05
Abstract
This article examines the automatic annotation of eight German modal particles in two sub-corpora of the GeWiss corpus, which originate from the oral examinations of L1 and L2 examinees in a German academic context. Because automatic POS tagging of spoken language data is unreliable, the modal particle annotations were checked manually for correctness using lists of criteria. In addition, occurrences of ja, eben, halt, einfach, aber, mal, doch and denn that had been automatically annotated with a POS tag other than the modal particle tag were also checked for incorrect annotations: their modal particle properties were examined, and their uses as modal particles were annotated accordingly. The results show that the POS tagging system has a very high error rate of 19.2% in the automatic annotation of the above-mentioned modal particles, and that it annotates the individual particles with widely varying reliability, ranging from 100% incorrect to 100% correct. Checking the tokens of ja, eben, halt, einfach, aber, mal, doch and denn that were not tagged PTKMA (modal and modulating particles) for modal particle (MP) properties shows that several of them exhibit these properties.
Pages: 124-149
Page count: 26