An All-Words Sense Annotated Turkish Corpus

Authors
Akcakaya, Sinan [1]
Yildiz, Olcay Taner [1]
Affiliations
[1] Isik Univ, Dept Comp Engn, Istanbul, Turkey
Source
2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING (ICNLSP) | 2018
Keywords
Natural language processing; Turkish sense annotation; all-words corpus;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper reports our efforts in constructing a sense-labeled Turkish corpus with respect to the Turkish Language Institution's dictionary, using the traditional method of manual tagging. We tagged a pre-built parallel treebank that was translated from the Penn Treebank II corpus. This approach allowed us to generate a full-coverage resource in which syntactic and semantic information are merged. We also provide miscellaneous statistics about the corpus itself as well as its development process.
Pages: 15-20
Number of pages: 6