An All-Words Sense Annotated Turkish Corpus

Authors
Akcakaya, Sinan [1]
Yildiz, Olcay Taner [1]
Affiliations
[1] Isik Univ, Dept Comp Engn, Istanbul, Turkey
Source
2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING (ICNLSP) | 2018
Keywords
Natural language processing; Turkish sense annotation; all-words corpus
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper reports our efforts in constructing a sense-labeled Turkish corpus with respect to the Turkish Language Institution's dictionary, using the traditional method of manual tagging. We tagged a pre-built parallel treebank that was translated from the Penn Treebank II corpus. This approach allowed us to generate a full-coverage resource in which syntactic and semantic information are merged. We also provide miscellaneous statistics about the corpus itself as well as its development process.
Pages: 15-20
Number of pages: 6
Related papers
  • [1] A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation
    Saeed, Ali
    Nawab, Rao Muhammad Adeel
    Stevenson, Mark
    Rayson, Paul
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)
  • [2] All-Words Word Sense Disambiguation for Turkish
    Acikgoz, Onur
    Gurkan, Ali Tunca
    Ertopcu, Burak
    Topsakal, Ozan
    Ozenc, Berke
    Kanburoglu, Ali Bugra
    Cam, Ilker
    Avar, Begum
    Ercan, Gokhan
    Yildiz, Olcay Taner
    2017 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2017, : 490 - 495
  • [3] An all-words sense tagging method for resource-deficient languages
    Yi, Bong-Jun
    Lee, Do-Gil
    Rim, Hae-Chang
    DIGITAL SCHOLARSHIP IN THE HUMANITIES, 2017, 32 (03) : 672 - 688
  • [4] LEXSEMTM: A Semantic Dataset Based on All-words Unsupervised Sense Distribution Learning
    Bennett, Andrew
    Baldwin, Timothy
    Lau, Jey Han
    McCarthy, Diana
    Bond, Francis
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1513 - 1524
  • [5] Spanish all-words semantic class disambiguation using Cast3LB corpus
    Izquierdo-Bevia, Ruben
    Moreno-Monteagudo, Lorenza
    Navarro, Borja
    Suarez, Armando
    MICAI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4293 : 879 - +
  • [6] All-words Word Sense Disambiguation for Russian Using Automatically Generated Text Collection
    Angelina, Bolshina
    Loukachevitch, Natalia
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2020, 20 (04) : 90 - 107
  • [7] A Multilayer Annotated Corpus for Turkish
    Yildiz, Olcay Taner
    Ak, Koray
    Ercan, Gokhan
    Topsakal, Ozan
    Asmazoglu, Cengiz
    2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING (ICNLSP), 2018, : 21 - 26
  • [8] Sense Annotated Hindi Corpus
    Singh, Satyendr
    Siddiqui, Tanveer J.
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2016, : 22 - 25
  • [9] Explorations in lexical sample and all-words lexical substitution
    Sinha, Ravi
    Mihalcea, Rada
    NATURAL LANGUAGE ENGINEERING, 2014, 20 (01) : 99 - 129
  • [10] Evaluation of all-words WSD for Chinese in machine translation
    Wang, Bo
    Yang, Mu-Yun
    Li, Sheng
    Zhao, Tie-Jun
    Zidonghua Xuebao/Acta Automatica Sinica, 2008, 34 (05) : 535 - 541