An Annotated Corpus for Turkish Sentiment Analysis at Sentence Level

Cited by: 3
Authors
Omurca, Sevinc Ilhan [1]
Ekinci, Ekin
Turkmen, Hazal
Affiliations
[1] Fac Engn, Dept Comp Engn, TR-41380 Kocaeli, Turkey
Keywords
Aspect based sentiment analysis; Turkish Language; text mining; morphological analysis; annotation; JSON data
DOI
10.1109/IDAP.2017.8090212
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid growth of unstructured data accessible via the web, managing these data and finding undiscovered information in huge datasets have become necessary tasks. Consequently, text mining, which can be defined as gleaning important information from natural language text, has emerged. In this study, to facilitate information management for aspect-based sentiment analysis studies, a Turkish sentiment corpus composed of user reviews and annotated semi-automatically is constructed. For each word in the corpus, the root form, the usage (aspect/multiaspect/seedsentiment/absent), the part-of-speech (POS) tag, and the polarity are defined. The underlying Turkish hotel review dataset, which contains 1000 reviews and 5364 sentences, was crawled from a web source. The system takes the reviews together with aspect and seedsentiment lists and returns JSON data structures of the annotated corpus. This paper both provides a ready-to-use dataset for developing aspect-based sentiment analysis applications and makes the dataset easy to use from Java applications by delivering it as JSON data.
Pages: 5
Related papers
50 in total
  • [1] A Hungarian Sentiment Corpus Manually Annotated at Aspect Level
    Szabo, Martina Katalin
    Vincze, Veronika
    Simko, Katalin
    Varga, Viktor
    Hangya, Viktor
    [J]. LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016: 2873 - 2878
  • [2] Annotated Corpus for Sentiment Analysis in Odia Language
    Mohanty, Gaurav
    Mishra, Pruthwik
    Mamidi, Radhika
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020: 2788 - 2795
  • [3] A Multilayer Annotated Corpus for Turkish
    Yildiz, Olcay Taner
    Ak, Koray
    Ercan, Gokhan
    Topsakal, Ozan
    Asmazoglu, Cengiz
    [J]. 2018 2ND INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE AND SPEECH PROCESSING (ICNLSP), 2018: 21 - 26
  • [4] Sentence-Level Sentiment Analysis in Persian
    Basiri, Mohammad Ehsan
    Kabiri, Arman
    [J]. 2017 3RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION AND IMAGE ANALYSIS (IPRIA), 2017: 84 - 89
  • [5] Sentence Sentiment Analysis Based on International Chinese Education Dynamic Corpus
    Wang, Jing
    Yang, Li-jiao
    Jiang, Hong-fei
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND SOFTWARE ENGINEERING (AISE 2014), 2014: 626 - 630
  • [6] AlgBERT: Automatic Construction of Annotated Corpus for Sentiment Analysis in Algerian Dialect
    Hamadouche, Khaoula
    Bousmaha, Kheira Zineb
    Bekkoucha, Mohamed Abdelwaret
    Hadrich-Belguith, Lamia
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (12)
  • [7] A hybrid approach to the sentiment analysis problem at the sentence level
    Appel, Orestes
    Chiclana, Francisco
    Carter, Jenny
    Fujita, Hamido
    [J]. KNOWLEDGE-BASED SYSTEMS, 2016, 108 : 110 - 124
  • [8] Sentiment Analysis of Tweets at Sentence Level Using Hadoop
    Paul, Yazala Ritika Siril
    Borikar, Dilipkumar A.
    [J]. HELIX, 2018, 8 (05): 3797 - 3801
  • [9] Sentence-Level Sentiment Analysis in the Presence of Modalities
    Liu, Yang
    Yu, Xiaohui
    Liu, Bing
    Chen, Zhongshuai
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, CICLING 2014, PART II, 2014, 8404 : 1 - 16
  • [10] Constructing Corpus of Scientific Abstracts Annotated with Sentence Roles
    Yamamoto, Takafumi
    Tomiura, Yoichi
    [J]. PROCEEDINGS 2016 5TH IIAI INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS IIAI-AAI 2016, 2016: 159 - 162