Annotation of a Corpus of Tweets for Sentiment Analysis

Cited by: 1
Authors
dos Santos, Allisfrank [1]
Barros Junior, Jorge Daniel [1]
Camargo, Heloisa de Arruda [1]
Affiliation
[1] Fed Univ Sao Carlos UFSCar, Dept Comp Sci, Rodovia Washington Luis,Km 235,310-SP, BR-13565905 Sao Carlos, Brazil
Keywords
Annotation; Emotion; Tweets; Corpus
DOI
10.1007/978-3-319-99722-3_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article describes the creation and annotation of a corpus of tweets for sentence-level Sentiment Analysis. The tweets were captured under the #masterchefbr hashtag with a tool that acquires the public stream of tweets in real time, and were then annotated with the six basic emotions commonly used in the literature (joy, surprise, fear, sadness, disgust, anger). A neutral tag was adopted for sentences in which no emotion was expressed. At the end of the process, agreement between annotators, measured with the Kappa coefficient, reached 0.42. To better understand this Kappa value, the annotated corpus was submitted to classification experiments with the SVM (Support Vector Machine) algorithm. Classification reached an accuracy of 52.9% when both the discordant and the concordant texts in the corpus were used.
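The abstract reports a Kappa value without saying which variant was used or how many annotators labeled each sentence. As a rough illustration only, the sketch below computes Cohen's kappa for two annotators, which is an assumption and not necessarily the measure used in the paper. The function name `cohen_kappa` and the toy labels are hypothetical and do not come from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: exactly two annotators; the paper may have used a
    different Kappa variant (for example, one for several annotators).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same
    # label if each labels independently with their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical, not taken from the corpus).
annotator_1 = ["joy", "joy", "anger", "neutral"]
annotator_2 = ["joy", "anger", "anger", "neutral"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so with seven possible tags (six emotions plus neutral) a value such as 0.42 sits well above chance while still leaving many sentences on which annotators disagree.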
Pages: 294-302
Number of pages: 9