A semiautomatic annotation approach for sentiment analysis

Cited by: 10
Authors
Alahmary, Rahma [1, 2]
Al-Dossari, Hmood [1]
Affiliations
[1] King Saud Univ, Informat Syst Dept, POB 145111, Riyadh 4545, Saudi Arabia
[2] Al Imam Mohammad Ibn Saud Islamic Univ, Informat Syst Dept, Riyadh, Saudi Arabia
Keywords
Annotation; deep learning; machine learning; Saudi dialect; sentiment analysis; OPINION
DOI
10.1177/01655515211006594
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Sentiment analysis (SA) aims to extract users' opinions automatically from their posts and comments. Almost all prior work has used machine learning algorithms, and recent SA research has shown promising performance with deep learning. However, deep learning is data-hungry: it requires large datasets to learn, which increases the time needed for data annotation. In this research, we propose a semiautomatic approach that uses Naive Bayes (NB) to annotate a new dataset, reducing the human effort and time spent on the annotation process. We created a dataset of Saudi dialect tweets to train and test the NB classifier, which achieved an accuracy of 83%. The trained semiautomatic model was then used to annotate the new dataset, which in turn was used to train and test deep learning classifiers for Saudi dialect SA. Three deep learning classifiers were tested: convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM), with support vector machine (SVM) as the baseline for comparison. Overall, the deep learning classifiers outperformed SVM. CNN achieved the highest performance, followed by Bi-LSTM, then LSTM, then SVM. The proposed semiautomatic annotation approach is usable and shows promise for speeding up annotation and saving time and effort.
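The annotation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small multinomial Naive Bayes written in plain Python, trained on a tiny English placeholder seed set rather than Saudi dialect tweets. The names `NaiveBayesAnnotator` and `semi_annotate`, the whitespace tokenizer, and the confidence threshold that routes uncertain items to manual review are all assumptions made for illustration; the abstract does not say how the human and automatic parts are divided.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace tokenization; real tweets need more cleaning.
    return text.lower().split()


class NaiveBayesAnnotator:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_proba(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            log_scores[label] = score
        # Normalize log scores into probabilities in a numerically stable way.
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exp.values())
        return {label: e / norm for label, e in exp.items()}


def semi_annotate(model, texts, threshold=0.7):
    """Auto-label confident predictions; queue the rest for a human.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    auto_labeled, needs_review = [], []
    for text in texts:
        probs = model.predict_proba(text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            auto_labeled.append((text, label))
        else:
            needs_review.append(text)
    return auto_labeled, needs_review


# Placeholder seed set standing in for the manually annotated tweets.
seed_texts = [
    "great service love it",
    "love this great phone",
    "terrible service hate it",
    "hate this awful phone",
]
seed_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesAnnotator().fit(seed_texts, seed_labels)
auto_labeled, needs_review = semi_annotate(
    model,
    ["love the great camera", "awful terrible battery", "the phone arrived"],
)
```

In this sketch, `auto_labeled` would be the machine-annotated data passed on to the downstream classifiers (CNN, LSTM, Bi-LSTM in the paper), while `needs_review` holds the items a human annotator would still have to label.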
Pages: 398-410
Page count: 13