Natural Language Processing and Sentiment Analysis on Bangla Social Media Comments on Russia-Ukraine War Using Transformers

Cited by: 8
Authors
Hasan, Mahmud [1]
Islam, Labiba [1]
Jahan, Ismat [1]
Meem, Sabrina Mannan [1]
Rahman, Rashedur M. [1]
Affiliation
[1] North South Univ, Dept Elect & Comp Engn, Dhaka 1229, Bangladesh
Keywords
Natural language processing; sentiment analysis; transformers; Russia-Ukraine war
DOI
10.1142/S2196888823500021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bangla ranks seventh among the world's most spoken languages, with 265 million native and non-native speakers, and is the second most spoken Indo-Aryan language after Hindi. However, research on tasks such as sentiment analysis (SA) in Bangla has grown slowly compared to SA in English. This is because there are not enough high-quality, publicly available datasets for training language models on text classification tasks in Bangla. In this paper, we propose an annotated Bangla dataset for sentiment analysis on the ongoing Ukraine-Russia war. The dataset was developed by collecting Bangla comments from various videos of three prominent YouTube TV news channels of Bangladesh reporting on the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with three sentiment polarities, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on an unlabeled Bangla corpus. The models were fine-tuned using our collected dataset. Hyperparameter optimization was performed on all five transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT, and mBERT. Each model was evaluated and analyzed using several evaluation metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved the highest accuracy of 86% with an F1 score of 0.82. Based on accuracy, F1 score, and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.
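The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the abstract does not say how the F1 score was averaged or how AIC was computed for the classifiers, so macro-averaged F1 and the textbook definition AIC = 2k - 2 ln L are assumptions here, and every function name and the label strings are hypothetical.

```python
import math

# Label names follow the three classes in the abstract; the exact strings are assumed.
LABELS = ["Neutral", "Pro-Ukraine", "Pro-Russia"]


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumption: macro averaging)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def log_likelihood(y_true, probs):
    """Sum of log predicted probabilities assigned to the gold labels.

    `probs` is a list of dicts mapping each label to its predicted probability.
    """
    return sum(math.log(p[t]) for t, p in zip(y_true, probs))


def aic(log_lik, num_params):
    """Textbook Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_lik


if __name__ == "__main__":
    gold = ["Neutral", "Pro-Ukraine", "Pro-Russia", "Neutral"]
    pred = ["Neutral", "Pro-Ukraine", "Neutral", "Neutral"]
    print("accuracy:", accuracy(gold, pred))
    print("macro F1:", macro_f1(gold, pred))
```

For a model comparison like the one described, each fine-tuned classifier would be scored on the same held-out comments, and the one with higher accuracy and F1 and lower AIC would be preferred.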
Pages: 329-356
Page count: 28