Text Mining-A Comparative Review of Twitter Sentiments Analysis

Authors
Patil S. [1]
Subil D. [1]
Nasar N. [1]
Kokatnoor S.A. [1]
Krishnan B. [1]
Kumar S. [1]
Affiliation
[1] Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Karnataka, 74, Bangalore
Keywords
airline sentiments; decision trees; Gaussian naive Bayes; Gini index; machine learning; multinomial naive Bayes; multinomial naive Bayes with bagging; opinion mining; random forest
DOI
10.2174/2666255816666230726140726
Abstract
Background: Text mining derives information and patterns from textual data. Online social media platforms, which have recently attracted great interest, generate vast amounts of text data about human behavior from user interactions. This data is generally ambiguous and unstructured, and it contains typing and grammatical errors that cause lexical, syntactic, and semantic uncertainty, which in turn leads to incorrect pattern detection and analysis. Researchers are employing various text mining techniques that can aid in topic modeling, the detection of trending topics, the identification of hate speech, and the study of community growth in online social media networks.
Objective: This review paper compares the performance of ten machine learning classification techniques on a Twitter data set for analyzing users' sentiments in posts related to airline usage.
Methods: Review and comparative analysis of Gaussian Naive Bayes, Random Forest, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, Adaptive Boosting (AdaBoost), Optimized AdaBoost, Support Vector Machine (SVM), Optimized SVM, Logistic Regression, and Long Short-Term Memory (LSTM) for sentiment analysis.
Results: The experimental study showed that Optimized SVM performed better than the other classifiers, with a training accuracy of 99.73% and a testing accuracy of 89.74%.
Conclusion: Optimized SVM uses the RBF kernel function and nonlinear hyperplanes to split the dataset into classes, correctly classifying the dataset into distinct polarities. This, together with feature engineering based on Forward Trigrams and Weighted TF-IDF, improved the train and test accuracy of the Optimized SVM classifier to 99.73% and 89.74%, respectively. Compared with Random Forest, this is a marginal improvement of 0.09% in train accuracy and 1.73% in test accuracy; compared with LSTM, the improvement is 1.29% in train accuracy and 3.63% in test accuracy. Likewise, Optimized SVM improved train accuracy by more than 10% over Gaussian Naive Bayes, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, and Logistic Regression, and a similar improvement was observed over the ensemble models AdaBoost and Optimized AdaBoost. Optimized SVM also outperformed all the other classification models in terms of AUC-ROC train and test scores. © 2024 Bentham Science Publishers.
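The two ingredients named in the conclusion, TF-IDF features over word n-grams up to trigrams and an RBF kernel, can be illustrated with a small standard-library sketch. This is not the authors' pipeline: the abstract does not define "Forward Trigrams" or "Weighted TF-IDF", so the code below assumes plain word 1- to 3-grams and a commonly used smoothed TF-IDF with L2 normalization. The function names (`word_ngrams`, `tfidf_vectors`, `rbf_kernel`), the `gamma` value, and the sample tweets are all hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=3):
    """Return all word n-grams of length 1..n_max (assumed feature set)."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(docs, n_max=3):
    """Build L2-normalized TF-IDF vectors over word n-grams.

    Uses the smoothed IDF ln((1 + N) / (1 + df)) + 1, an assumption here,
    not necessarily the paper's "Weighted TF-IDF".
    """
    counts = [Counter(word_ngrams(d, n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    df = {t: sum(1 for c in counts if t in c) for t in vocab}
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for c in counts:
        vec = [c[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    # Hypothetical airline tweets, for illustration only.
    tweets = [
        "the flight was delayed again",
        "the flight was on time",
        "great crew and friendly service",
    ]
    vocab, vecs = tfidf_vectors(tweets)
    print(len(vocab), "n-gram features")
    print("k(0, 1) =", rbf_kernel(vecs[0], vecs[1]))
    print("k(0, 2) =", rbf_kernel(vecs[0], vecs[2]))
```

Tweets that share n-grams (the first two) receive a higher kernel value than tweets with no overlap, which is the similarity signal an RBF-kernel SVM builds its nonlinear decision boundary on. In practice a library implementation with tuned `gamma` and regularization would replace this hand-rolled version.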
Pages: 21-37
Page count: 16