Text Mining-A Comparative Review of Twitter Sentiments Analysis

Cited by: 0
Authors
Patil S. [1]
Subil D. [1]
Nasar N. [1]
Kokatnoor S.A. [1]
Krishnan B. [1]
Kumar S. [1]
Affiliation
[1] Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Karnataka, 74, Bangalore
Keywords
airline sentiments; decision trees; Gaussian naive Bayes; Gini index; machine learning; multinomial naive Bayes; multinomial naive Bayes with bagging; opinion mining; random forest
DOI
10.2174/2666255816666230726140726
Abstract
Background: Text mining derives information and patterns from textual data. Online social media platforms, which have recently attracted great interest, generate vast amounts of text data about human behavior through user interactions. This data is generally ambiguous and unstructured, and it contains typing and grammatical errors that cause lexical, syntactic, and semantic uncertainty, which in turn leads to incorrect pattern detection and analysis. Researchers are employing various text mining techniques that can aid in topic modeling, the detection of trending topics, the identification of hate speech, and the growth of communities in online social media networks.
Objective: This review paper compares the performance of ten machine learning classification techniques on a Twitter data set for analyzing users' sentiments in posts related to airline usage.
Methods: Review and comparative analysis of Gaussian Naive Bayes, Random Forest, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, Adaptive Boosting (AdaBoost), Optimized AdaBoost, Support Vector Machine (SVM), Optimized SVM, Logistic Regression, and Long Short-Term Memory (LSTM) for sentiment analysis.
Results: The experimental study showed that Optimized SVM performed better than the other classifiers, with a training accuracy of 99.73% and a testing accuracy of 89.74%.
Conclusion: Optimized SVM uses the RBF kernel function and nonlinear hyperplanes to split the dataset into classes, correctly classifying the dataset into distinct polarities. This, together with feature engineering using Forward Trigrams and Weighted TF-IDF, improved the train and test accuracy of the Optimized SVM classifier to 99.73% and 89.74%, respectively. Compared with Random Forest, Optimized SVM shows a marginal improvement of 0.09% in train accuracy and 1.73% in test accuracy; compared with LSTM, the improvement is 1.29% (train accuracy) and 3.63% (test accuracy). Likewise, Optimized SVM improved train accuracy by more than 10% over Gaussian Naive Bayes, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, and Logistic Regression, and a similar improvement was observed over the ensemble models AdaBoost and Optimized AdaBoost. Optimized SVM also outperformed all the other classification models in terms of AUC-ROC train and test scores. © 2024 Bentham Science Publishers.
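The feature pipeline named in the conclusion (trigram features, TF-IDF weighting, RBF kernel) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the abstract does not define "Forward Trigrams" or "Weighted TF-IDF", so the code below assumes plain consecutive word trigrams and a smoothed TF-IDF weight. The function names, the `gamma` value, and the example tweets are all hypothetical and are not taken from the paper's data set.

```python
import math
from collections import Counter


def word_trigrams(text):
    """Return consecutive three-word sequences of a lowercased text.

    Assumption: ordinary word trigrams stand in for the paper's
    "Forward Trigrams", which the abstract does not define.
    """
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def tfidf_vectors(docs):
    """Map each document to a sparse dict of trigram -> TF-IDF weight.

    Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1, stands in for
    the paper's "Weighted TF-IDF".
    """
    grams = [word_trigrams(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each trigram.
    df = Counter(g for doc in grams for g in set(doc))
    vectors = []
    for doc in grams:
        tf = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for very short texts
        vectors.append({
            g: (count / total) * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
            for g, count in tf.items()
        })
    return vectors


def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel exp(-gamma * ||u - v||^2) on sparse dict vectors."""
    keys = set(u) | set(v)
    sq_dist = sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


# Made-up example tweets, for illustration only.
tweets = [
    "the flight was delayed again today",
    "the flight was delayed for hours",
    "great crew and a smooth landing",
]
vecs = tfidf_vectors(tweets)
k_self = rbf_kernel(vecs[0], vecs[0])       # a tweet compared with itself
k_similar = rbf_kernel(vecs[0], vecs[1])    # two tweets sharing trigrams
k_different = rbf_kernel(vecs[0], vecs[2])  # two tweets with no shared trigrams
```

An SVM with an RBF kernel bases its decision function on kernel values like these, computed between an input and the support vectors. Tweets that share trigrams yield a kernel value closer to 1, which is what lets the classifier group them on the same side of a nonlinear decision boundary.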
Pages: 21-37
Page count: 16