Text Mining-A Comparative Review of Twitter Sentiments Analysis

Authors
Patil S. [1]
Subil D. [1]
Nasar N. [1]
Kokatnoor S.A. [1]
Krishnan B. [1]
Kumar S. [1]
Affiliations
[1] Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Karnataka, 74, Bangalore
Keywords
airline sentiments; decision trees; Gaussian naive Bayes; Gini index; machine learning; multinomial naive Bayes; multinomial naive Bayes with bagging; opinion mining; random forest
DOI
10.2174/2666255816666230726140726
Abstract
Background: Text mining derives information and patterns from textual data. Online social media platforms, which have recently attracted great interest, generate vast amounts of text data about human behavior through user interactions. This data is generally ambiguous and unstructured, and it contains typing and grammatical errors that cause lexical, syntactic, and semantic uncertainty, which in turn leads to incorrect pattern detection and analysis. Researchers are employing various text mining techniques that can aid in topic modeling, the detection of trending topics, the identification of hate speech, and the growth of communities in online social media networks.
Objective: This review paper compares the performance of ten machine learning classification techniques on a Twitter data set for analyzing users' sentiments in posts related to airline usage.
Methods: Review and comparative analysis of Gaussian Naive Bayes, Random Forest, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, Adaptive Boosting (AdaBoost), Optimized AdaBoost, Support Vector Machine (SVM), Optimized SVM, Logistic Regression, and Long Short-Term Memory (LSTM) for sentiment analysis.
Results: In the experimental study, Optimized SVM performed better than the other classifiers, with a training accuracy of 99.73% and a testing accuracy of 89.74%.
Conclusion: Optimized SVM uses the RBF kernel function and nonlinear hyperplanes to split the dataset into classes of distinct polarity. This, together with feature engineering using Forward Trigrams and Weighted TF-IDF, improved the train and test accuracy of the Optimized SVM classifier to 99.73% and 89.74%, respectively. Compared with Random Forest, this is a marginal improvement of 0.09% in train accuracy and 1.73% in test accuracy; compared with LSTM, the improvement is 1.29% (train accuracy) and 3.63% (test accuracy). Likewise, Optimized SVM gave more than 10% better train accuracy than Gaussian Naive Bayes, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, and Logistic Regression, and a similar improvement was observed over the ensemble models AdaBoost and Optimized AdaBoost. Optimized SVM also outperformed all the other classification models in terms of AUC-ROC train and test scores.
© 2024 Bentham Science Publishers.
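The abstract names two building blocks, TF-IDF weighting over n-grams up to trigrams and the RBF kernel used by the SVM, without giving their details. The sketch below illustrates both in plain Python under stated assumptions: it uses one common smoothed TF-IDF variant (the paper's "Forward Trigrams" and "Weighted TF-IDF" are not specified here, so this is not the authors' pipeline), the function names and the three example tweets are hypothetical, and classifier training is omitted.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=3):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(docs, n_max=3):
    """Map each document to a sparse, L2-normalised TF-IDF dict over n-grams."""
    grams = [Counter(word_ngrams(d, n_max)) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each n-gram.
    doc_freq = Counter(g for counts in grams for g in counts)
    vectors = []
    for counts in grams:
        # Assumed variant: smoothed IDF = ln((1 + N) / (1 + df)) + 1.
        vec = {
            g: tf * (math.log((1 + n_docs) / (1 + doc_freq[g])) + 1.0)
            for g, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({g: w / norm for g, w in vec.items()})
    return vectors


def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel exp(-gamma * ||u - v||^2) for two sparse vectors."""
    keys = set(u) | set(v)
    sq_dist = sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


# Hypothetical example tweets, invented purely for illustration.
tweets = [
    "flight delayed again terrible service",
    "flight delayed again awful service",
    "great crew and smooth landing",
]
vecs = tfidf_vectors(tweets)
similar = rbf_kernel(vecs[0], vecs[1])    # tweets sharing many n-grams
different = rbf_kernel(vecs[0], vecs[2])  # tweets sharing no n-grams
```

In this toy setup the two complaint tweets share several unigrams, bigrams, and a trigram, so their kernel value is higher than that of the unrelated pair. An SVM with an RBF kernel builds its nonlinear decision boundary from exactly such pairwise similarities.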
Pages: 21-37
Page count: 16