Sentiment Classification Using Feature Selection Techniques for Text Data Composed of Heterogeneous Sources

Authors
Arya V. [1 ]
Agrawal R. [1 ]
Affiliations
[1] Manav Rachna International Institute of Research & Studies, Faridabad
Keywords
bag of words; feature selection; heterogeneous source; machine learning; sentiment classifier; TF-IDF; Word2Vec
DOI
10.2174/2666255813999200818133555
Abstract
Aims: This study analyzes feature selection techniques for sentiment classification of text data drawn from heterogeneous sources.
Objectives: To analyze feature selection techniques for text gathered from different sources in order to increase the accuracy of sentiment classification on microblogs.
Methods: Three feature selection techniques, Bag-of-Words (BOW), TF-IDF, and Word2Vec, were applied to find the most suitable technique for heterogeneous datasets.
Results: TF-IDF outperformed the other two techniques for sentiment classification with an SVM classifier.
Conclusion: Feature selection is an integral part of data preprocessing, and it is also important for machine learning algorithms to achieve good classification accuracy. Hence, it is essential to find the most suitable approach for heterogeneous sources of data. Heterogeneous sources are rich in information and also play an important role in developing models for adaptable systems. With that in mind, we compared the three techniques on heterogeneous source data and found that TF-IDF is the most suitable for all types of data, whether balanced or imbalanced, single-source or multiple-source. In all cases, TF-IDF was the most promising approach for classifying the sentiments of users.
© 2022 Bentham Science Publishers.
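The abstract names Bag-of-Words and TF-IDF but gives no formulas, so the following is only a minimal sketch of how the two representations differ, not the authors' implementation. It assumes whitespace tokenization and a smoothed IDF of ln((1 + N) / (1 + df)) + 1, which is one common variant; the function names `tokenize`, `bag_of_words`, and `tf_idf` are hypothetical. Word2Vec and the SVM classifier are left out because they need trained models or external libraries.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors): raw term counts per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors).

    TF is the term count divided by the document length.
    IDF uses an assumed smoothed form: ln((1 + N) / (1 + df)) + 1.
    """
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    weighted = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        weighted.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, weighted
```

On a toy corpus such as `["good phone good battery", "bad battery"]`, the word "battery" occurs in both documents, so its IDF is the minimum value of 1 and it is down-weighted relative to "good" or "bad", which occur in only one document each. Raw counts from `bag_of_words` make no such distinction.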
Pages: 207-214
Page count: 7