SWIMS: Semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis

Cited by: 36
Authors
Khan, Farhan Hassan [1]
Qamar, Usman [1]
Bashir, Saba [1]
Affiliations
[1] National University of Sciences and Technology, College of Electrical and Mechanical Engineering, Department of Computer Engineering, Islamabad, Pakistan
Keywords
Sentiment analysis; Natural Language Processing (NLP); Movie reviews; Cornell; Feature selection; Support Vector Machine; CLASSIFICATION; LEXICON; ALGORITHMS; EXTRACTION; FRAMEWORK
DOI
10.1016/j.knosys.2016.02.011
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis, also called opinion mining, is currently one of the most studied research fields. It aims to analyze the public's sentiments, opinions, and attitudes towards entities such as topics, products, individuals, organizations, or services. Sentiment classification can be achieved by machine learning methods, lexicon-based methods, or a combination of both. To improve the performance of domain-independent lexicons, this research combines machine learning with a lexicon-based approach in a new framework called SWIMS, which determines feature weights from a well-known general-purpose sentiment lexicon, SentiWordNet. A support vector machine is used to learn the feature weights, and an intelligent model selection approach is employed to enhance classification performance. Features are selected based on their subjectivity, and the effects of feature selection with respect to part-of-speech information are studied extensively. Seven benchmark datasets are used, including the Large Movie Review Dataset, the Multi-Domain Sentiment Dataset, and the Cornell movie review dataset, all of which are available online. An in-depth performance comparison is conducted with state-of-the-art machine learning approaches and lexicon-based methodologies. The evaluation shows that the proposed framework outperforms the other techniques for sentiment analysis. © 2016 Elsevier B.V. All rights reserved.
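The abstract's idea of selecting features by subjectivity and weighting them from a sentiment lexicon can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `TOY_LEXICON` holds made-up scores rather than real SentiWordNet values, the threshold of 0.5 is arbitrary, and the function names are hypothetical. The sketch covers only the lexicon-based weighting step; it does not reproduce the SVM training or the model selection described in the paper.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> (positive score, negative score).
# These values are invented for illustration; they are NOT SentiWordNet scores.
TOY_LEXICON = {
    "great": (0.75, 0.0),
    "boring": (0.0, 0.625),
    "plot": (0.0, 0.0),
    "fine": (0.25, 0.125),
}


def subjectivity(word):
    """Subjectivity taken here as positive + negative score (an assumption)."""
    pos, neg = TOY_LEXICON.get(word, (0.0, 0.0))
    return pos + neg


def select_features(tokens, threshold=0.5):
    """Keep only tokens whose subjectivity reaches the threshold."""
    return [t for t in tokens if subjectivity(t) >= threshold]


def weighted_vector(tokens, threshold=0.5):
    """Map each selected word to count * (positive - negative) polarity."""
    counts = Counter(select_features(tokens, threshold))
    vector = {}
    for word, count in counts.items():
        pos, neg = TOY_LEXICON[word]
        vector[word] = count * (pos - neg)
    return vector


def lexicon_score(tokens, threshold=0.5):
    """Sum the weighted features; a positive sum suggests positive sentiment."""
    return sum(weighted_vector(tokens, threshold).values())


if __name__ == "__main__":
    review = "great plot great boring fine".split()
    print(weighted_vector(review))
    print(lexicon_score(review))
```

In a fuller pipeline, vectors like these could serve as input features for a classifier such as a support vector machine, which is the role the abstract assigns to the learning step.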
Pages: 97-111
Page count: 15