Constructing a Domain Sentiment Lexicon Based on Chinese Social Media Text

Authors
Jiang C. [1, 2]
Guo Y. [1]
Liu Y. [1]
Affiliations
[1] School of Management, Hefei University of Technology, Hefei
[2] Key Laboratory of Process Optimization and Intelligent Decision-making of Ministry of Education, Hefei
Keywords
PMI; Sentiment Analysis; Sentiment Lexicon; Social Media; Word2Vec
DOI
10.11925/infotech.2096-3467.2018.0578
Abstract
[Objective] This study constructs a domain sentiment lexicon by discovering unrecognized sentiment words in user-generated content on Chinese social media, and applies it to sentiment analysis of automotive comments. [Methods] First, words from HowNet are selected as seeds, and PMI and Word2Vec are each used to calculate the sentiment polarity of candidate words on a real automotive corpus. Then, the two polarity judgments are combined according to ensemble rules. Finally, the effectiveness of the proposed method is verified through comparative sentiment classification experiments. [Results] The accuracy of the lexicon constructed with the proposed method is 21.6% higher than that of HowNet. For the lexicons constructed by PMI and by Word2Vec, the increases are 3.7% and 2.1%, respectively. Meanwhile, the numbers of positive and negative sentiment words are greatly increased. [Limitations] The corpus comes from a single source, which limits how well the results transfer to other domains. [Conclusions] The sentiment lexicon constructed by this method can be applied effectively to sentiment analysis of social media texts. © 2019 The Author(s).
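The seed-based polarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common SO-PMI formulation (PMI with positive seeds minus PMI with negative seeds) computed from document-level co-occurrence with additive smoothing, and it uses a hypothetical agreement rule in place of the paper's ensemble rules, which the abstract does not specify. The function names `so_pmi` and `combine`, the `smoothing` and `threshold` parameters, the toy corpus, and the `w2v_scores` dictionary (standing in for seed-similarity scores from a trained Word2Vec model) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Score every non-seed word by SO-PMI over tokenized documents.

    docs is a list of token lists. Additive smoothing (an assumption of
    this sketch) keeps the logarithm defined for unseen word pairs.
    """
    n_docs = len(docs)
    df = Counter()  # number of documents containing each word
    co = Counter()  # number of documents containing each unordered pair
    for tokens in docs:
        vocab = set(tokens)
        df.update(vocab)
        co.update(frozenset(pair) for pair in combinations(sorted(vocab), 2))

    def pmi(word, seed):
        joint = co[frozenset((word, seed))] + smoothing
        return math.log2(
            joint * n_docs / ((df[word] + smoothing) * (df[seed] + smoothing))
        )

    seeds = set(pos_seeds) | set(neg_seeds)
    return {
        word: sum(pmi(word, s) for s in pos_seeds)
        - sum(pmi(word, s) for s in neg_seeds)
        for word in df
        if word not in seeds
    }


def combine(pmi_scores, w2v_scores, threshold=0.0):
    """Hypothetical ensemble rule: keep a word only if both scores agree."""
    labels = {}
    for word in pmi_scores.keys() & w2v_scores.keys():
        a, b = pmi_scores[word], w2v_scores[word]
        if a > threshold and b > threshold:
            labels[word] = "positive"
        elif a < -threshold and b < -threshold:
            labels[word] = "negative"
    return labels


# Toy tokenized corpus (illustrative only, not the paper's data).
docs = [
    ["good", "fuel_saving", "satisfied"],
    ["good", "fuel_saving"],
    ["bad", "rattle", "disappointed"],
    ["bad", "rattle"],
    ["good", "comfortable"],
    ["bad", "jerky"],
]
pmi_scores = so_pmi(docs, pos_seeds=["good"], neg_seeds=["bad"])

# Stand-in for Word2Vec-based polarity scores (assumed values).
w2v_scores = {"fuel_saving": 0.4, "rattle": -0.3, "comfortable": -0.1}
labels = combine(pmi_scores, w2v_scores)
```

In this sketch a candidate enters the lexicon only when both scores point the same way, so "comfortable" is left out because the two assumed scores disagree. A stricter `threshold` would trade lexicon size for precision.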
Pages: 98-107
Number of pages: 9