Constructing a Domain Sentiment Lexicon Based on Chinese Social Media Text

Authors
Jiang C. [1, 2]
Guo Y. [1 ]
Liu Y. [1 ]
Affiliations
[1] School of Management, Hefei University of Technology, Hefei
[2] Key Laboratory of Process Optimization and Intelligent Decision-making of Ministry of Education, Hefei
Keywords
PMI; Sentiment Analysis; Sentiment Lexicon; Social Media; Word2Vec
DOI
10.11925/infotech.2096-3467.2018.0578
Abstract
[Objective] This study constructs a domain sentiment lexicon by discovering previously unrecognized sentiment words in user-generated content on Chinese social media, and applies it to sentiment analysis of automotive comments. [Methods] First, words from HowNet are selected as seeds, and the PMI and Word2Vec algorithms are each used to calculate the sentiment polarity of candidate words on a real automotive corpus. The two results are then combined according to ensemble rules. Finally, sentiment classification experiments are compared to evaluate the effectiveness of the proposed method. [Results] The accuracy of the lexicon constructed with the proposed method is 21.6% higher than that of HowNet. The corresponding increases for the lexicons built with PMI alone and with Word2Vec alone are 3.7% and 2.1%, respectively. The numbers of positive and negative sentiment words also increase substantially. [Limitations] The corpus comes from a single source, so the lexicon offers limited guidance for other domains. [Conclusions] The sentiment lexicon constructed by this method can be applied effectively to sentiment analysis of social media texts. © 2019 The Author(s).
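The pipeline described in [Methods] can be illustrated with a minimal sketch. The abstract does not state the seed words, the scoring formulas, or the ensemble rules, so everything below is an assumption: the PMI step uses a semantic-orientation style score (PMI with positive seeds minus PMI with negative seeds), the Word2Vec step is stood in for by cosine similarity over hypothetical toy vectors, and the ensemble rule simply requires the two signs to agree. The corpus, seeds, vectors, and function names are all invented for illustration and are not the authors' implementation.

```python
import math

def so_pmi(candidate, docs, pos_seeds, neg_seeds, smoothing=0.01):
    """PMI with positive seeds minus PMI with negative seeds (document co-occurrence)."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def df(*words):
        # Number of documents containing all the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    def pmi(a, b):
        # Additive smoothing keeps the logarithm defined for zero counts.
        joint = df(a, b) + smoothing
        return math.log2(joint * n / ((df(a) + smoothing) * (df(b) + smoothing)))

    return (sum(pmi(candidate, s) for s in pos_seeds)
            - sum(pmi(candidate, s) for s in neg_seeds))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def embedding_score(candidate, vectors, pos_seeds, neg_seeds):
    """Mean cosine similarity to positive seeds minus mean similarity to negative seeds."""
    pos = sum(cosine(vectors[candidate], vectors[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(vectors[candidate], vectors[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

def ensemble_polarity(pmi_score, emb_score):
    """Hypothetical rule: accept a polarity only when both scores agree in sign."""
    if pmi_score > 0 and emb_score > 0:
        return "positive"
    if pmi_score < 0 and emb_score < 0:
        return "negative"
    return "undecided"

# Toy tokenised "automotive comments" (invented data).
docs = [
    ["engine", "smooth", "good"],
    ["seat", "smooth", "good"],
    ["brake", "noisy", "bad"],
    ["cabin", "noisy", "bad"],
    ["engine", "good"],
    ["brake", "bad"],
]
pos_seeds, neg_seeds = ["good"], ["bad"]

# Toy 2-D vectors standing in for embeddings trained on the corpus (invented data).
vectors = {
    "good": [1.0, 0.1],
    "bad": [-1.0, 0.1],
    "smooth": [0.9, 0.3],
    "noisy": [-0.8, 0.4],
}

labels = {
    w: ensemble_polarity(
        so_pmi(w, docs, pos_seeds, neg_seeds),
        embedding_score(w, vectors, pos_seeds, neg_seeds),
    )
    for w in ["smooth", "noisy"]
}
print(labels)
```

In practice the vectors would come from a Word2Vec model trained on the segmented corpus, and the agreement rule could be replaced by thresholds or weighted voting; the abstract leaves these choices open.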
Pages: 98-107
Number of pages: 9