Weakly supervised topic sentiment joint model with word embeddings

Cited by: 28
Authors
Fu, Xianghua [1]
Sun, Xudong [1]
Wu, Haiying [1]
Cui, Laizhong [1]
Huang, Joshua Zhexue [1]
Affiliation
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
Keywords
Sentiment analysis; Topic model; Topic sentiment joint model; Word embeddings
DOI
10.1016/j.knosys.2018.02.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
A topic sentiment joint model aims to detect topics and sentiment simultaneously in online reviews. Most existing topic sentiment modeling algorithms are based on the state-of-the-art latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA), which infer sentiment and topic distributions from word co-occurrence. These methods have been used successfully for topic and sentiment analysis. However, when the training corpus is small or the documents are short, the textual features become sparse, and the inferred sentiment and topic distributions may be unsatisfactory. In this paper, we propose a novel topic sentiment joint model called the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon simultaneously to improve topic identification and sentiment recognition. The main contributions of WS-TSWE are twofold. (1) Existing models generate words only from the sentiment-topic-to-word Dirichlet multinomial component, whereas WS-TSWE replaces it with a mixture of two components: a Dirichlet multinomial component and a word embeddings component. Because the word embeddings are trained on very large corpora and can extend the semantic information of words, they help to mitigate the textual sparsity problem. (2) Most previous models incorporate sentiment knowledge in the beta priors, which are usually set from a dictionary and rely completely on prior domain knowledge to identify positive and negative words. In contrast, WS-TSWE calculates the sentiment orientation of each word with the HowNet lexicon and automatically infers sentiment-based beta priors for sentiment analysis and opinion mining. Furthermore, we implement WS-TSWE with a Gibbs sampling algorithm. The experimental results on Chinese and English data sets show that WS-TSWE performs well in detecting sentiment and topics simultaneously. © 2018 Elsevier B.V. All rights reserved.
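The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so the mixture weight `lam`, the softmax form of the embedding component, and the rule that maps orientation scores to beta priors are all assumptions made for illustration, as are the function names and toy numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dirichlet_multinomial_prob(word_id, counts, beta):
    """Smoothed p(word | sentiment, topic) from word counts and beta priors."""
    return (counts[word_id] + beta[word_id]) / (sum(counts) + sum(beta))


def embedding_prob(word_id, topic_vector, word_vectors):
    """Assumed form: softmax over dot products of a sentiment-topic vector
    with pre-trained word vectors."""
    scores = [sum(t * w for t, w in zip(topic_vector, vec)) for vec in word_vectors]
    return softmax(scores)[word_id]


def mixture_word_prob(word_id, counts, beta, topic_vector, word_vectors, lam):
    """Mixture of the two components; lam (hypothetical) weights the embeddings."""
    dm = dirichlet_multinomial_prob(word_id, counts, beta)
    emb = embedding_prob(word_id, topic_vector, word_vectors)
    return (1.0 - lam) * dm + lam * emb


def sentiment_beta_priors(orientation, base_beta=0.01):
    """Hypothetical rule: map lexicon orientation scores in [-1, 1] to
    per-label beta priors, so positive words favor the positive label."""
    return {
        "positive": [base_beta * (1.0 + s) / 2.0 for s in orientation],
        "negative": [base_beta * (1.0 - s) / 2.0 for s in orientation],
    }


# Toy vocabulary of three words: one positive, one negative, one neutral.
orientation = [1.0, -1.0, 0.0]            # stand-in for lexicon-derived scores
priors = sentiment_beta_priors(orientation)
counts = [0, 0, 1]                        # sparse counts, as in short reviews
word_vectors = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
topic_vector = [0.8, 0.2]                 # vector for one (positive, topic) pair
probs = [
    mixture_word_prob(w, counts, priors["positive"], topic_vector, word_vectors, 0.6)
    for w in range(3)
]
print(probs)
```

In this toy setup the first word never occurs in the counts, yet it still receives probability mass under the positive sentiment-topic pair because its vector is close to the topic vector. This is the sense in which an embedding component can compensate for sparse co-occurrence statistics.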
Pages: 43-54
Number of pages: 12
Related papers
50 in total
  • [1] A weakly-supervised graph-based joint sentiment topic model for multi-topic sentiment analysis
    Zhou, Tao
    Law, Kris
    Creighton, Douglas
    INFORMATION SCIENCES, 2022, 609 : 1030 - 1051
  • [2] WEAKLY SUPERVISED SENTIMENT ANALYSIS USING JOINT SENTIMENT TOPIC DETECTION WITH BIGRAMS
    Pavitra, R.
    Kalaivaani, P. C. D.
    2015 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS), 2015 : 889 - 893
  • [3] Weakly Supervised Joint Sentiment-Topic Detection from Text
    Lin, Chenghua
    He, Yulan
    Everson, Richard
    Rueger, Stefan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (06) : 1134 - 1145
  • [4] Weakly Supervised Feature Compression Based Topic Model for Sentiment Classification
    Hu, Yan
    Xu, Xiaofei
    Li, Li
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2017): 10TH INTERNATIONAL CONFERENCE, KSEM 2017, MELBOURNE, VIC, AUSTRALIA, AUGUST 19-20, 2017, PROCEEDINGS, 2017, 10412 : 29 - 41
  • [5] Identifying Rare and Subtle Behaviors: A Weakly Supervised Joint Topic Model
    Hospedales, Timothy M.
    Li, Jian
    Gong, Shaogang
    Xiang, Tao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (12) : 2451 - 2464
  • [6] Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding
    Huang, Jiaxin
    Meng, Yu
    Guo, Fang
    Ji, Heng
    Han, Jiawei
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020 : 6989 - 6999
  • [7] Probabilistic Relational Supervised Topic Modelling using Word Embeddings
    Al-Ani, Jabir Alshehabi
    Fasli, Maria
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018 : 2035 - 2043
  • [8] Improving biterm topic model with word embeddings
    Huang, Jiajia
    Peng, Min
    Li, Pengwei
    Hu, Zhiwei
    Xu, Chao
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (06) : 3099 - 3124
  • [9] A Correlated Topic Model Using Word Embeddings
    Xun, Guangxu
    Li, Yaliang
    Zhao, Wayne Xin
    Gao, Jing
    Zhang, Aidong
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017 : 4207 - 4213