Estimating Sentiment via Probability and Information Theory

Cited by: 9
Authors
Labille, Kevin [1 ]
Alfarhood, Sultan [1 ]
Gauch, Susan [1 ]
Affiliations
[1] Univ Arkansas, Dept Comp Sci & Comp Engn, Fayetteville, AR 72701 USA
Keywords
Lexicons; Sentiment Analysis; Data Mining; Text Mining; Opinion Mining;
DOI
10.5220/0006072101210129
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Opinion detection and opinion analysis are challenging but important tasks. Such sentiment analysis can be done using traditional supervised learning methods, such as naive Bayes classification and support vector machines (SVM), or using unsupervised approaches based on a lexicon. Because lexicon-based sentiment analysis methods rely on an opinion dictionary, that is, a list of opinion-bearing or sentiment words, sentiment lexicons play a key role. Our work focuses on the task of generating such a lexicon. We propose several novel methods to automatically generate a general-purpose sentiment lexicon using a corpus-based approach. While most existing methods generate a lexicon using a list of seed sentiment words and a domain corpus, our work differs by generating a lexicon from scratch, applying probabilistic and information-theoretic text mining techniques to a large, diverse corpus. We conclude by presenting an ensemble method that combines the two approaches. We evaluate and demonstrate the effectiveness of our methods by using the automatically generated lexicons for sentiment analysis. When used for sentiment analysis, our best single lexicon achieves an accuracy of 87.60% and the ensemble approach achieves 88.75% accuracy, both statistically significant improvements over the 81.60% achieved with a widely used sentiment lexicon.
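The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a corpus-based lexicon could be built without seed words. It assumes a corpus of documents already labeled positive or negative, which the abstract does not state. The names `build_lexicon` and `classify`, the normalized probability difference, and the smoothed PMI difference are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

def build_lexicon(docs):
    """Hypothetical helper: docs is a list of (tokens, label) pairs,
    label being "pos" or "neg". Assumes at least one document per class."""
    pos_df, neg_df = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in docs:
        unique = set(tokens)  # document frequency: count each word once per doc
        if label == "pos":
            n_pos += 1
            pos_df.update(unique)
        else:
            n_neg += 1
            neg_df.update(unique)

    lexicon = {}
    for word in set(pos_df) | set(neg_df):
        cp, cn = pos_df[word], neg_df[word]
        # Probability-style score (assumed form): normalized difference of
        # P(word | pos) and P(word | neg), in the range [-1, 1].
        p_pos, p_neg = cp / n_pos, cn / n_neg
        prob_score = (p_pos - p_neg) / (p_pos + p_neg)
        # Information-theoretic score (assumed form):
        # PMI(word, pos) - PMI(word, neg) = log2(P(word|pos) / P(word|neg)),
        # with add-one smoothing to avoid log(0).
        pmi_score = math.log2(((cp + 1) / (n_pos + 2)) / ((cn + 1) / (n_neg + 2)))
        lexicon[word] = (prob_score, pmi_score)
    return lexicon

def classify(tokens, lexicon, which=0):
    """Sum the chosen score (0 = probability, 1 = PMI) over known words."""
    total = sum(lexicon[t][which] for t in tokens if t in lexicon)
    return "pos" if total >= 0 else "neg"

docs = [(["great", "movie"], "pos"), (["bad", "movie"], "neg")]
lexicon = build_lexicon(docs)
print(classify(["great", "movie"], lexicon))
```

In this sketch a word that appears equally often in both classes, such as "movie" in the toy corpus, receives a score of zero under both measures, so it does not affect classification.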
Pages: 121-129
Page count: 9
Related papers
50 in total
  • [1] Building a Restaurant-Specific Sentiment Lexicon via Probability Theory
    de Melo, Tiago
    PROCEEDINGS OF THE 27TH BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA '21), 2021, : 129 - 132
  • [2] LOGICAL BASIS FOR INFORMATION THEORY AND PROBABILITY THEORY
    KOLMOGOROV, AN
    IEEE TRANSACTIONS ON INFORMATION THEORY, 1968, 14 (05) : 662 - +
  • [3] Sentiment Classification via Supplementary Information Modeling
    Xu, Zenan
    Fu, Yetao
    Chen, Xingming
    Rao, Yanghui
    Xie, Haoran
    Wang, Fu Lee
    Peng, Yang
    WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 54 - 62
  • [4] PROBABILITY AND INFORMATION THEORY WITH APPLICATIONS TO RADAR
    Unknown
    POST OFFICE ELECTRICAL ENGINEERS JOURNAL, 1965, 58 : 139 - &
  • [5] INFORMATION THEORY AND INVERSE PROBABILITY IN TELECOMMUNICATION
    WOODWARD, PM
    DAVIES, IL
    PROCEEDINGS OF THE INSTITUTION OF ELECTRICAL ENGINEERS-LONDON, 1952, 99 (58): : 37 - 44
  • [6] Algorithmic Information Theory and Foundations of Probability
    Shen, Alexander
    REACHABILITY PROBLEMS, PROCEEDINGS, 2009, 5797 : 26 - 34
  • [7] PROBABILITY AND INFORMATION THEORY WITH APPLICATIONS TO RADAR
    Unknown
    MARCONI REVIEW, 1966, 29 (160): : 56 - &
  • [8] PROBABILITY AND INFORMATION THEORY WITH APPLICATIONS TO RADAR
    MARKO, H
    ARCHIV DER ELEKTRISCHEN UND UBERTRAGUNG, 1965, 19 (08): : 410 - &
  • [9] Information theory and complexity in probability and statistics
    Topsoe, F
    SOFT METHODOLOGY AND RANDOM INFORMATION SYSTEMS, 2004, : 363 - 370
  • [10] Estimating Power for FPGAs Based on Signal Probability Theory
    Jun-Shi Wang
    Le-Tian Huang
    Hui Dong
    Terrence Mak
    Journal of Electronic Science and Technology, 2012, (04) : 302 - 308