Comparison Research on Text Pre-processing Methods on Twitter Sentiment Analysis

Cited by: 151
Authors
Zhao Jianqiang [1, 2, 3]
Gui Xiaolin [1, 3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Peoples R China
[2] Xian Polit Inst, Xian 710068, Shaanxi Provinc, Peoples R China
[3] Key Lab Comp Network Shaanxi Prov, Xian 710049, Peoples R China
Source
IEEE ACCESS | 2017 / Vol. 5
Keywords
Twitter; sentiment analysis; text pre-processing
DOI
10.1109/ACCESS.2017.2672677
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter sentiment analysis offers organizations the ability to monitor public feeling towards products and events related to them in real time. The first step of sentiment analysis is the text pre-processing of Twitter data. Most existing research on Twitter sentiment analysis focuses on the extraction of new sentiment features, while the selection of the pre-processing method is ignored. This paper discusses the effects of text pre-processing methods on sentiment classification performance in two types of classification tasks, and summarizes the classification performance of six pre-processing methods using two feature models and four classifiers on five Twitter datasets. The experiments show that the accuracy and F1-measure of the Twitter sentiment classifier improve when using the pre-processing methods of expanding acronyms and replacing negation, but barely change when removing URLs, numbers, or stop words. The Naive Bayes and Random Forest classifiers are more sensitive than the Logistic Regression and support vector machine classifiers when the various pre-processing methods are applied.
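The pre-processing steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `preprocess`, the lookup tables `ACRONYMS`, `NEGATIONS`, and `STOP_WORDS`, and the regular expressions are all assumptions made for illustration, and a real study would use much larger dictionaries.

```python
import re

# Hypothetical lookup tables, for illustration only; the record does not
# give the acronym dictionary, negation list, or stop-word list used.
ACRONYMS = {"gr8": "great", "lol": "laughing out loud", "thx": "thanks"}
NEGATIONS = {"won't": "will not", "can't": "cannot",
             "don't": "do not", "isn't": "is not"}
STOP_WORDS = {"a", "an", "the", "is", "to", "of"}

# Assumed patterns: a URL starts with http(s):// or www., and a number is a
# standalone run of digits (so the "8" inside "gr8" is left untouched).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")


def preprocess(tweet, remove_urls=True, remove_numbers=True,
               expand_acronyms=True, replace_negation=True,
               remove_stop_words=True):
    """Apply the selected pre-processing steps to one tweet."""
    text = tweet.lower()
    if remove_urls:
        text = URL_RE.sub(" ", text)
    if remove_numbers:
        text = NUMBER_RE.sub(" ", text)

    tokens = []
    for tok in text.split():
        if replace_negation and tok in NEGATIONS:
            tokens.extend(NEGATIONS[tok].split())
        elif expand_acronyms and tok in ACRONYMS:
            tokens.extend(ACRONYMS[tok].split())
        else:
            tokens.append(tok)

    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(tokens)


print(preprocess("The movie is gr8 but I don't like 3 ads http://t.co/x"))
# prints: movie great but i do not like ads
```

Each step has its own flag so that, in the spirit of the comparison described above, a single method can be switched on or off before the text is passed to a feature extractor and classifier.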
Pages: 2870-2879
Page count: 10
Related papers
50 in total
  • [1] Role of Text Pre-Processing in Twitter Sentiment Analysis
    Singh, Tajinder
    Kumari, Madhu
    [J]. TWELFTH INTERNATIONAL CONFERENCE ON COMMUNICATION NETWORKS, ICCN 2016 / TWELFTH INTERNATIONAL CONFERENCE ON DATA MINING AND WAREHOUSING, ICDMW 2016 / TWELFTH INTERNATIONAL CONFERENCE ON IMAGE AND SIGNAL PROCESSING, ICISP 2016, 2016, 89 : 549 - 554
  • [2] A Comparison of Pre-processing Techniques for Twitter Sentiment Analysis
    Effrosynidis, Dimitrios
    Symeonidis, Symeon
    Arampatzis, Avi
    [J]. RESEARCH AND ADVANCED TECHNOLOGY FOR DIGITAL LIBRARIES (TPDL 2017), 2017, 10450 : 394 - 406
  • [3] Pre-processing Boosting Twitter Sentiment Analysis?
    Zhao Jianqiang
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SMART CITY/SOCIALCOM/SUSTAINCOM (SMARTCITY), 2015: 748 - 753
  • [4] The Role of Pre-processing in Twitter Sentiment Analysis
    Bao, Yanwei
    Quan, Changqin
    Wang, Lijuan
    Ren, Fuji
    [J]. INTELLIGENT COMPUTING METHODOLOGIES, 2014, 8589 : 615 - 624
  • [5] The Role of Text Pre-processing in Sentiment Analysis
    Haddi, Emma
    Liu, Xiaohui
    Shi, Yong
    [J]. FIRST INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND QUANTITATIVE MANAGEMENT, 2013, 17 : 26 - 32
  • [6] Pre-processing Analysis for Chinese Text Sentiment Analysis
    Li, Ang
    Chen, Yunfang
    [J]. PROCEEDINGS OF 2017 2ND INTERNATIONAL CONFERENCE ON COMMUNICATION AND INFORMATION SYSTEMS (ICCIS 2017), 2015, : 318 - 323
  • [7] Evaluating the Effectiveness of Text Pre-Processing in Sentiment Analysis
    Palomino, Marco A.
    Aider, Farida
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [8] Pre-processing Framework for Twitter Sentiment Classification
    Dritsas, Elias
    Vonitsanos, Gerasimos
    Livieris, Ioannis E.
    Kanavo, Andreas
    Ilias, Aristidis
    Makris, Christos
    Tsakalidis, Athanasios
    [J]. ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS (AIAI 2019), 2019, 560 : 138 - 149
  • [9] A comparative evaluation of pre-processing techniques and their interactions for twitter sentiment analysis
    Symeonidis, Symeon
    Effrosynidis, Dimitrios
    Arampatzis, Avi
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2018, 110 : 298 - 310
  • [10] ANALYSIS OF DATA PRE-PROCESSING METHODS FOR SENTIMENT ANALYSIS OF REVIEWS
    Parlar, Tuba
    Ozel, Selma Ayse
    Song, Fei
    [J]. COMPUTER SCIENCE-AGH, 2019, 20 (01): 123 - 141