Dimensionality Reduction for Sentiment Analysis using Pre-processing Techniques

Cited by: 0
Authors
Mhatre, Mayuri [1 ]
Phondekar, Dakshata [1 ]
Kadam, Pranali [1 ]
Chawathe, Anushka [1 ]
Ghag, Kranti [1 ]
Affiliation
[1] SAKEC, Information Technology Department, Bombay, Maharashtra, India
Keywords
Sentiment Analysis; Pre-processing; Slangs Handling; Stopwords Removal; Lemmatization;
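DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract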
Sentiment analysis is the study of people's opinions, sentiments, attitudes and emotions as expressed in written language, but in a business context the process is time-consuming, inconsistent and costly. Pre-processing the data helps to ease this difficulty. Pre-processing is the process of cleaning and preparing text for analysis. Existing pre-processing techniques include expressive lengthening handling, emoticon handling, HTML tag removal, punctuation handling, slang handling, stopword removal, stemming and lemmatization. In this paper, the effect of these techniques and their combinations was analyzed on the Kaggle dataset Bag of Words Meets Bags of Popcorn. Every possible combination of pre-processing techniques was evaluated, with the aim of finding the one giving the highest accuracy. A Random Forest classifier was used to predict sentiment, as it is known to give good accuracy, and the results were evaluated using 10-fold cross-validation. Accuracy increased from unprocessed to pre-processed data. It was concluded that using pre-processing techniques gives higher accuracy than the traditional approach, i.e. no pre-processing.
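The idea of trying every combination of pre-processing techniques can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four step functions, the tiny stopword list, the fixed step order and the names `STEPS`, `all_combinations` and `apply_steps` are all assumptions made for illustration, and only a subset of the techniques named in the abstract is covered.

```python
import re
from itertools import combinations

# Tiny illustrative stopword list (an assumption; the paper's list is not given).
STOPWORDS = {"a", "an", "the", "is", "was", "this", "and", "of", "to"}


def remove_html_tags(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def shorten_lengthening(text):
    """Collapse runs of 3+ identical characters to two ('sooooo' -> 'soo')."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)


def remove_punctuation(text):
    """Replace every non-word, non-whitespace character with a space."""
    return re.sub(r"[^\w\s]", " ", text)


def remove_stopwords(text):
    """Drop whitespace-separated tokens found in STOPWORDS (case-insensitive)."""
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


# Hypothetical step registry; the order in which steps run is an assumption.
STEPS = [
    ("html", remove_html_tags),
    ("lengthening", shorten_lengthening),
    ("punctuation", remove_punctuation),
    ("stopwords", remove_stopwords),
]


def all_combinations(steps):
    """Yield every subset of steps in listed order, including the empty baseline."""
    for r in range(len(steps) + 1):
        yield from combinations(steps, r)


def apply_steps(text, steps):
    """Run the chosen steps in order, then normalize whitespace."""
    for _name, fn in steps:
        text = fn(text)
    return " ".join(text.split())
```

Each combination yielded by `all_combinations(STEPS)` would then be applied to the review texts and scored. One plausible way to do that, again an assumption rather than the paper's stated setup, is to vectorize the cleaned text and pass a scikit-learn `RandomForestClassifier` to `cross_val_score` with `cv=10`, keeping the combination with the highest mean accuracy.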
Pages: 16-21
Number of pages: 6