Sentiment classification of skewed shoppers' reviews using machine learning techniques, examining the textual features

Cited by: 5
Author
Rezapour, Mahdi [1]
Affiliation
[1] Wyoming Technol Transfer Ctr, 1000 E Univ Ave,Dept 3295, Laramie, WY 82071 USA
Keywords
machine learning technique; opinion mining; polarity; opinion extraction; review classification; sentiment analysis; text classification; natural language processing; online reviews
DOI
10.1002/eng2.12280
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
With the rapid growth of online shopping, it has become crucial for product makers to analyze and handle a wealth of product reviews. However, the high volume of reviews, along with the wide variety of opinions they express, makes it hard for manufacturers to know exactly how to improve their products without an efficient approach. To this end, the results of sentiment classification help customers retrieve the information needed to choose an appropriate product, and help sellers collect customer feedback effectively in order to improve their products. Like most real-world problems, the shopping review data used in this study were imbalanced, being composed predominantly of positive reviews with only a small percentage of negative reviews. Machine learning (ML) algorithms do not perform well when data are imbalanced, as they tend to become biased toward the overrepresented category. The synthetic minority over-sampling technique (SMOTE) was used to address this class imbalance problem. In this study, three ML-based algorithms were employed: Naive Bayes (NB), support vector machine (SVM), and decision tree (DT). An extensive preprocessing procedure was applied to prepare the text datasets; details are discussed in the manuscript. The performance analysis indicated that the DT algorithm outperformed the other two methods. Because positive reviews account for the majority of the reviews, removing sparse words from the data removed almost all of the sentiment terms of the negative reviews. Hence, the model training process was performed on positive and negative reviews separately. Other steps considered to improve model performance included combining the review titles with their contents, a separate tokenization process, applying various N-grams, and retaining certain stop words (e.g., "not" or "but").
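The abstract names SMOTE as the remedy for class imbalance but does not describe how it was applied. The following is a minimal sketch of the general SMOTE idea only (interpolating between a minority sample and one of its nearest minority neighbours), not the paper's implementation. The function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors (assumed n-gram counts): 6 positive vs. 2 negative reviews.
positive = [[3, 0, 1], [2, 0, 0], [4, 1, 0], [3, 0, 2], [5, 0, 1], [2, 1, 1]]
negative = [[0, 3, 1], [1, 4, 0]]

# Oversample the negative class until both classes have the same size.
balanced_negative = negative + smote(negative, n_new=len(positive) - len(negative), k=1)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.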
Pages: 13
Related papers
  • [1] Sentiment classification on product reviews using machine learning and deep learning techniques
    Singh, Neha
    Jaiswal, Umesh Chandra
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2024, 15 (12) : 5726 - 5741
  • [2] Sentiment Classification Using Machine Learning Techniques with Syntax Features
    Zou, Huang
    Tang, Xinhuai
    Xie, Bin
    Liu, Bing
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI), 2015 : 175 - 179
  • [3] Sentiment Analysis of Restaurant Reviews Using Machine Learning Techniques
    Krishna, Akshay
    Akhilesh, V.
    Aich, Animikh
    Hegde, Chetana
    EMERGING RESEARCH IN ELECTRONICS, COMPUTER SCIENCE AND TECHNOLOGY, ICERECT 2018, 2019, 545 : 687 - 696
  • [4] Sentiment Analysis and Classification of Restaurant Reviews using Machine Learning
    Zahoor, Kanwal
    Bawany, Narmeen Zakaria
    Hamid, Soomaiya
    2020 21ST INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2020
  • [5] A novel method for sentiment classification of drug reviews using fusion of deep and machine learning techniques
    Basiri, Mohammad Ehsan
    Abdar, Moloud
    Cifci, Mehmet Akif
    Nemati, Shahla
    Acharya, U. Rajendra
    KNOWLEDGE-BASED SYSTEMS, 2020, 198
  • [6] Classification of Sentimental Reviews Using Machine Learning Techniques
    Tripathy, Abinash
    Agrawal, Ankit
    Rath, Santanu Kumar
    3RD INTERNATIONAL CONFERENCE ON RECENT TRENDS IN COMPUTING 2015 (ICRTC-2015), 2015, 57 : 821 - 829
  • [7] Sentiment Classification for Film Reviews in Gujarati Text Using Machine Learning and Sentiment Lexicons
    Shah, Parita
    Swaminarayan, Priya
    Patel, Maitri
    JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2022, 17 (01) : 1 - 16
  • [8] Analysis of sentiment based movie reviews using machine learning techniques
    Chirgaiya, Sachin
    Sukheja, Deepak
    Shrivastava, Niranjan
    Rawat, Romil
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (05) : 5449 - 5456
  • [9] Classification of Sentiment Reviews for Indian Railways Using Machine Learning Methods
    Bagga, Manju
    Aggarwa, Ritu
    Arora, Nitika
    INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING AND COMMUNICATIONS, ICICC 2022, VOL 1, 2023, 473 : 171 - 177
  • [10] Sentiment Analysis for Arabic Reviews using Machine Learning Classification Algorithms
    Sayed, Awny A.
    Elgeldawi, Enas
    Zaki, Alaa M.
    Galal, Ahmed R.
    PROCEEDINGS OF 2020 INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN COMMUNICATION AND COMPUTER ENGINEERING (ITCE), 2020 : 56 - 63