Opinion mining on large scale data using sentiment analysis and k-means clustering

Cited by: 43
Authors
Riaz, Sumbal [1 ]
Fatima, Mehvish [1 ]
Kamran, M. [1 ]
Nisar, M. Wasif [1 ]
Affiliations
[1] COMSATS Inst Informat Technol, Dept Comp Sci, Wah Cantt, Pakistan
Keywords
Heterogeneous data processing; Imbalanced learning; Intelligent computing; CLASSIFICATION; ALGORITHMS; LEXICON; WORDS;
DOI
10.1007/s10586-017-1077-z
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid growth of web technology and easy internet access, online shopping has increased. People now express their opinions and share their experiences online, which greatly influences new buyers' purchasing decisions and generates large data sets. This data is very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer volume of data to extract customer opinion. To address this challenge, we performed phrase-level sentiment analysis on real-world customer review data to determine customer preference by analyzing subjective expressions. We then calculated the strength of each sentiment word to find the intensity of each expression, and applied clustering to place the words in clusters based on their intensity. We also compared the results of our technique with the star rankings given on the same dataset and found a drastic difference. Finally, we provide a visual representation of our results to give clear insight into customer preference and behavior and to support better decision making.
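The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the intensity formula, the lexicon, or the number of clusters, so the `LEXICON` values, the choice `k=3`, and the helper names `kmeans_1d` and `cluster_words` are all assumptions made for illustration. The sketch runs plain k-means on one-dimensional intensity scores.

```python
from typing import Dict, List

# Hypothetical lexicon: sentiment word -> intensity in [-1, 1].
# These scores are illustrative only and do not come from the paper.
LEXICON: Dict[str, float] = {
    "excellent": 0.9, "great": 0.8, "good": 0.5, "okay": 0.1,
    "poor": -0.5, "bad": -0.6, "awful": -0.85, "terrible": -0.9,
}


def nearest(value: float, centroids: List[float]) -> int:
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d(values: List[float], k: int, max_iters: int = 100) -> List[float]:
    """Plain k-means on scalar values; returns the final centroids (k >= 2)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly between min and max.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(max_iters):
        updated = []
        for c in range(k):
            members = [v for v in values if nearest(v, centroids) == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids


def cluster_words(lexicon: Dict[str, float], k: int = 3) -> Dict[str, int]:
    """Assign each word to the cluster of its nearest intensity centroid."""
    centroids = kmeans_1d(list(lexicon.values()), k)
    return {word: nearest(score, centroids) for word, score in lexicon.items()}


clusters = cluster_words(LEXICON, k=3)
for word, label in sorted(clusters.items(), key=lambda item: item[1]):
    print(label, word)
```

With these illustrative scores, strongly negative, near-neutral, and positive words fall into separate clusters. In practice the intensity scores would come from the sentiment analysis step, and a library implementation such as scikit-learn's `KMeans` would likely replace the hand-written loop.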
Pages: S7149 - S7164
Page count: 16
Related papers
50 in total
  • [1] Opinion mining on large scale data using sentiment analysis and k-means clustering
    Sumbal Riaz
    Mehvish Fatima
    M. Kamran
    M. Wasif Nisar
    Cluster Computing, 2019, 22 : 7149 - 7164
  • [2] Large scale K-means clustering using GPUs
    Li, Mi
    Frank, Eibe
    Pfahringer, Bernhard
    DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 37 (01) : 67 - 109
  • [3] Large scale K-means clustering using GPUs
    Mi Li
    Eibe Frank
    Bernhard Pfahringer
    Data Mining and Knowledge Discovery, 2023, 37 : 67 - 109
  • [4] Fast K-means for Large Scale Clustering
    Hu, Qinghao
    Wu, Jiaxiang
    Bai, Lu
    Zhang, Yifan
    Cheng, Jian
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2099 - 2102
  • [5] Parallelization of K-Means Clustering Algorithm for Data Mining
    Jiang, Hao
    Yu, Liyan
    4TH ANNUAL INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND APPLICATIONS (ITA 2017), 2017, 12
  • [6] Clustering of Image Data Using K-Means and Fuzzy K-Means
    Rahmani, Md. Khalid Imam
    Pal, Naina
    Arora, Kamiya
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (07) : 160 - 163
  • [7] A large scale clustering scheme for kernel K-Means
    Zhang, R
    Rudnicky, AI
    16TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL IV, PROCEEDINGS, 2002, : 289 - 292
  • [8] Scalable k-means for large-scale clustering
    Ming, Yuewei
    Zhu, En
    Wang, Mao
    Liu, Qiang
    Liu, Xinwang
    Yin, Jianping
    INTELLIGENT DATA ANALYSIS, 2019, 23 (04) : 825 - 838
  • [9] Compressed K-Means for Large-Scale Clustering
    Shen, Xiaobo
    Liu, Weiwei
    Tsang, Ivor
    Shen, Fumin
    Sun, Quan-Sen
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2527 - 2533
  • [10] Applying K-Means Clustering Algorithm Using Oracle Data Mining to Banking Data
    Hilala, Jafarova
    Rovshan, Aliyev
    PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE AND ENGINEERING MANAGEMENT, 2015, 362 : 809 - 816