Opinion mining on large scale data using sentiment analysis and k-means clustering

Cited by: 43

Authors:

Riaz, Sumbal ^{[1]}

Fatima, Mehvish ^{[1]}

Kamran, M. ^{[1]}

Nisar, M. Wasif ^{[1]}

Affiliation:

[1] COMSATS Inst Informat Technol, Dept Comp Sci, Wah Cantt, Pakistan

Source:

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2019 / Vol. 22 / Suppl. 3

Keywords:

Heterogeneous data processing; Imbalanced learning; Intelligent computing; CLASSIFICATION; ALGORITHMS; LEXICON; WORDS;

DOI:

10.1007/s10586-017-1077-z

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

With the rapid growth of web technology and easy access to the internet, online shopping has increased. People now express their opinions and share their experiences online, which greatly influences new buyers' purchasing decisions and generates large data sets. These data are very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer volume of data to extract customer opinion. To address this challenge, in this paper we perform sentiment analysis at the phrase level on real-world customer review data to determine customer preference by analyzing subjective expressions. We then calculate the strength of each sentiment word to determine the intensity of each expression, and apply clustering to place the words in clusters based on their intensity. We also compare the results of our technique with the star rankings given on the same dataset and find a drastic difference between the two. Finally, we provide a visual representation of our results to give clear insight into customer preference and behavior and to help decision makers make better decisions.
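The pipeline described in the abstract (phrase-level sentiment strength followed by clustering on intensity) can be illustrated with a minimal sketch. The abstract does not specify the lexicon, the strength formula, or the clustering setup, so everything below is an assumption made for illustration: the toy `LEXICON`, the `INTENSIFIERS` and `NEGATIONS` tables, the helper names `phrase_strengths` and `kmeans_1d`, and the evenly spaced centroid initialization are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); not the paper's lexicon.
LEXICON = {"good": 0.5, "great": 0.8, "excellent": 1.0,
           "bad": -0.5, "poor": -0.6, "terrible": -1.0}
# Hypothetical modifiers: intensifiers scale the score, negations flip its sign.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
NEGATIONS = {"not", "never", "no"}


def phrase_strengths(review):
    """Return (phrase, strength) pairs for each sentiment word in a review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    results = []
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        phrase = [tok]
        j = i - 1
        # An intensifier directly before the sentiment word scales its strength.
        if j >= 0 and tokens[j] in INTENSIFIERS:
            score *= INTENSIFIERS[tokens[j]]
            phrase.insert(0, tokens[j])
            j -= 1
        # A negation directly before the (possibly intensified) word flips the sign.
        if j >= 0 and tokens[j] in NEGATIONS:
            score = -score
            phrase.insert(0, tokens[j])
        results.append((" ".join(phrase), score))
    return results


def kmeans_1d(values, k, iters=100):
    """Plain k-means on scalar intensities; returns (labels, centroids)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centroids spread evenly over the value range.
    if k > 1:
        centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centroids = [sum(values) / len(values)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


if __name__ == "__main__":
    review = "The screen is very good but the battery is not good and support was terrible"
    pairs = phrase_strengths(review)
    labels, centroids = kmeans_1d([s for _, s in pairs], k=2)
    for (phrase, strength), label in zip(pairs, labels):
        print(phrase, strength, "cluster", label)
```

Clustering on a single intensity value keeps the example small; a real system would likely use a full sentiment lexicon and choose the number of clusters from the data.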


Pages: S7149-S7164

Page count: 16
