Document representation based on probabilistic word clustering in customer-voice classification

Authors
Younghoon Lee
Seokmin Song
Sungzoon Cho
Jinhae Choi
Affiliations
[1] Seoul National University, Department of Industrial Engineering and Institute for Industrial Systems Innovation
[2] LG Electronics, Data Driven User Experience Team, Mobile Communication Lab
Keywords
Probabilistic word clustering; Document representation; Customer-voice; Classification;
DOI
Not available
Abstract
Customer-voice data play an important role in fields such as marketing, product planning, and quality assurance. However, classifying these data is problematic because it relies on manual processes. This study focuses on building automatic classifiers for customer-voice data using newly proposed document representation methods based on neural embedding and probabilistic word clustering. Semantically similar terms are grouped into a common cluster: the word vectors generated by neural embedding are clustered according to the membership strength of each word relative to each cluster, as derived from a probabilistic clustering method such as fuzzy C-means clustering or a Gaussian mixture model. Because it takes membership strength into account, the proposed method is expected to be suitable for classifying customer-voice data consisting of unstructured text. The results demonstrate that the proposed method achieved an accuracy of 89.24% with respect to representational effectiveness and an accuracy of 87.76% with respect to classification performance on customer-voice data consisting of 12 classes. Furthermore, the method provided an intuitive interpretation of the generated representation.
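The idea of representing a document through the membership strength of its words can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D word vectors, the fixed cluster centers, the fuzzifier m = 2, and the choice to average word memberships into a document vector are all assumptions made for illustration. Only the standard fuzzy C-means membership formula is used; a Gaussian mixture model would supply posterior probabilities in the same role.

```python
import math

def fcm_memberships(vec, centers, m=2.0):
    """Fuzzy C-means membership of one vector in each cluster (centers held fixed)."""
    dists = [math.dist(vec, c) for c in centers]
    # A vector lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def represent(doc_tokens, embeddings, centers, m=2.0):
    """Assumed aggregation: average the membership vectors of in-vocabulary words."""
    rows = [fcm_memberships(embeddings[t], centers, m)
            for t in doc_tokens if t in embeddings]
    if not rows:
        return [0.0] * len(centers)
    return [sum(col) / len(rows) for col in zip(*rows)]

# Hypothetical toy data: made-up 2-D "embeddings" and two cluster centers.
EMBEDDINGS = {
    "battery": (0.9, 0.1),
    "charger": (0.8, 0.2),
    "screen": (0.1, 0.9),
    "display": (0.2, 0.8),
}
CENTERS = [(1.0, 0.0), (0.0, 1.0)]

doc_vector = represent(["battery", "charger", "screen"], EMBEDDINGS, CENTERS)
print(doc_vector)  # one value per cluster; the first cluster dominates here
```

Each position of the resulting vector corresponds to one word cluster, which is what makes such a representation easy to inspect; the vector could then be passed to any standard classifier.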
Pages: 221-232
Number of pages: 11