Visual topic models for healthcare data clustering

Cited by: 19
Authors
Prasad, K. Rajendra [1]
Mohammed, Moulana [2]
Noorullah, R. M. [1, 2]
Affiliations
[1] Inst Aeronaut Engn, Dept Comp Sci & Engn, Hyderabad 500043, Telangana, India
[2] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Guntur 522502, Andhra Pradesh, India
Keywords
Visual topic model; Social data; Visual clustering; Cosine based metric; Health tendency
DOI
10.1007/s12065-019-00300-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Social media is a rich source of health-related topics that can inform healthcare solutions. Topic models, which originated in Natural Language Processing, are receiving much attention in healthcare because of their interpretability and their support for decision making, which motivated us to develop visual topic models. In healthcare applications, topic models extract health topics so that the discriminative and coherent latent features of tweet documents can be analyzed. Determining the number of topics is an important issue: users sometimes specify an incorrect number of topics in traditional topic models, which leads to poor results in health data clustering. In such cases, proper visualizations are essential for extracting the information needed to identify cluster trends. To aid the visualization of topic clouds and health tendencies in a document collection, we present hybrid topic modeling techniques that integrate traditional topic models with visualization procedures. We believe the proposed visual topic models, namely Visual Non-Negative Matrix Factorization (VNMF), Visual Latent Dirichlet Allocation (VLDA), Visual intJNon-negative Matrix Factorization (VintJNMF), and Visual Probabilistic Latent Semantic Indexing (VPLSI), are promising methods for extracting the tendency of health topics from various sources in healthcare data clustering. Standard and benchmark social health datasets are used in an experimental study to demonstrate the efficiency of the proposed models with respect to clustering accuracy (CA), Normalized Mutual Information (NMI), precision (P), recall (R), F-Score (F), and computational complexity. The VNMF visual model shows an improvement of 32.4% under the cosine-based metric in the display of visual clusters, and an improvement of 35-40% in the performance measures, compared with the other visual methods across different numbers of health topics.
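
The abstract refers to a cosine-based metric used when displaying visual clusters but does not define it. As a rough illustration only, the sketch below computes a pairwise cosine dissimilarity matrix (1 minus cosine similarity) over bag-of-words vectors, which is one common input to visual cluster-tendency displays. The tokenization, the function names (`term_counts`, `cosine_dissimilarity`, `dissimilarity_matrix`), and the sample tweets are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def term_counts(text):
    """Bag-of-words counts from lowercase whitespace tokens (an assumed, simplistic tokenizer)."""
    return Counter(text.lower().split())


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document shares nothing with any other document.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dissimilarity_matrix(docs):
    """Square matrix of pairwise cosine dissimilarities between documents."""
    vectors = [term_counts(d) for d in docs]
    return [[cosine_dissimilarity(u, v) for v in vectors] for u in vectors]


# Hypothetical tweet-like documents, invented for illustration.
docs = [
    "flu vaccine clinic open today",
    "get your flu vaccine today",
    "new diet plan for weight loss",
]
matrix = dissimilarity_matrix(docs)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

In this toy input the two flu-related documents have a lower dissimilarity to each other than either has to the diet-related one, which is the kind of block structure a visual cluster display would be expected to reveal.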
Pages: 545-562
Number of pages: 18