Visual topic models for healthcare data clustering

Cited by: 19
Authors
Prasad, K. Rajendra [1]
Mohammed, Moulana [2]
Noorullah, R. M. [1, 2]
Affiliations
[1] Inst Aeronaut Engn, Dept Comp Sci & Engn, Hyderabad 500043, Telangana, India
[2] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Guntur 522502, Andhra Pradesh, India
Keywords
Visual topic model; Social data; Visual clustering; Cosine based metric; Health tendency;
DOI
10.1007/s12065-019-00300-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Social media is a rich source of health-related topics that can point toward healthcare solutions. Topic models, which originated in Natural Language Processing, are receiving much attention in healthcare because of their interpretability and their support for decision making, which motivated us to develop visual topic models. Topic models extract health topics so that the discriminative and coherent latent features of tweet documents can be analyzed in healthcare applications. Determining the number of topics in a topic model is an important issue. Users sometimes specify an incorrect number of topics in traditional topic models, which leads to poor results in health data clustering. In such cases, proper visualizations are essential for extracting the information needed to identify cluster trends. To aid in the visualization of topic clouds and health tendencies in the document collection, we present hybrid topic modeling techniques that integrate traditional topic models with visualization procedures. We believe the proposed visual topic models, namely Visual Non-Negative Matrix Factorization (VNMF), Visual Latent Dirichlet Allocation (VLDA), Visual intJNon-negative Matrix Factorization (VintJNMF), and Visual Probabilistic Latent Semantic Indexing (VPLSI), are promising methods for extracting the tendency of health topics from various sources in healthcare data clustering. Standard and benchmark social health datasets are used in an experimental study to demonstrate the efficiency of the proposed models in terms of clustering accuracy (CA), Normalized Mutual Information (NMI), precision (P), recall (R), F-Score (F), and computational complexity. The VNMF visual model performs significantly better, with an improvement of 32.4% under the cosine-based metric in the display of visual clusters and an improvement of 35-40% in the performance measures compared with the other visual methods across different numbers of health topics.
Pages: 545 - 562
Page count: 18
Related papers
50 in total
  • [21] Visual analytics for the clustering capability of data
    LU ZhiMao
    LIU Chen
    ZHANG Qi
    ZHANG ChunXiang
    FAN DongMei
    YANG Peng
    Science China(Information Sciences), 2013, 56 (05) : 131 - 144
  • [22] Visual analytics for the clustering capability of data
    Lu ZhiMao
    Liu Chen
    Zhang Qi
    Zhang ChunXiang
    Fan DongMei
    Yang Peng
    SCIENCE CHINA-INFORMATION SCIENCES, 2013, 56 (05) : 1 - 14
  • [23] Visual analytics for the clustering capability of data
    ZhiMao Lu
    Chen Liu
    Qi Zhang
    ChunXiang Zhang
    DongMei Fan
    Peng Yang
    Science China Information Sciences, 2013, 56 : 1 - 14
  • [24] Exploitation of Clustering Techniques in Transactional Healthcare Data
    Mahoto, Naeem Ahmed
    Shaikh, Faisal Karim
    Ansari, Abdul Qadir
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2014, 33 (01) : 87 - 102
  • [25] Topic Models for RFID Data Modeling and Localization
    Kennedy, T. F.
    Provence, Robert S.
    Broyan, James L.
    Fink, Patrick W.
    Ngo, Phong H.
    Rodriguez, Lazaro D.
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 1438 - 1445
  • [26] Topic Models Vs. Unstructured Data
    Anthes, Gary
    COMMUNICATIONS OF THE ACM, 2010, 53 (12) : 16 - 18
  • [27] Application of dynamic topic models to toxicogenomics data
    Lee, Mikyung
    Liu, Zhichao
    Huang, Ruili
    Tong, Weida
    BMC BIOINFORMATICS, 2016, 17
  • [28] Application of dynamic topic models to toxicogenomics data
    Mikyung Lee
    Zhichao Liu
    Ruili Huang
    Weida Tong
    BMC Bioinformatics, 17
  • [29] Ontology-based Topic Clustering for Online Discussion Data
    Wang, Yongheng
    Cao, Kening
    Zhang, Xiaoming
    INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2012), 2013, 8768
  • [30] Topic-Based Hard Clustering of Documents Using Generative Models
    Ponti, Giovanni
    Tagarelli, Andrea
    FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2009, 5722 : 231 - 240