A hierarchical topic modelling approach for short text clustering

Authors
Pradhan R. [1 ]
Sharma D.K. [1 ]
Affiliation
[1] GLA University, UP, Mathura
Keywords
Dirichlet multinomial mixture; DMM; short text clustering; STT; topic modelling; Twitter topic modelling
DOI
10.1504/IJICT.2022.123161
Abstract
Social networking websites such as Twitter and WeChat provide microblogging services to their users, who post millions of short messages every day. Building datasets from these messages helps in solving many non-trivial tasks in computer science, natural language processing, opinion mining, and other domains. Topic modelling is critical for understanding tweets and segregating them into manageable sets. We apply topic modelling approaches to cluster tweets or short text messages into groups, because conventional approaches fail to deal properly with noise, high volume, high dimensionality, and the sparseness of short text. The proposed method addresses the data sparsity of short text and uses a hierarchical two-stage clustering procedure. We analysed the results on standard datasets and found that our method performed better than the other methods compared. Copyright © 2022 Inderscience Enterprises Ltd.
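The abstract names the Dirichlet multinomial mixture (DMM) but does not spell out the two clustering stages. As a rough illustration only, the sketch below shows a standard collapsed Gibbs sampler for a flat DMM, in which each short document is assigned to exactly one cluster. It is not the authors' hierarchical method. The function name `gsdmm`, the hyperparameter values (`K`, `alpha`, `beta`, `iters`), and the toy documents are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def gsdmm(docs, K=8, alpha=0.1, beta=0.1, iters=30, seed=0):
    """Collapsed Gibbs sampling for a Dirichlet multinomial mixture.

    docs: list of token lists. Returns one cluster label per document.
    All default hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})  # vocabulary size
    D = len(docs)
    m = [0] * K                            # documents per cluster
    n = [0] * K                            # tokens per cluster
    nw = [Counter() for _ in range(K)]     # word counts per cluster
    z = []                                 # cluster label of each document

    def move(d, k, sign):
        # Add (sign=+1) or remove (sign=-1) document d from cluster k.
        m[k] += sign
        n[k] += sign * len(d)
        for w in d:
            nw[k][w] += sign

    # Random initial assignment.
    for d in docs:
        k = rng.randrange(K)
        z.append(k)
        move(d, k, +1)

    for _ in range(iters):
        for i, d in enumerate(docs):
            move(d, z[i], -1)              # take the document out of its cluster
            counts = Counter(d)
            logp = []
            for k in range(K):
                # Cluster popularity term.
                lp = math.log(m[k] + alpha) - math.log(D - 1 + K * alpha)
                # Word similarity term, computed in log space.
                for w, c in counts.items():
                    for j in range(c):
                        lp += math.log(nw[k][w] + beta + j)
                for j in range(len(d)):
                    lp -= math.log(n[k] + V * beta + j)
                logp.append(lp)
            # Sample a new cluster from the normalised conditional.
            mx = max(logp)
            weights = [math.exp(lp - mx) for lp in logp]
            z[i] = rng.choices(range(K), weights=weights)[0]
            move(d, z[i], +1)
    return z


if __name__ == "__main__":
    # Toy tokenised "tweets" (hypothetical data).
    docs = [
        ["rain", "storm", "flood"],
        ["storm", "wind", "rain"],
        ["goal", "match", "team"],
        ["team", "league", "goal"],
    ]
    print(gsdmm(docs, K=4))
```

Because every word in a short document shares a single cluster label, the model pools word statistics across documents, which is one common way of coping with the sparsity the abstract mentions. A two-stage scheme could, for example, re-cluster the output of such a sampler, but that is speculation and not taken from the record.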
Pages: 463-481
Number of pages: 18