A hierarchical topic modelling approach for short text clustering

Cited by: 0
Authors
Pradhan R. [1 ]
Sharma D.K. [1 ]
Affiliation
[1] GLA University, UP, Mathura
Keywords
Dirichlet multinomial mixture; DMM; short text clustering; STT; topic modelling; Twitter topic modelling
DOI
10.1504/IJICT.2022.123161
Abstract
Social networking websites such as Twitter and WeChat provide microblogging services to their users, who post millions of short messages every day. Creating a dataset of these messages helps in solving many non-trivial tasks in computer science, natural language processing, opinion mining, and other domains. Topic modelling is critical for understanding tweets and segregating them into manageable sets. We apply topic modelling approaches to cluster tweets or short text messages into groups, because conventional approaches fail to deal properly with the noise, high volume, high dimensionality, and sparseness of short text. The proposed method addresses the data sparsity of short text using a hierarchical two-stage clustering procedure. We analysed the results on standard datasets and found that our method performed better than other methods. Copyright © 2022 Inderscience Enterprises Ltd.
Pages: 463 - 481
Number of pages: 18