Integration of Neural Embeddings and Probabilistic Models in Topic Modeling

Cited by: 0
Authors
Koochemeshkian, Pantea [1]
Bouguila, Nizar [1]
Affiliations
[1] Concordia Inst Informat Syst Engn CIISE, Informat Syst Engn, Montreal, PQ, Canada
Keywords
DIRICHLET; EXTRACTION
DOI
10.1080/08839514.2024.2403904
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Topic modeling, a way to find topics in large volumes of text, has grown with the help of deep learning. This paper presents two novel approaches to topic modeling by integrating embeddings derived from Bert-Topic with the multi-grain clustering topic model (MGCTM). Recognizing the inherent hierarchical and multi-scale nature of topics in corpora, our methods utilize MGCTM to capture topic structures at multiple levels of granularity. We enhance the expressiveness of MGCTM by introducing the Generalized Dirichlet and Beta-Liouville distributions as priors, which provide greater flexibility in modeling topic proportions and capturing richer topic relationships. Comprehensive experiments on various datasets showcase the effectiveness of our proposed models in achieving superior topic coherence and granularity compared to state-of-the-art methods. Our findings underscore the potential of leveraging hybrid architectures, marrying neural embeddings with advanced probabilistic modeling, to push the boundaries of topic modeling.
Pages: 33