Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis

Cited by: 1
Authors
Rashid, Junaid [1 ]
Kim, Jungeun [2 ]
Naseem, Usman [3 ]
Affiliations
[1] Sejong Univ, Dept Data Sci, Seoul, South Korea
[2] Kongju Natl Univ, Dept Software, Cheonan, South Korea
[3] Univ Sydney, Sch Comp Sci, Sydney, NSW, Australia
Funding
National Research Foundation, Singapore;
Keywords
Topic Modeling; Clustering; Short Text; Classification; Coherence;
DOI
10.1145/3543873.3587316
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growing abundance of short text content on websites, analyzing and comprehending these short texts has become a crucial task. Topic modeling is a widely used technique for analyzing short text documents and uncovering the underlying topics. However, traditional topic models have difficulty extracting topics accurately from short texts because of their limited content and sparse nature. To address these issues, we propose an embedding-based topic modeling (EmTM) approach that incorporates word embedding and hierarchical clustering to identify significant topics. Experimental results demonstrate the effectiveness of EmTM on two datasets of web short texts, Snippet and News. The results indicate that EmTM outperforms baseline topic models in both classification accuracy and topic coherence.
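The abstract names two ingredients, word embeddings and hierarchical clustering, without giving details. The following is a minimal sketch of that general idea only, not the EmTM method from the paper: it groups words into candidate topics by average-linkage agglomerative clustering over cosine distance. The 2-D vectors, the word list, and the function names (`cosine_distance`, `average_linkage`, `cluster_words`) are all hypothetical and chosen for illustration; a real pipeline would presumably use pretrained embeddings.

```python
import math
from itertools import combinations

# Hypothetical 2-D "embeddings" (assumption: real ones would be pretrained,
# high-dimensional vectors; these toy values only illustrate the mechanics).
TOY_EMBEDDINGS = {
    "goal": (0.90, 0.10),
    "match": (0.80, 0.20),
    "team": (0.85, 0.15),
    "stock": (0.10, 0.90),
    "market": (0.20, 0.80),
    "bank": (0.15, 0.85),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(cluster_a, cluster_b, embeddings):
    """Mean pairwise cosine distance between the words of two clusters."""
    dists = [
        cosine_distance(embeddings[a], embeddings[b])
        for a in cluster_a
        for b in cluster_b
    ]
    return sum(dists) / len(dists)


def cluster_words(embeddings, num_topics):
    """Merge the two closest clusters until num_topics clusters remain."""
    clusters = [[word] for word in sorted(embeddings)]
    while len(clusters) > num_topics:
        # combinations() yields index pairs with i < j, so deleting j
        # after merging into i never invalidates i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], embeddings),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


topics = cluster_words(TOY_EMBEDDINGS, num_topics=2)
for topic_id, words in enumerate(topics):
    print(topic_id, words)
```

On these toy vectors the sports-like and finance-like words end up in separate clusters. The number of topics is fixed in advance here; how the paper selects or filters "significant" topics is not stated in the abstract.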
Pages: 73-76
Number of pages: 4
Related papers
50 in total
  • [21] Topic Modeling over Short Texts by Incorporating Word Embeddings
    Qiang, Jipeng
    Chen, Ping
    Wang, Tong
    Wu, Xindong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT II, 2017, 10235 : 363 - 374
  • [22] Incorporating Biterm Correlation Knowledge into Topic Modeling for Short Texts
    Zhang, Kai
    Zhou, Yuan
    Chen, Zheng
    Liu, Yufei
    Tang, Zhuo
    Yin, Li
    Chen, Jihong
    COMPUTER JOURNAL, 2022, 65 (03): : 537 - 553
  • [23] Sentiment Detection of Short Text via Probabilistic Topic Modeling
    Wu, Zewei
    Rao, Yanghui
    Li, Xin
    Li, Jun
    Xie, Haoran
    Wang, Fu Lee
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2015, 2015, 9052 : 76 - 85
  • [24] Short Text Topic Modeling Techniques, Applications, and Performance: A Survey
    Qiang, Jipeng
    Qian, Zhenyu
    Li, Yun
    Yuan, Yunhao
    Wu, Xindong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (03) : 1427 - 1445
  • [25] LDA-PSTR: A Topic Modeling Method for Short Text
    Zhou, Kai
    Yang, Qun
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2018, 2018, 11323 : 339 - 352
  • [26] Topic Modeling for Short Texts via Word Embedding and Document Correlation
    Yi, Feng
    Jiang, Bo
    Wu, Jianjun
    IEEE ACCESS, 2020, 8 : 30692 - 30705
  • [27] Efficient Correlated Topic Modeling with Topic Embedding
    He, Junxian
    Hu, Zhiting
    Berg-Kirkpatrick, Taylor
    Huang, Ying
    Xing, Eric P.
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 225 - 233
  • [28] Combine Topic Modeling with Semantic Embedding: Embedding Enhanced Topic Model
    Zhang, Peng
    Wang, Suge
    Li, Deyu
    Li, Xiaoli
    Xu, Zhikang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (12) : 2322 - 2335
  • [29] Topic Modeling in Embedding Spaces
    Dieng, Adji B.
    Ruiz, Francisco J. R.
    Blei, David M.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2020, 8 : 439 - 453
  • [30] The promise of machine-learning-driven text analysis techniques for historical research: topic modeling and word embedding
    Martin, Marta Villamor
    Kirsch, David A.
    Prieto-Nanez, Fabian
    MANAGEMENT & ORGANIZATIONAL HISTORY, 2023, 18 (01) : 81 - 96