Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis

Cited by: 1
Authors
Rashid, Junaid [1]
Kim, Jungeun [2]
Naseem, Usman [3]
Affiliations
[1] Sejong Univ, Dept Data Sci, Seoul, South Korea
[2] Kongju Natl Univ, Dept Software, Cheonan, South Korea
[3] Univ Sydney, Sch Comp Sci, Sydney, NSW, Australia
Funding
National Research Foundation, Singapore;
Keywords
Topic Modeling; Clustering; Short Text; Classification; Coherence;
DOI
10.1145/3543873.3587316
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growing abundance of short text content on websites, analyzing and comprehending these short texts has become a crucial task. Topic modeling is a widely used technique for analyzing short text documents and uncovering their underlying topics. However, traditional topic models struggle to extract topics accurately from short texts because the texts are sparse and contain little content. To address these issues, we propose an Embedding-based topic modeling (EmTM) approach that incorporates word embedding and hierarchical clustering to identify significant topics. Experimental results on two web short-text datasets, Snippet and News, demonstrate the effectiveness of EmTM. EmTM outperforms baseline topic models in both classification accuracy and topic coherence.
Pages: 73-76
Number of pages: 4