Topic modeling methods for short texts: A survey

Cited by: 1
Authors
Fan, Yuwei [1]
Shi, Lei [1, 2]
Yuan, Lu [1]
Affiliations
[1] Commun Univ China, State Key Lab Media Convergence & Commun, Beijing, Peoples R China
[2] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin, Peoples R China
Keywords
Short text; probabilistic topic model; neural topic model; word embeddings; deep learning; classification
DOI
10.3233/JIFS-223834
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online users today are incentivized to communicate through short texts. These short texts carry a significant amount of implicit information, including opinions, topics, and emotions, which is valuable for both exploration and analysis. By alleviating the sparsity of short texts, topic models can be used to discover topics in large collections of them. Although many surveys cover topic modeling, only a few focus on short texts. This paper presents a comprehensive overview of topic modeling methods for short texts from a novel perspective. First, it discusses short text probabilistic topic models and outlines the directions in which they can be improved. Second, it explores short text neural topic models, which can be categorized into three groups based on their underlying structures. In addition, the paper provides a detailed investigation of embedding methods in topic modeling. It also surveys various applications and the corresponding works, with a focus on short texts, and summarizes the commonly used public corpora and evaluation indicators for topic modeling. Finally, the advantages and disadvantages of short text topic modeling are discussed in detail, and future research directions are proposed.
Pages: 1971-1990
Page count: 20