PSLDA: a novel supervised pseudo document-based topic model for short texts

Cited by: 4

Authors:

Sun, Mingtao [1]

Zhao, Xiaowei [2]

Lin, Jingjing [3]

Jing, Jian [2]

Wang, Deqing [2]

Jia, Guozhu [1]

Affiliations:

[1] Beihang Univ, Sch Econ & Management, Beijing 100191, Peoples R China

[2] Beihang Univ, Sch Comp Sci, Beijing 100191, Peoples R China

[3] Beihang Univ, Sch Instrumentat & Optoelect Engn, Beijing 100191, Peoples R China

Source:

FRONTIERS OF COMPUTER SCIENCE | 2022 / Volume 16 / Issue 06

Keywords:

supervised topic model; short text; pseudo-document

DOI:

10.1007/s11704-021-0606-3

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Online social media applications such as Twitter and Weibo have produced a huge volume of short texts. However, mining semantic topics from short texts efficiently remains challenging because of the sparseness of word occurrence and the diversity of topics. To address these problems, we propose a novel supervised pseudo-document-based maximum entropy discrimination latent Dirichlet allocation model (PSLDA for short). Specifically, we first assume that short texts are generated from normal-sized latent pseudo documents, and that the topic distributions are sampled from the pseudo documents. In this way, the model reduces the sparseness of word occurrence and the diversity of topics because it implicitly aggregates short texts into longer, higher-level pseudo documents. To make full use of the label information in the training data, we introduce labels into the model and further propose a supervised topic model to learn a reasonable distribution of topics. Extensive experiments demonstrate that the proposed method outperforms several state-of-the-art methods.
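The pseudo-document assumption in the abstract can be illustrated with a small generative sketch. This is not the authors' implementation: it covers only the unsupervised generative story (each short text belongs to one latent pseudo document and draws its topics from that pseudo document's topic distribution) and omits the maximum entropy discrimination label component entirely. The function names, the uniform choice of pseudo document, and the hyperparameter values `alpha` and `beta` are all assumptions made for illustration.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_short_texts(n_texts, n_pseudo_docs, n_topics, vocab_size,
                         text_len=8, alpha=0.5, beta=0.5, seed=0):
    """Hypothetical generative sketch of pseudo-document-based short texts.

    Returns (texts, assignments): word-id lists and the pseudo-document
    index each short text was drawn from.
    """
    rng = random.Random(seed)
    # One word distribution per topic (assumed symmetric Dirichlet prior).
    topic_word = [sample_dirichlet(beta, vocab_size, rng) for _ in range(n_topics)]
    # One topic distribution per pseudo document, not per short text.
    pseudo_topic = [sample_dirichlet(alpha, n_topics, rng) for _ in range(n_pseudo_docs)]

    texts, assignments = [], []
    for _ in range(n_texts):
        # Assumption: the pseudo document is chosen uniformly at random.
        p = rng.randrange(n_pseudo_docs)
        words = []
        for _ in range(text_len):
            # Topic comes from the pseudo document, then a word from that topic.
            z = rng.choices(range(n_topics), weights=pseudo_topic[p])[0]
            w = rng.choices(range(vocab_size), weights=topic_word[z])[0]
            words.append(w)
        texts.append(words)
        assignments.append(p)
    return texts, assignments
```

Because the topic distribution lives at the pseudo-document level, many short texts share one set of topic proportions, which is how the aggregation described in the abstract mitigates sparsity. For example, `generate_short_texts(5, 2, 3, 20)` yields five texts of eight word ids each, spread over two pseudo documents.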

Pages: 10
