Supervised Topic Models for Microblog Classification

Cited by: 9
Authors
Kataria, Saurabh [1]
Agarwal, Arvind [1]
Affiliations
[1] Palo Alto Res Ctr, Webster, NY 14580 USA
Keywords
DOI
10.1109/ICDM.2015.148
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a topic-model-based approach for classifying micro-blog posts into given topics of interest. The short nature of micro-blog posts makes it challenging to learn a classification model from them directly. To overcome this limitation, we use the content of the links embedded in these posts to improve topic learning. The hypothesis is that, since the link content is far richer than the content of the post itself, using link content along with the content of the post will help learning. However, how this link content can be used to construct features for classification remains a challenging issue. Furthermore, in previous methods, user-based information is utilized in an ad-hoc manner that only works for certain types of classification, such as characterizing the content of microblogs. In this paper, we propose a supervised topic model, User-Labeled-LDA, and its nonparametric variant, which avoid the ad-hoc feature construction task and model the topics in a discriminative way. Our experiments on a Twitter dataset show that modeling user interests and link information helps in learning quality topics for sparse tweets and also helps significantly in the classification task. Our experiments further show that modeling this information in a principled way through topic models helps more than simply adding this information through features.
Pages: 793 - 798
Number of pages: 6
Related papers
50 in total
  • [21] Semi-supervised Multi-Label Topic Models for Document Classification and Sentence Labeling
    Soleimani, Hossein
    Miller, David J.
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 105 - 114
  • [22] TweetLDA: Supervised Topic Classification and Link Prediction in Twitter
    Quercia, Daniele
    Askham, Harry
    Crowcroft, Jon
    PROCEEDINGS OF THE 3RD ANNUAL ACM WEB SCIENCE CONFERENCE, 2012, : 247 - 250
  • [23] Topic Labeled Text Classification: A Weakly Supervised Approach
    Hingmire, Swapnil
    Chakraborti, Sutanu
    SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014, : 385 - 394
  • [24] Supervised Topic Classification for Modeling a Hierarchical Conference Structure
    Kuznetsov, Mikhail
    Clausel, Marianne
    Amini, Massih-Reza
    Gaussier, Eric
    Strijov, Vadim
    NEURAL INFORMATION PROCESSING, PT I, 2015, 9489 : 90 - 97
  • [25] Robust supervised topic models under label noise
    Wang, Wei
    Guo, Bing
    Shen, Yan
    Yang, Han
    Chen, Yaosen
    Suo, Xinhua
    MACHINE LEARNING, 2021, 110 (05) : 907 - 931
  • [26] Robust supervised topic models under label noise
    Wei Wang
    Bing Guo
    Yan Shen
    Han Yang
    Yaosen Chen
    Xinhua Suo
    Machine Learning, 2021, 110 : 907 - 931
  • [27] A Topic Detection Method for Chinese Microblog
    Xie, Jing
    Liu, Gongshen
    Ning, Wei
    2012 INTERNATIONAL SYMPOSIUM ON INFORMATION SCIENCE AND ENGINEERING (ISISE), 2012, : 100 - 103
  • [28] Evaluating Supervised Topic Models in the Presence of OCR Errors
    Walker, Daniel
    Ringger, Eric
    Seppi, Kevin
    DOCUMENT RECOGNITION AND RETRIEVAL XX, 2013, 8658
  • [29] LDA topic model for microblog recommendation
    Duan, Jianyong
    Ai, Yamin
    Ii, Xia
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2015, : 185 - 188
  • [30] Bayesian Bridging Topic Models for Classification
    Wu, Meng-Sung
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2014, 30 (05) : 1585 - 1600