A Partially Supervised Cross-Collection Topic Model for Cross-Domain Text Classification

Cited by: 27
Authors
Bao, Yang [1]
Collier, Nigel [2]
Datta, Anindya [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[2] Natl Inst Informat, Tokyo 1018430, Japan
Keywords
Topic Modeling; LDA; Cross-Domain Learning; Text Classification
DOI
10.1145/2505515.2505556
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-domain text classification aims to automatically train a precise text classifier for a target domain by using labeled text data from a related source domain. To this end, one of the most promising ideas is to induce a new feature representation so that the distributional difference between domains can be reduced and a more accurate classifier can be learned in this new feature space. However, most existing methods do not explore the duality of the marginal distribution of examples and the conditional distribution of class labels given labeled training examples in the source domain. In addition, few previous works attempt to explicitly distinguish domain-independent from domain-specific latent features and to align the domain-specific features to further improve cross-domain learning. In this paper, we propose a model called the Partially Supervised Cross-Collection LDA topic model (PSCCLDA) for cross-domain learning, which addresses these two issues in a unified way. Experimental results on nine datasets show that our model outperforms two standard classifiers and four state-of-the-art methods, which demonstrates the effectiveness of the proposed model.
Pages: 239-247
Page count: 9
Related papers
50 in total
  • [1] Cross-Domain Labeled LDA for Cross-Domain Text Classification
    Jing, Baoyu
    Lu, Chenwei
    Wang, Deqing
    Zhuang, Fuzhen
    Niu, Cheng
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 187 - 196
  • [2] Cross-Domain Text Classification Based on BERT Model
    Zhang, Kuan
    Hei, Xinhong
    Fei, Rong
    Guo, Yufan
    Jiao, Rui
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS: DASFAA 2021 INTERNATIONAL WORKSHOPS, 2021, 12680 : 197 - 208
  • [3] Supervised Adaptive-transfer PLSA for Cross-Domain Text Classification
    Zhao, Rui
    Mao, Kezhi
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014, : 259 - 266
  • [4] Cross-Domain Topic Classification for Political Texts
    Osnabruegge, Moritz
    Ash, Elliott
    Morelli, Massimo
    [J]. POLITICAL ANALYSIS, 2023, 31 (01) : 59 - 80
  • [5] A link-bridged topic model for cross-domain document classification
    Yang, Pei
    Gao, Wei
    Tan, Qi
    Wong, Kam-Fai
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2013, 49 (06) : 1181 - 1193
  • [6] Iterative Reinforcement Cross-Domain Text Classification
    Zhang, Di
    Xue, Gui-Rong
    Yu, Yong
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2008, 5139 : 282 - 293
  • [7] Cross-domain knowledge distillation for text classification
    Zhang, Shaokang
    Jiang, Lei
    Tan, Jianlong
    [J]. NEUROCOMPUTING, 2022, 509 : 11 - 20
  • [8] A text visualization method for cross-domain research topic mining
    Jiang, Xinyi
    Zhang, Jiawan
    [J]. Journal of Visualization, 2016, 19 : 561 - 576
  • [9] A text visualization method for cross-domain research topic mining
    Jiang, Xinyi
    Zhang, Jiawan
    [J]. JOURNAL OF VISUALIZATION, 2016, 19 (03) : 561 - 576
  • [10] Research Progress on Cross-domain Text Sentiment Classification
    Zhao C.-J.
    Wang S.-G.
    Li D.-Y.
    [J]. Zhao, Chuan-Jun (zhaochuanjun@foxmail.com), 1723, Chinese Academy of Sciences (31): : 1723 - 1746