A Partially Supervised Cross-Collection Topic Model for Cross-Domain Text Classification

Cited by: 27
Authors
Bao, Yang [1]
Collier, Nigel [2]
Datta, Anindya [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[2] Natl Inst Informat, Tokyo 1018430, Japan
Keywords
Topic Modeling; LDA; Cross-Domain Learning; Text Classification;
DOI
10.1145/2505515.2505556
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Cross-domain text classification aims to automatically train a precise text classifier for a target domain by using labelled text data from a related source domain. To this end, one of the most promising ideas is to induce a new feature representation so that the distributional difference between domains can be reduced and a more accurate classifier can be learned in this new feature space. However, most existing methods do not explore the duality of the marginal distribution of examples and the conditional distribution of class labels given labeled training examples in the source domain. Besides, few previous works attempt to explicitly distinguish the domain-independent and domain-specific latent features and align the domain-specific features to further improve the cross-domain learning. In this paper, we propose a model called Partially Supervised Cross-Collection LDA topic model (PSCCLDA) for cross-domain learning with the purpose of addressing these two issues in a unified way. Experimental results on nine datasets show that our model outperforms two standard classifiers and four state-of-the-art methods, which demonstrates the effectiveness of our proposed model.
Pages: 239-247
Number of pages: 9