Learning to share by masking the non-shared for multi-domain sentiment classification

Cited by: 8
Authors
Yuan, Jianhua [1]
Zhao, Yanyan [1]
Qin, Bing [1, 2]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin 150001, Peoples R China
[2] Pengcheng Lab, Shenzhen 518066, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; Sentiment analysis; Cross domain; Masking;
DOI
10.1007/s13042-022-01556-0
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-domain sentiment classification deals with the scenario where labeled data exists for multiple domains but is insufficient for training effective sentiment classifiers that work across domains. Thus, fully exploiting sentiment knowledge shared across domains is crucial for real-world applications. While many existing works try to extract domain-invariant features in high-dimensional space, such models fail to explicitly distinguish between shared and private features at the text level, which to some extent limits their interpretability. Based on the assumption that removing domain-related tokens from texts would help improve their domain invariance, we instead first transform original sentences to be domain-agnostic. To this end, we propose the BERTMasker model, which explicitly masks domain-related words from texts, learns domain-invariant sentiment features from these domain-agnostic texts, and uses the masked words to form domain-aware sentence representations. Empirical experiments on the benchmark multi-domain sentiment classification datasets demonstrate the effectiveness of our proposed model, which improves the accuracy on multi-domain and cross-domain settings by 1.91% and 3.31% respectively. Further analysis on masking shows that removing those domain-related and sentiment-irrelevant tokens decreases texts' domain separability, resulting in the performance degradation of a BERT-based domain classifier by over 12%.
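The masking idea in the abstract can be illustrated with a toy sketch. This is not the BERTMasker model: the record does not say how domain-related words are selected, so the sketch assumes a hand-written word list. The names `mask_domain_words` and `domain_words` are hypothetical, and `[MASK]` is used only because it is the conventional BERT mask token.

```python
MASK_TOKEN = "[MASK]"  # conventional BERT mask token, used here as a plain string


def mask_domain_words(tokens, domain_words):
    """Replace tokens found in domain_words with MASK_TOKEN.

    Returns the masked token list (a domain-agnostic view of the text)
    and the list of removed words (which could feed a domain-aware view).
    """
    masked, removed = [], []
    for tok in tokens:
        if tok.lower() in domain_words:
            masked.append(MASK_TOKEN)
            removed.append(tok)
        else:
            masked.append(tok)
    return masked, removed


# Hypothetical example: a review from an electronics domain.
tokens = "the battery life of this laptop is great".split()
domain_words = {"battery", "laptop"}  # assumed word list, not taken from the paper

masked, removed = mask_domain_words(tokens, domain_words)
print(" ".join(masked))
print(removed)
```

In this toy version the sentiment-bearing word "great" survives masking while the domain cues are split off into a separate list, which mirrors the shared versus private distinction the abstract describes.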
Pages: 2711-2724
Page count: 14