Learning to share by masking the non-shared for multi-domain sentiment classification

Cited by: 8
Authors
Yuan, Jianhua [1]
Zhao, Yanyan [1]
Qin, Bing [1, 2]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin 150001, Peoples R China
[2] Pengcheng Lab, Shenzhen 518066, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; Sentiment analysis; Cross domain; Masking;
DOI
10.1007/s13042-022-01556-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multi-domain sentiment classification deals with the scenario where labeled data exists for multiple domains but is insufficient for training effective sentiment classifiers that work across domains. Thus, fully exploiting sentiment knowledge shared across domains is crucial for real-world applications. While many existing works try to extract domain-invariant features in high-dimensional space, such models fail to explicitly distinguish between shared and private features at the text level, which limits their interpretability to some extent. Based on the assumption that removing domain-related tokens from texts helps improve their domain invariance, we instead first transform original sentences to be domain-agnostic. To this end, we propose the BERTMasker model, which explicitly masks domain-related words from texts, learns domain-invariant sentiment features from these domain-agnostic texts, and uses the masked words to form domain-aware sentence representations. Experiments on the benchmark multi-domain sentiment classification datasets demonstrate the effectiveness of our proposed model, which improves accuracy in the multi-domain and cross-domain settings by 1.91% and 3.31%, respectively. Further analysis of masking shows that removing domain-related, sentiment-irrelevant tokens decreases the domain separability of texts, degrading the performance of a BERT-based domain classifier by over 12%.
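The masking idea in the abstract can be illustrated with a minimal sketch. This is not the BERTMasker implementation: the abstract does not say how domain-related words are identified, so the per-token `domain_scores` dictionary, the `threshold` value, and the function name `mask_domain_tokens` are all assumptions made for illustration. The sketch only shows the two outputs the abstract mentions, a domain-agnostic text and the set of masked words.

```python
from typing import Dict, List, Tuple

MASK_TOKEN = "[MASK]"  # BERT-style mask placeholder


def mask_domain_tokens(
    tokens: List[str],
    domain_scores: Dict[str, float],  # hypothetical: token -> domain-relatedness in [0, 1]
    threshold: float = 0.5,           # hypothetical cut-off, not from the paper
) -> Tuple[List[str], List[str]]:
    """Replace tokens scored as domain-related with [MASK].

    Returns the masked (domain-agnostic) token list and the list of
    removed tokens, which could feed a separate domain-aware representation.
    """
    masked: List[str] = []
    removed: List[str] = []
    for tok in tokens:
        if domain_scores.get(tok.lower(), 0.0) > threshold:
            masked.append(MASK_TOKEN)
            removed.append(tok)
        else:
            masked.append(tok)
    return masked, removed


# Hand-written scores stand in for whatever a trained model would produce.
tokens = "the battery life of this laptop is great".split()
scores = {"battery": 0.9, "laptop": 0.95, "great": 0.05}
masked, removed = mask_domain_tokens(tokens, scores)
```

In this toy example the sentiment word "great" survives while the electronics-specific words are masked, which is the intuition behind making the text domain-agnostic before sentiment features are learned.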
Pages: 2711-2724
Page count: 14