Transfer Joint Embedding for Cross-Domain Named Entity Recognition

Cited by: 26
Authors
Pan, Sinno Jialin [1]
Toh, Zhiqiang [1]
Su, Jian [1]
Affiliations
[1] Inst Infocomm Res, Data Analyt Dept, Singapore 138632, Singapore
Keywords
Algorithms; Experimentation; Named entity recognition; Transfer learning; Multiclass classification
DOI
10.1145/2457465.2457467
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Named Entity Recognition (NER) is a fundamental task in information extraction from unstructured text. Most previous machine-learning-based NER systems are domain-specific: they may perform well on some specific domains (e.g., Newswire) but tend to adapt poorly to other related but different domains (e.g., Weblog). Recently, transfer learning techniques have been applied to NER. However, most transfer learning approaches to NER are developed for binary classification, while NER is by nature a multiclass classification problem. Therefore, one has to first reduce the NER task to multiple binary classification tasks and solve them independently. In this article, we propose a new transfer learning method, named Transfer Joint Embedding (TJE), for cross-domain multiclass classification, which can fully exploit the relationships between classes (labels) and reduce the domain difference in data distributions for transfer learning. More specifically, we aim to embed both labels (outputs) and high-dimensional features (inputs) from different domains (e.g., a source domain and a target domain) into a unified low-dimensional latent space, where (1) each label is represented by a prototype and the intrinsic relationships between labels can be measured by Euclidean distance; (2) the distance in data distributions between the source and target domains can be reduced; and (3) the source domain labeled data are closer to their corresponding label prototypes than to the prototypes of other labels. After the latent space is learned, classification on the target domain data can be done with the simple nearest neighbor rule in the latent space. Furthermore, in order to scale up TJE, we propose an efficient algorithm based on stochastic gradient descent (SGD). Finally, we apply the proposed TJE method to NER across different domains on the ACE 2005 dataset, which is a benchmark in Natural Language Processing (NLP). Experimental results demonstrate the effectiveness of TJE and show that TJE can outperform state-of-the-art transfer learning approaches to NER.
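The nearest-prototype idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the linear projection `W`, the margin-based hinge loss, the function names, and the toy values are all assumptions made for illustration. The term that reduces the distance between source and target data distributions is omitted entirely, so the sketch covers only points (1) and (3) of the abstract, the nearest neighbor rule, and a single SGD update.

```python
def embed(W, x):
    """Project a feature vector x into the latent space with matrix W (d rows)."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(W, prototypes, x):
    """Nearest neighbor rule: pick the label whose prototype is closest to W x."""
    z = embed(W, x)
    return min(prototypes, key=lambda label: sq_dist(z, prototypes[label]))


def hinge_loss(W, prototypes, x, y, margin=1.0):
    """Assumed loss: the true prototype should be closer than any other by `margin`."""
    z = embed(W, x)
    d_true = sq_dist(z, prototypes[y])
    return sum(max(0.0, margin + d_true - sq_dist(z, p))
               for label, p in prototypes.items() if label != y)


def sgd_step(W, prototypes, x, y, lr=0.05, margin=1.0):
    """One in-place SGD update of W and the prototypes on a single labeled example."""
    z = embed(W, x)
    p_true = prototypes[y]
    d_true = sq_dist(z, p_true)
    grad_z = [0.0] * len(z)
    grad_p = {label: [0.0] * len(z) for label in prototypes}
    for label, p in prototypes.items():
        # Only labels that violate the margin contribute to the gradient.
        if label == y or margin + d_true - sq_dist(z, p) <= 0:
            continue
        for k in range(len(z)):
            grad_z[k] += 2 * (p[k] - p_true[k])
            grad_p[y][k] += -2 * (z[k] - p_true[k])
            grad_p[label][k] += 2 * (z[k] - p[k])
    # Chain rule: d(loss)/dW[k][j] = grad_z[k] * x[j].
    for k, row in enumerate(W):
        for j in range(len(row)):
            row[j] -= lr * grad_z[k] * x[j]
    for label, p in prototypes.items():
        for k in range(len(p)):
            p[k] -= lr * grad_p[label][k]


# Toy example (hypothetical values): two labels, two features, 2-D latent space.
W = [[0.0, 0.0], [0.0, 0.0]]
prototypes = {"PER": [1.0, 0.0], "ORG": [-1.0, 0.0]}
x, y = [1.0, 0.0], "PER"

loss_before = hinge_loss(W, prototypes, x, y)
sgd_step(W, prototypes, x, y)
loss_after = hinge_loss(W, prototypes, x, y)
```

In this toy setting the single update moves the embedded example toward the "PER" prototype and away from "ORG", so `loss_after` is smaller than `loss_before` and `predict(W, prototypes, x)` selects "PER". A fuller treatment would loop over many source-domain examples and add the domain-distance term, whose exact form is not given in the abstract.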
Pages: 27