Representation and Labeling Gap Bridging for Cross-lingual Named Entity Recognition

被引:0
|
作者
Zhang, Xinghua [1 ,2 ]
Yu, Bowen [3 ]
Cao, Jiangxia [1 ,2 ]
Li, Quangang [1 ]
Wang, Xuebin [1 ]
Liu, Tingwen [1 ]
Xu, Hongbo [1 ]
机构
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China
关键词
Low Resource; Cross-lingual Transfer; Knowledge Acquisition;
D O I
10.1145/3539618.3591757
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Cross-lingual Named Entity Recognition (NER) aims to address the challenge of data scarcity in low-resource languages by leveraging knowledge from high-resource languages. Most current work relies on general multilingual language models to represent text, and then uses classic combined tagging (e.g., B-ORG) to annotate entities; However, this approach neglects the lack of cross-lingual alignment of entity representations in language models, and also ignores the fact that entity spans and types have varying levels of labeling difficulty in terms of transferability. To address these challenges, we propose a novel framework, referred to as DLBri, which addresses the issues of representation and labeling simultaneously. Specifically, the proposed framework utilizes progressive contrastive learning with source-to-target oriented sentence pairs to pre-finetune the language model, resulting in improved cross-lingual entity-aware representations. Additionally, a decomposition-then-combination procedure is proposed, which separately transfers entity span and type, and then combines their information, to reduce the difficulty of cross-lingual entity labeling. Extensive experiments on 13 diverse language pairs confirm the effectiveness of DLBri. The code for this framework is available at https://github.com/AIRobotZhang/DLBri.
引用
下载
收藏
页码:1230 / 1240
页数:11
相关论文
共 50 条
  • [1] Cross-lingual Named Entity Recognition
    Steinberger, Ralf
    Pouliquen, Bruno
    LINGUISTICAE INVESTIGATIONES, 2007, 30 (01): : 135 - 162
  • [2] WASSERSTEIN CROSS-LINGUAL ALIGNMENT FOR NAMED ENTITY RECOGNITION
    Wang, Rui
    Henao, Ricardo
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8342 - 8346
  • [3] Cross-Lingual Named Entity Recognition for Heterogenous Languages
    Fu, Yingwen
    Lin, Nankai
    Chen, Boyu
    Yang, Ziyu
    Jiang, Shengyi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 371 - 382
  • [4] Neural Cross-Lingual Named Entity Recognition with Minimal Resources
    Xie, Jiateng
    Yang, Zhilin
    Neubig, Graham
    Smith, Noah A.
    Carbonell, Jaime
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 369 - 379
  • [5] Zero-Resource Cross-Lingual Named Entity Recognition
    Bari, M. Saiful
    Joty, Shafiq
    Jwalapuram, Prathyusha
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7415 - 7423
  • [6] Cross-lingual Transfer Learning for Japanese Named Entity Recognition
    Johnson, Andrew
    Karanasou, Penny
    Gaspers, Judith
    Klakow, Dietrich
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES(NAACL HLT 2019), VOL. 2 (INDUSTRY PAPERS), 2019, : 182 - 189
  • [7] Cross-Lingual Transfer Learning for Medical Named Entity Recognition
    Ding, Pengjie
    Wang, Lei
    Liang, Yaobo
    Lu, Wei
    Li, Linfeng
    Wang, Chun
    Tang, Buzhou
    Yan, Jun
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT I, 2020, 12112 : 403 - 418
  • [8] Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection
    Ni, Jian
    Dinu, Georgiana
    Florian, Radu
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1470 - 1480
  • [9] Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
    Liang, Shining
    Gong, Ming
    Pei, Jian
    Shou, Linjun
    Zuo, Wanli
    Zuo, Xianglin
    Jiang, Daxin
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3231 - 3239
  • [10] Cross-Lingual Named Entity Recognition Based on Attention and Adversarial Training
    Wang, Hao
    Zhou, Lekai
    Duan, Jianyong
    He, Li
    APPLIED SCIENCES-BASEL, 2023, 13 (04):