Representation and Labeling Gap Bridging for Cross-lingual Named Entity Recognition

Cited by: 0
Authors
Zhang, Xinghua [1, 2]
Yu, Bowen [3]
Cao, Jiangxia [1, 2]
Li, Quangang [1]
Wang, Xuebin [1]
Liu, Tingwen [1]
Xu, Hongbo [1]
Affiliations
[1] Chinese Academy of Sciences, Institute of Information Engineering, Beijing, People's Republic of China
[2] University of Chinese Academy of Sciences, School of Cyber Security, Beijing, People's Republic of China
[3] Alibaba Group, DAMO Academy, Hangzhou, People's Republic of China
Keywords
Low Resource; Cross-lingual Transfer; Knowledge Acquisition
DOI
10.1145/3539618.3591757
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Cross-lingual Named Entity Recognition (NER) aims to address the challenge of data scarcity in low-resource languages by leveraging knowledge from high-resource languages. Most current work relies on general multilingual language models to represent text, and then uses classic combined tagging (e.g., B-ORG) to annotate entities. However, this approach neglects the lack of cross-lingual alignment of entity representations in language models, and also ignores the fact that entity spans and types have varying levels of labeling difficulty in terms of transferability. To address these challenges, we propose a novel framework, referred to as DLBri, which addresses the issues of representation and labeling simultaneously. Specifically, the proposed framework utilizes progressive contrastive learning with source-to-target oriented sentence pairs to pre-finetune the language model, resulting in improved cross-lingual entity-aware representations. Additionally, a decomposition-then-combination procedure is proposed, which separately transfers entity span and type, and then combines their information, to reduce the difficulty of cross-lingual entity labeling. Extensive experiments on 13 diverse language pairs confirm the effectiveness of DLBri. The code for this framework is available at https://github.com/AIRobotZhang/DLBri.
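The contrast the abstract draws between combined tagging (e.g., B-ORG) and separate span and type labels can be illustrated with a minimal sketch. This is not the DLBri implementation: the function names `decompose` and `combine` are hypothetical, a BIO tagging scheme is assumed, and the sketch covers only the label bookkeeping, not the models that transfer spans and types across languages.

```python
from typing import List, Tuple


def decompose(tags: List[str]) -> Tuple[List[str], List[str]]:
    """Split combined BIO tags (e.g. "B-ORG") into span tags and type tags.

    Hypothetical helper for illustration; assumes tags are "O" or "<B|I>-<TYPE>".
    """
    span_tags: List[str] = []
    type_tags: List[str] = []
    for tag in tags:
        if tag == "O":
            span_tags.append("O")
            type_tags.append("O")
        else:
            prefix, entity_type = tag.split("-", 1)
            span_tags.append(prefix)       # boundary information only: B or I
            type_tags.append(entity_type)  # category information only: ORG, PER, ...
    return span_tags, type_tags


def combine(span_tags: List[str], type_tags: List[str]) -> List[str]:
    """Merge span tags and type tags back into combined BIO tags."""
    combined: List[str] = []
    for span, entity_type in zip(span_tags, type_tags):
        if span == "O" or entity_type == "O":
            combined.append("O")
        else:
            combined.append(f"{span}-{entity_type}")
    return combined


tags = ["B-ORG", "I-ORG", "O", "B-PER"]
spans, types = decompose(tags)
print(spans)                  # ['B', 'I', 'O', 'B']
print(types)                  # ['ORG', 'ORG', 'O', 'PER']
print(combine(spans, types))  # ['B-ORG', 'I-ORG', 'O', 'B-PER']
```

Keeping the two label sequences separate means a span predictor and a type predictor could each be trained or transferred on its own sequence before the outputs are merged, which is the general idea behind a decomposition-then-combination setup.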
Pages: 1230-1240
Number of pages: 11