Cross-Lingual Named Entity Recognition for Heterogenous Languages

Cited by: 1
Authors
Fu, Yingwen [1 ]
Lin, Nankai [2 ]
Chen, Boyu [3 ]
Yang, Ziyu [1 ]
Jiang, Shengyi [1 ,4 ]
Affiliations
[1] Guangdong Univ Foreign Studies, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[2] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[3] UCL, Inst Hlth Informat, London WC1E 6BT, England
[4] Guangdong Univ Foreign Studies, Guangzhou Key Lab Multilingual Intelligent Proc, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Training; Data models; Standards; Speech processing; Optimization; Knowledge transfer; Information science; Cross-lingual named entity recognition; heterogenous language; weakly supervised learning; bilateral-branch network; self-distillation;
DOI
10.1109/TASLP.2022.3212698
Chinese Library Classification
O42 [Acoustics]
Discipline Classification Codes
070206; 082403
Abstract
Previous works on cross-lingual Named Entity Recognition (NER) have achieved great success. However, few of them consider the effect of the language-family relationship between the source and target languages. In this study, we find that the cross-lingual NER performance on a target language decreases when its source language is changed from one in the same language family (homogenous) to one in a different family (heterogenous). To improve NER performance in this situation, we propose a novel cross-lingual NER framework based on a self-distillation mechanism and a Bilateral-Branch Network (SD-BBN). SD-BBN learns source-language NER knowledge from supervised datasets and obtains target-language knowledge from weakly supervised datasets. These two kinds of knowledge are then fused via self-distillation to better identify entities in the target language. We evaluate SD-BBN on 9 language datasets from 4 different language families. Results show that SD-BBN tends to outperform baseline methods. Remarkably, when the target and source languages are heterogenous, SD-BBN achieves an even greater boost. Our results suggest that obtaining language-specific knowledge from the target language is essential for improving cross-lingual NER when the source and target languages are heterogenous. This finding could provide novel insight for further research.
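The fusion step the abstract describes (combining a supervised source-language branch with a weakly supervised target-language branch, then distilling the fused distribution back into the branches) can be sketched as follows. This is an illustrative approximation only, not the authors' implementation: the fusion weight `alpha`, the distillation temperature, and the use of a simple convex combination of per-token tag logits are all hypothetical choices for the sketch.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last (tag) axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse(source_logits, target_logits, alpha=0.5):
    """Convex combination of the two branches' per-token tag logits."""
    return alpha * source_logits + (1.0 - alpha) * target_logits

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), summed over the tag dimension for each token."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def self_distillation_loss(fused_logits, branch_logits, temperature=2.0):
    """Distill the fused (teacher) tag distribution into one branch (student)."""
    teacher = softmax(fused_logits, temperature)
    student = softmax(branch_logits, temperature)
    return float(np.mean(kl_divergence(teacher, student)))

# Toy example: 3 tokens, 5 NER tag classes.
rng = np.random.default_rng(0)
src = rng.normal(size=(3, 5))   # branch trained on supervised source-language data
tgt = rng.normal(size=(3, 5))   # branch trained on weakly supervised target-language data
fused = fuse(src, tgt, alpha=0.5)
loss = self_distillation_loss(fused, tgt)  # non-negative; zero iff distributions match
```

In this sketch the fused distribution acts as the teacher for each branch, which matches the abstract's description of fusing the two kinds of knowledge "based on a self-distillation mechanism"; the actual branch encoders and training schedule are not specified here.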
Pages: 371 - 382
Page count: 12