Cross-Lingual Named Entity Recognition for Heterogenous Languages

Cited by: 1
Authors
Fu, Yingwen [1]
Lin, Nankai [2]
Chen, Boyu [3]
Yang, Ziyu [1]
Jiang, Shengyi [1, 4]
Affiliations
[1] Guangdong Univ Foreign Studies, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[2] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[3] UCL, Inst Hlth Informat, London WC1E 6BT, England
[4] Guangdong Univ Foreign Studies, Guangzhou Key Lab Multilingual Intelligent Proc, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Training; Data models; Standards; Speech processing; Optimization; Knowledge transfer; Information science; Cross-lingual named entity recognition; heterogenous language; weakly supervised learning; bilateral-branch network; self-distillation
DOI
10.1109/TASLP.2022.3212698
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Previous work on cross-lingual Named Entity Recognition (NER) has achieved great success. However, little of it considers the effect of the language families of the source and target languages. In this study, we find that the cross-lingual NER performance on a target language decreases when its source language is changed from one in the same language family (homogenous) to one in a different language family (heterogenous). To improve NER performance in this situation, we propose a novel cross-lingual NER framework based on a self-distillation mechanism and a Bilateral-Branch Network (SD-BBN). SD-BBN learns source-language NER knowledge from supervised datasets and obtains target-language knowledge from weakly supervised datasets. These two kinds of knowledge are then fused through the self-distillation mechanism to better identify entities in the target language. We evaluate SD-BBN on 9 language datasets from 4 different language families. Results show that SD-BBN tends to outperform baseline methods. Notably, SD-BBN achieves a greater boost when the target and source languages are heterogenous. Our results suggest that obtaining language-specific knowledge from the target language is essential for improving cross-lingual NER when the source and target languages are heterogenous. This finding could provide a novel insight for further research.
Pages: 371-382
Page count: 12