Tuning Multilingual Transformers for Named Entity Recognition on Slavic Languages

Cited by: 0
Authors
Arkhipov, Mikhail [1]
Trofimova, Maria [1]
Kuratov, Yuri [1]
Sorokin, Alexey [1, 2]
Affiliations
[1] Moscow Inst Phys & Technol, Neural Networks & Deep Learning Lab, Moscow, Russia
[2] Moscow MV Lomonosov State Univ, Fac Math & Mech, Moscow, Russia
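Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract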
Our paper addresses multilingual named entity recognition for four languages: Russian, Bulgarian, Czech and Polish. We solve this task using the BERT model, taking a multilingual model covering a hundred languages as the base for transfer to these Slavic languages. Unsupervised pre-training of the BERT model on the four languages allows it to significantly outperform baseline neural approaches and multilingual BERT. Extending BERT with a word-level CRF layer yields a further improvement. Our system was submitted to the BSNLP 2019 Shared Task on Multilingual Named Entity Recognition and took first place in 3 of the 4 competition metrics in which we participated. We have open-sourced the NER models and the BERT model pre-trained on the four Slavic languages.
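The abstract does not describe how the word-level CRF layer is built. The following is only a minimal sketch of the general idea, not the authors' implementation: subtoken scores from an encoder are reduced to one score row per word (here, by keeping the first subtoken, which is an assumption), and the best tag sequence is then found by Viterbi decoding over a tag-transition matrix. All names (`first_subtoken_scores`, `viterbi_decode`, `word_ids`), the tag set, and every number are hypothetical and chosen purely for illustration.

```python
from typing import List, Optional, Sequence

TAGS = ["O", "B-PER", "I-PER"]  # illustrative tag set, not from the paper


def first_subtoken_scores(
    subtoken_scores: Sequence[Sequence[float]],
    word_ids: Sequence[Optional[int]],
) -> List[List[float]]:
    """Keep the score row of the first subtoken of each word.

    word_ids maps each subtoken to its word index (None for special tokens).
    """
    rows, seen = [], set()
    for row, wid in zip(subtoken_scores, word_ids):
        if wid is not None and wid not in seen:
            seen.add(wid)
            rows.append(list(row))
    return rows


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag path.

    transitions[i][j] is the score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        backpointers.append(ptrs)
        score = new_score
    best = max(range(n_tags), key=lambda i: score[i])
    path = [best]
    for ptrs in reversed(backpointers):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path


# Made-up scores for the subtokens [CLS], Yuri, Kura, ##tov, [SEP].
subtoken_scores = [
    [0.0, 0.0, 0.0],
    [2.0, 1.5, 0.0],
    [0.1, 0.2, 3.0],
    [5.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
word_ids = [None, 0, 1, 1, None]

# Hypothetical transition scores: O -> I-PER is strongly penalized.
transitions = [
    [0.0, 0.0, -1e4],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

emissions = first_subtoken_scores(subtoken_scores, word_ids)
path = viterbi_decode(emissions, transitions)
predicted_tags = [TAGS[i] for i in path]
```

In this toy input, choosing the best tag for each word independently would label the first word `O` and the second `I-PER`, which is not a valid BIO sequence. The transition penalty lets the decoder choose a consistent sequence instead.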
Pages: 89-93
Page count: 5