Sakura at SemEval-2023 Task 2: Data Augmentation via Translation

Cited by: 0
Authors
Poncelas, Alberto [1]
Tkachenko, Maksim [1]
Htun, Ohnmar [1]
Affiliations
[1] Rakuten Grp Inc, Rakuten Inst Technol, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using machine translation. The named entity spans from the original sentences are transferred to the translations via word alignment and then filtered with the baseline recognizer to retain high-quality annotations. The proposed data augmentation approach improves the baseline performance of XLM-RoBERTa on the multilingual dataset.
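The two steps named in the abstract, span transfer via word alignment and filtering with the baseline recognizer, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `project_spans` and `keep_sentence`, the token-index span format, the contiguity check, and the exact-match filtering rule are all assumptions made for illustration, since the abstract does not specify them. The alignment is assumed to arrive as (source index, target index) pairs from some external word aligner.

```python
from typing import List, Tuple

# Hypothetical span format: (start token, end token exclusive, label).
Span = Tuple[int, int, str]
# Word alignment as (source token index, target token index) pairs.
Alignment = List[Tuple[int, int]]


def project_spans(spans: List[Span], alignment: Alignment) -> List[Span]:
    """Transfer entity spans from a source sentence to its translation."""
    projected = []
    for start, end, label in spans:
        # Target tokens aligned to any source token inside the span.
        targets = sorted({t for s, t in alignment if start <= s < end})
        if not targets:
            continue  # entity has no aligned tokens: drop it
        # Assumption: keep only spans that stay contiguous in the target.
        if targets[-1] - targets[0] + 1 != len(targets):
            continue
        projected.append((targets[0], targets[-1] + 1, label))
    return projected


def keep_sentence(projected: List[Span], predicted: List[Span]) -> bool:
    """Assumed filter: keep the translated sentence only when the baseline
    recognizer predicts exactly the projected spans."""
    return set(projected) == set(predicted)


# "the White House" -> "la Casa Blanca": the two entity words swap order.
alignment = [(0, 0), (1, 2), (2, 1)]
projected = project_spans([(1, 3, "LOC")], alignment)
print(projected)
```

In this toy example the entity covers source tokens 1 and 2, which align to target tokens 2 and 1, so the projected span covers target tokens 1 to 2. A real pipeline would obtain `predicted` by running the baseline recognizer on the translated sentence; other filtering criteria, such as a confidence threshold, are equally compatible with the abstract.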
Pages: 1718-1722
Page count: 5