Preventing Author Profiling through Zero-Shot Multilingual Back-Translation

Cited by: 0
Authors
Adelani, David Ifeoluwa [1]
Zhang, Miaoran [1]
Shen, Xiaoyu [1]
Davody, Ali [1]
Kleinbauer, Thomas [1]
Klakow, Dietrich [1]
Affiliations
[1] Saarland Univ, Spoken Language Syst Grp, Saarland Informat Campus, Saarbrucken, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, such as their gender or ethnicity. Style transfer is an effective way of transforming texts in order to remove any information that enables author profiling. However, for a number of current state-of-the-art approaches, the improved privacy is accompanied by an undesirable drop in the downstream utility of the transformed data. In this paper, we propose a simple, zero-shot way to effectively lower the risk of author profiling through multilingual back-translation using off-the-shelf translation models. We compare our models with five representative text style transfer models on three datasets across different domains. Results from both an automatic and a human evaluation show that our approach achieves the best overall performance while requiring no training data. We are able to lower the adversarial prediction of gender and race by up to 22% while retaining 95% of the original utility on downstream tasks.
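The back-translation step named in the abstract can be illustrated with a minimal sketch: a text is translated into a pivot language and then back into the source language, which yields a paraphrase that may carry fewer stylistic cues. This is not the authors' implementation. The abstract does not say which models or languages are used, so the `Translator` signature, the `back_translate` helper, the German pivot, and the word-table `toy_translate` stand-in are all assumptions made for illustration. In practice `toy_translate` would be replaced by a call to an off-the-shelf translation model.

```python
from typing import Callable

# Hypothetical signature (an assumption): (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, translate: Translator,
                   src: str = "en", pivot: str = "de") -> str:
    """Round-trip `text` through a pivot language and return the result."""
    pivoted = translate(text, src, pivot)   # source -> pivot
    return translate(pivoted, pivot, src)   # pivot -> source


def toy_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real translation model (illustration only).

    Looks each word up in a tiny table; unknown words pass through unchanged.
    """
    table = {
        ("en", "de"): {"awesome": "toll", "great": "toll", "movie": "film"},
        ("de", "en"): {"toll": "great", "film": "movie"},
    }
    mapping = table[(source, target)]
    return " ".join(mapping.get(word, word) for word in text.split())


if __name__ == "__main__":
    # Two different word choices collapse to the same output after the round trip.
    print(back_translate("awesome movie", toy_translate))
    print(back_translate("great movie", toy_translate))
```

In this toy setting both inputs come back as the same sentence, which mirrors the intuition that a round trip through another language can normalize individual word choices while keeping the content.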
Pages: 8687-8695
Page count: 9