X-vector anonymization using autoencoders and adversarial training for preserving speech privacy

Cited by: 15
Authors
Perero-Codosero, Juan M. [1, 2]
Espinoza-Cuadros, Fernando M. [1, 2]
Hernandez-Gomez, Luis A. [2 ]
Affiliations
[1] Sigma Technol SLU, Madrid, Spain
[2] Univ Politecn Madrid, GAPS Signal Proc Applicat Grp, Madrid, Spain
Keywords
Speaker anonymization; Adversarial training; Autoencoders; Adversarial neural networks; Automatic speech recognition; Automatic speaker verification;
DOI
10.1016/j.csl.2022.101351
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid growth of web services and mobile apps that collect personal data has increased the risk that users' privacy may be severely compromised. In particular, the increasing variety of spoken language interfaces and voice assistants, empowered by rapid breakthroughs in deep learning, has prompted important concerns in the European Union about preserving the privacy of speech data. For instance, an attacker can record speech from users and impersonate them to obtain access to systems that require voice identification. By extracting speaker, linguistic (e.g., dialect), and paralinguistic (e.g., age) features from a speech signal, an attacker can also build speaker profiles of users with existing technology. To mitigate these weaknesses, in this study, we present a speech anonymization method based on autoencoders and adversarial training. Given an utterance, we first extract an x-vector, which is a powerful utterance-level embedding used in state-of-the-art speaker recognition. This original x-vector is transformed by an autoencoder into a new x-vector in which speaker, gender, and accent information are suppressed through adversarial training. The anonymized speech is finally generated by a neural speech synthesizer driven by the anonymized x-vector together with the fundamental frequency and phoneme information extracted from the input speech. For the evaluation, we followed the VoicePrivacy Challenge framework, in which anonymization (privacy) is measured using automatic speaker verification and the preservation of intelligibility is evaluated through automatic speech recognition. Our experimental results show that the proposed method achieves higher privacy than the VoicePrivacy baseline (i.e., a higher speaker verification error) while preserving a similar intelligibility of the spoken content (i.e., a similar word error rate).
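The two evaluation quantities named in the abstract can be illustrated with a minimal sketch. It assumes that the speaker verification error is summarized as an equal error rate (EER); the abstract does not name the exact metric, so this is an assumption. The function names `word_error_rate` and `equal_error_rate` and the toy scores are hypothetical and are not taken from the paper or from the VoicePrivacy toolkit.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            )
        prev = curr
    return prev[-1] / len(ref)


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate EER by sweeping a threshold over the observed scores.

    A trial is accepted when its score is >= the threshold. The function
    returns the mean of the false acceptance and false rejection rates at
    the threshold where the two are closest (an approximation, with no
    interpolation between thresholds).
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy data only: one dropped word, and partly overlapping ASV scores.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(equal_error_rate([0.9, 0.6, 0.4], [0.5, 0.3, 0.1]))
```

Under this reading, a successful anonymizer raises the EER obtained by the speaker verification system on anonymized speech while keeping the WER of the speech recognizer close to its value on the original speech.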
Pages: 13