X-vector anonymization using autoencoders and adversarial training for preserving speech privacy

Cited by: 15

Authors:

Perero-Codosero, Juan M. [1,2]

Espinoza-Cuadros, Fernando M. [1,2]

Hernandez-Gomez, Luis A. [2]

Affiliations:

[1] Sigma Technol SLU, Madrid, Spain

[2] Univ Politecn Madrid, GAPS Signal Proc Applicat Grp, Madrid, Spain

Source:

COMPUTER SPEECH AND LANGUAGE | 2022 / Vol. 74

Keywords:

Speaker anonymization; Adversarial training; Autoencoders; Adversarial neural networks; Automatic speech recognition; Automatic speaker verification;

DOI:

10.1016/j.csl.2022.101351

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The rapid growth of web services and mobile apps that collect personal data from users has increased the risk that user privacy may be severely compromised. In particular, the increasing variety of spoken language interfaces and voice assistants, enabled by rapid breakthroughs in deep learning, has prompted important concerns in the European Union about preserving the privacy of speech data. For instance, an attacker can record speech from users and impersonate them to obtain access to systems that require voice identification. An attacker can also use existing technology to profile users by extracting speaker, linguistic (e.g., dialect), and paralinguistic (e.g., age) features from their speech signal. To mitigate these weaknesses, in this study, we present a speech anonymization method based on autoencoders and adversarial training. Given an utterance, we first extract an x-vector, which is a powerful utterance-level embedding used in state-of-the-art speaker recognition. This original x-vector is transformed by an autoencoder into a new x-vector in which speaker, gender, and accent information are suppressed through adversarial training. The anonymized speech is finally generated by a neural speech synthesizer driven by the anonymized x-vector, the fundamental frequency, and the phoneme information extracted from the input speech. For the evaluation, we followed the VoicePrivacy Challenge framework, in which anonymization (privacy) is measured using automatic speaker verification and the preservation of intelligibility is evaluated through automatic speech recognition. Our experimental results show that the proposed method achieves higher privacy than the VoicePrivacy baseline (i.e., a higher speaker verification error) while preserving similar intelligibility of the spoken content (i.e., a similar word error rate).
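
The two evaluation quantities mentioned in the abstract, speaker verification error and word error rate, can be illustrated with a minimal pure-Python sketch. This is not the VoicePrivacy scoring code: the function names `word_error_rate` and `equal_error_rate` are hypothetical, the speaker verification error is assumed here to be summarized as an equal error rate (EER), and the EER is approximated by a simple threshold sweep over the observed scores.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the words seen so far in ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate EER: the error rate at the threshold where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest."""
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # A trial is accepted when its score is >= threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

Under these assumptions, a successful anonymizer pushes the EER of an attacker's verification system up (same-speaker and different-speaker scores become harder to separate) while leaving the WER of the recognizer on anonymized speech close to its value on the original speech.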


Pages: 13

50 items in total

[1] Privacy and Utility of X-Vector Based Speaker Anonymization
Srivastava, Brij Mohan Lal
Maouche, Mohamed
Sahidullah, Md
Vincent, Emmanuel
Bellet, Aurelien
Tommasi, Marc
Tomashenko, Natalia
Wang, Xin
Yamagishi, Junichi
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2383 - 2395
[2] Voice Privacy Through x-vector and CycleGAN-based Anonymization
Prajapati, Gauri P.
Singh, Dipesh K.
Amin, Preet P.
Patil, Hemant A.
[J]. INTERSPEECH 2021, 2021, : 1684 - 1688
[3] Design Choices for X-vector Based Speaker Anonymization
Srivastava, Brij Mohan Lal
Tomashenko, N.
Wang, Xin
Vincent, Emmanuel
Yamagishi, Junichi
Maouche, Mohamed
Bellet, Aurelien
Tommasi, Marc
[J]. INTERSPEECH 2020, 2020, : 1713 - 1717
[4] Speaker anonymization by modifying fundamental frequency and x-vector singular value
Mawalim, Candy Olivia
Galajit, Kasorn
Karnjana, Jessada
Kidani, Shunsuke
Unoki, Masashi
[J]. COMPUTER SPEECH AND LANGUAGE, 2022, 73
[5] Encrypted Semantic Communication Using Adversarial Training for Privacy Preserving
Luo, Xinlai
Chen, Zhiyong
Tao, Meixia
Yang, Feng
[J]. IEEE COMMUNICATIONS LETTERS, 2023, 27 (06) : 1486 - 1490
[6] Learning Privacy Preserving Encodings through Adversarial Training
Pittaluga, Francesco
Koppal, Sanjeev J.
Chakrabarti, Ayan
[J]. 2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 791 - 799
[7] Speech Replay Detection with x-Vector Attack Embeddings and Spectral Features
Williams, Jennifer
Rownicka, Joanna
[J]. INTERSPEECH 2019, 2019, : 1053 - 1057
[8] Privacy-preserving record linkage using autoencoders
Christen, Victor
Haentschel, Tim
Christen, Peter
Rahm, Erhard
[J]. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2023, 15 (04) : 347 - 357
[9] Privacy-preserving record linkage using autoencoders
Victor Christen
Tim Häntschel
Peter Christen
Erhard Rahm
[J]. International Journal of Data Science and Analytics, 2023, 15 : 347 - 357
[10] Anonymization: The imperfect science of using data while preserving privacy
Gadotti, Andrea
Rocher, Luc
Houssiau, Florimond
Cretu, Ana-Maria
de Montjoye, Yves-Alexandre
[J]. SCIENCE ADVANCES, 2024, 10 (29):
