Privacy-Preserving Speaker Verification and Speech Recognition

Cited by: 0

Authors:

Abbasi, Wisam [1,2]

Affiliations:

[1] CNR, Ist Informat & Telemat, Pisa, Italy

[2] Univ Pisa, Dept Comp Sci, Pisa, Italy

Source:

EMERGING TECHNOLOGIES FOR AUTHORIZATION AND AUTHENTICATION, ETAA 2022 | 2023 / Vol. 13782

Keywords:

Authentication; Data privacy; Privacy-preserving data analysis; Speaker verification; Speech recognition; CONVOLUTIONAL NEURAL-NETWORKS; IDENTIFICATION;

DOI:

10.1007/978-3-031-25467-3_7

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper proposes an approach to speaker verification and speech recognition for environments that require authentication and privacy protection while keeping accuracy and data utility high. Our methodology protects audio files and users' identities through encryption and hashing algorithms while still providing accurate prediction of the speaker's identity. For speech recognition, we introduce a mechanism that anonymizes the transcript of the recognized speech using Named Entity Recognition (NER), removing sensitive entities from the text according to the user's preferences. A privacy-preserving version of the original audio is then obtained by text-to-speech conversion of the anonymized transcript; the anonymized audio and transcript can together be transmitted to third parties or service providers without violating privacy restrictions. The proposed methodology has been validated with a set of experiments on a well-known audio dataset, LibriSpeech. ECAPA-TDNN, a type of Time Delay Neural Network, was used for speaker verification; Deep Speech, a type of Recurrent Neural Network, for speech recognition; NER for entity recognition; and cryptography and hashing for privacy protection. The results demonstrate the validity of our approach to protecting the privacy of user data and biometric information while performing data analysis with a high degree of accuracy, with results similar to those obtained with no privacy mechanisms in place, including when several privacy mechanisms are used.
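
As an illustration of two steps the abstract mentions, hashing user identities and removing sensitive entities from a transcript, the following is a minimal sketch and not the paper's implementation. The abstract does not name the hashing algorithm or the NER tool, so HMAC-SHA-256 and the `(start, end, label)` span format are assumptions, and the function names `pseudonymize_speaker` and `redact_entities` are hypothetical. The sketch also replaces each removed entity with a label placeholder, which is one possible reading of "removing"; audio encryption and text-to-speech are left out.

```python
import hashlib
import hmac


def pseudonymize_speaker(speaker_id: str, secret_key: bytes) -> str:
    """Map a speaker identifier to a keyed SHA-256 digest (assumed scheme)."""
    return hmac.new(secret_key, speaker_id.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_entities(transcript: str, entities, labels_to_remove) -> str:
    """Replace selected entity spans with a [LABEL] placeholder.

    `entities` is a list of (start, end, label) character spans, the kind of
    output an NER model could provide (hypothetical format). `labels_to_remove`
    stands in for the user's privacy preferences.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(entities):
        if label not in labels_to_remove or start < cursor:
            continue  # keep non-sensitive entities; skip overlapping spans
        pieces.append(transcript[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(transcript[cursor:])
    return "".join(pieces)


# Illustrative data only; the spans are written by hand, not produced by a model.
text = "Alice met Bob in Pisa"
spans = [(0, 5, "PERSON"), (10, 13, "PERSON"), (17, 21, "GPE")]
anonymized = redact_entities(text, spans, {"PERSON"})
speaker_token = pseudonymize_speaker("speaker-0001", b"demo-key")
```

In this sketch only the entity types listed in `labels_to_remove` are redacted, so a user who considers locations sensitive would add `"GPE"` to that set.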


Pages: 102-119

Number of pages: 18
