Privacy-Preserving Speaker Verification and Speech Recognition

Cited by: 0
Author
Abbasi, Wisam [1, 2]
Affiliations
[1] CNR, Ist Informat & Telemat, Pisa, Italy
[2] Univ Pisa, Dept Comp Sci, Pisa, Italy
Keywords
Authentication; Data privacy; Privacy-preserving data analysis; Speaker verification; Speech recognition; Convolutional neural networks; Identification
DOI
10.1007/978-3-031-25467-3_7
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper proposes an approach to speaker verification and speech recognition for environments that require authentication and privacy protection while keeping accuracy and data utility high. The methodology protects audio files and users' identities with encryption and hashing algorithms while still predicting the speaker's identity accurately. For speech recognition, we introduce a mechanism that anonymizes the transcript of the recognized speech: Named Entity Recognition (NER) identifies sensitive entities, which are removed from the text according to the user's preferences. A privacy-preserving version of the original audio is then obtained by text-to-speech conversion of the anonymized transcript. The anonymized audio and transcript can together be transmitted to third parties or service providers without violating privacy restrictions. The methodology has been validated with a set of experiments on the well-known LibriSpeech dataset. ECAPA-TDNN, a type of Time Delay Neural Network, was used for speaker verification; Deep Speech, a type of Recurrent Neural Network, for speech recognition; NER for entity recognition; and cryptography and hashing for privacy protection. The results demonstrate that the approach protects the privacy of user data and biometric information while performing data analysis with high accuracy, yielding results similar to those obtained with no privacy mechanisms in place, also when several privacy mechanisms are considered.
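The two privacy steps named in the abstract, hashing of user identities and removal of sensitive entities from the transcript, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the keyed SHA-256 (HMAC) construction, the `[REDACTED]` placeholder, and the `(start, end, label)` span format are all assumptions made for illustration, and the entity spans are taken as given rather than produced by an actual NER model.

```python
import hashlib
import hmac


def pseudonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) standing in for the raw user ID.

    Assumption: a keyed hash is used; the abstract only says "hashing".
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_entities(transcript, entities, sensitive_labels, placeholder="[REDACTED]"):
    """Replace the spans whose label the user marked as sensitive.

    `entities` is a list of (start, end, label) character spans, assumed to
    come from some NER model. Spans with other labels are left untouched.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(entities):
        if label not in sensitive_labels:
            continue
        if start < cursor:
            # Skip a span that overlaps one already redacted.
            continue
        pieces.append(transcript[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(transcript[cursor:])
    return "".join(pieces)


# Hypothetical transcript and hand-written entity spans.
transcript = "alice smith called from pisa on monday"
entities = [(0, 11, "PERSON"), (24, 28, "GPE"), (32, 38, "DATE")]

# The user chooses to hide names and places but keep dates.
anonymized = redact_entities(transcript, entities, {"PERSON", "GPE"})
user_token = pseudonymize_user_id("user-0001", b"hypothetical-secret-key")
```

Passing the sensitive labels as a set mirrors the abstract's point that removal follows the user's preferences: the same transcript yields different anonymized text depending on which entity types the user selects.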
Pages: 102-119
Page count: 18