Voice Privacy Through Time-Scale and Pitch Modification

Cited by: 0
Authors
Prajapati, Gauri P. [1]
Singh, Dipesh K. [1]
Patil, Hemant A. [1]
Affiliation
[1] Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, Gujarat, India
Keywords
Voice privacy; speech perturbation; anonymization; speaker
DOI
10.1007/978-3-031-12700-7_8
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An attacker can fraudulently gain access in place of the genuine user if the user's speech data is left unprotected. It is therefore important to protect users' speech data, for which a voice privacy system can be employed. A voice privacy system is not designed around any particular kind of attack; instead, it is designed in a generalized way, making it a universal system. This study presents time-scale and pitch modification-based anonymization methods that modify speaker-dependent speech parameters (i.e., the fundamental frequency, F0) for better privacy preservation of speech data. The performance of the proposed methods is compared with the signal processing-based baseline system of the INTERSPEECH 2020 Voice Privacy Challenge. The authors evaluated several perturbation methods and concluded that speed perturbation with a factor of 0.8 is the better choice for obtaining adequate speaker anonymization (38.5% Equal Error Rate (EER) and 91.3% De-IDentification (DeID)) together with acceptable speech intelligibility (4.86% Word Error Rate (WER)) for female speakers. Speed and pitch perturbation are observed to be two important candidates for anonymization, whereas tempo perturbation is not found to be as useful for speaker anonymization.
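To make the distinction between the perturbation types concrete, the following is a minimal sketch of speed perturbation by plain resampling, which changes duration and pitch together. It is an illustration only and not the authors' implementation, which the abstract does not describe; the function names `speed_perturb` and `zero_crossing_freq`, the linear interpolation, and the synthetic 200 Hz test tone are all assumptions made for this example.

```python
import math


def speed_perturb(samples, factor):
    """Resample `samples` so that playback at the original sample rate
    is `factor` times as fast (hypothetical helper, linear interpolation).

    A factor below 1 stretches the signal and lowers its pitch;
    a factor above 1 shortens it and raises its pitch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = int(len(samples) / factor)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * factor              # fractional read position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def zero_crossing_freq(samples, sample_rate):
    """Rough frequency estimate for a clean tone: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


if __name__ == "__main__":
    sample_rate = 16000
    # One second of a 200 Hz sine as a stand-in for a voiced sound with F0 = 200 Hz.
    tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(sample_rate)]
    slowed = speed_perturb(tone, 0.8)
    print("duration ratio:", len(slowed) / len(tone))
    print("estimated F0 before:", zero_crossing_freq(tone, sample_rate))
    print("estimated F0 after:", zero_crossing_freq(slowed, sample_rate))
```

With a factor of 0.8 the output is about 1.25 times as long and the tone drops from roughly 200 Hz to roughly 160 Hz, so F0 and speaking rate are altered together. Tempo perturbation (duration changed, pitch kept) and pitch perturbation (pitch changed, duration kept) cannot be obtained by resampling alone; they require a time-scale modification algorithm such as WSOLA or a phase vocoder, which this sketch does not attempt.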
Pages: 72-80
Page count: 9