Voice Privacy Through Time-Scale and Pitch Modification

Cited by: 0

Authors:

Prajapati, Gauri P. [1]

Singh, Dipesh K. [1]

Patil, Hemant A. [1]

Affiliations:

[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, Gujarat, India

Source:

PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2021 | 2024 / Vol. 13102

Keywords:

Voice privacy; speech perturbation; anonymization; SPEAKER;

DOI:

10.1007/978-3-031-12700-7_8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An attacker can fraudulently gain access in place of the genuine user if the user's speech data has been left unprotected. It is therefore important to protect users' speech data, and a voice privacy system can be employed for this purpose. A voice privacy system is not designed around any particular kind of attack; instead, it is designed in a generalized way, making it a universal system. This study presents time-scale and pitch modification-based anonymization methods that modify the speaker-dependent speech parameters (i.e., the fundamental frequency, F0) for better privacy preservation of speech data. The performance of the proposed voice privacy system is compared with the signal processing-based baseline system of the INTERSPEECH 2020 Voice Privacy Challenge. The authors evaluated various perturbation methods and conclude that speed perturbation with a factor of 0.8 is preferable for achieving adequate speaker anonymization (38.5% Equal Error Rate (EER) and 91.3% De-Identification (DeID)) together with acceptable speech intelligibility (4.86% Word Error Rate (WER)) for female speakers. Speed and pitch perturbation are observed to be two important candidates for anonymization, whereas tempo perturbation is not found to be as useful for speaker anonymization.
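To make the speed-perturbation factor in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes speed perturbation in the usual resampling sense, where a factor of 0.8 stretches the duration by 1/0.8 and scales every frequency, including F0, by 0.8. The function names, the 16 kHz sampling rate, and the 200 Hz test tone are illustrative assumptions; linear interpolation stands in for a proper resampling filter.

```python
import math


def speed_perturb(samples, factor):
    """Resample so that playback at the original rate is `factor` times as fast.

    Duration scales by 1/factor and all frequencies scale by `factor`.
    Linear interpolation is used here purely for brevity (an assumption).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(round(len(samples) / factor))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * factor                 # fractional index into the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def estimate_frequency(samples, sample_rate):
    """Rough frequency estimate from the number of upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


SAMPLE_RATE = 16000   # assumed sampling rate in Hz
TONE_F0 = 200.0       # assumed fundamental of a toy "voice" in Hz

# One second of a pure tone standing in for voiced speech.
tone = [math.sin(2 * math.pi * TONE_F0 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
slowed = speed_perturb(tone, 0.8)

print("duration before (s):", len(tone) / SAMPLE_RATE)
print("duration after  (s):", len(slowed) / SAMPLE_RATE)
print("frequency ratio:", estimate_frequency(slowed, SAMPLE_RATE)
      / estimate_frequency(tone, SAMPLE_RATE))
```

Under the same conventions, tempo perturbation changes the duration while leaving pitch in place, and pitch perturbation shifts pitch while leaving duration in place. This is one plausible reading of why the abstract reports speed and pitch perturbation as the stronger anonymization candidates.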


Pages: 72-80

Page count: 9
