Voice Privacy Through Time-Scale and Pitch Modification

Cited by: 0

Authors:

Prajapati, Gauri P. [1]

Singh, Dipesh K. [1]

Patil, Hemant A. [1]

Affiliations:

[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, Gujarat, India

Source:

PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2021 | 2024 / Vol. 13102

Keywords:

Voice privacy; speech perturbation; anonymization; SPEAKER;

DOI:

10.1007/978-3-031-12700-7_8

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

If users' speech data is left unprotected, an attacker can fraudulently gain access in place of the genuine user. It is therefore important to protect users' speech data, and a voice privacy system can be employed for this purpose. Such a system is not designed against any particular kind of attack; instead, it is designed in a generalized way, making it a universal system. This study presents time-scale and pitch modification-based anonymization methods that modify speaker-dependent speech parameters (i.e., the fundamental frequency, F0) for better privacy preservation of speech data. The privacy performance of the proposed methods is compared with the signal processing-based baseline system of the INTERSPEECH 2020 Voice Privacy Challenge. The authors evaluated various perturbation methods and conclude that, for female speakers, speed perturbation with a factor of 0.8 is the better choice, giving adequate speaker anonymization (38.5% Equal Error Rate (EER) and 91.3% De-Identification (DeID)) together with acceptable speech intelligibility (4.86% Word Error Rate (WER)). Speed and pitch perturbation are observed to be two important candidates for anonymization, whereas tempo perturbation is not found to be as useful for speaker anonymization.
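To make the speed-perturbation idea concrete, the following is a minimal sketch, assuming speed perturbation is realized as plain resampling played back at the original sample rate. The abstract does not state which tool or algorithm the authors used, so the function names (`speed_perturb`, `dominant_frequency`), the linear interpolation, and the 200 Hz test tone are all illustrative assumptions. The sketch shows why a factor of 0.8 changes a speaker-dependent parameter: the signal becomes 1/0.8 = 1.25 times longer and every frequency, including F0, is scaled by 0.8.

```python
import numpy as np


def speed_perturb(signal: np.ndarray, factor: float) -> np.ndarray:
    """Resample `signal` so it plays `factor` times faster at the same sample rate.

    Hypothetical helper: uses linear interpolation, not the authors' tooling.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(round(len(signal) / factor))
    # Fractional input positions read by each output sample.
    src_pos = np.arange(n_out) * factor
    return np.interp(src_pos, np.arange(len(signal)), signal)


def dominant_frequency(signal: np.ndarray, sample_rate: int) -> float:
    """Return the frequency (Hz) of the largest magnitude-spectrum peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])


sample_rate = 16000                       # assumed sample rate
t = np.arange(sample_rate) / sample_rate  # one second of audio
tone = np.sin(2 * np.pi * 200.0 * t)      # 200 Hz tone standing in for F0

slowed = speed_perturb(tone, 0.8)
duration_ratio = len(slowed) / len(tone)
shifted_f0 = dominant_frequency(slowed, sample_rate)
```

Under these assumptions, duration and pitch change together. Tempo perturbation, by contrast, changes duration while leaving pitch intact, which is consistent with the abstract's observation that it is less useful for anonymization.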


Pages: 72-80

Page count: 9

50 items in total

[41] Real-time pitch modification system for speech and singing voice
Azarov, Elias
Vashkevich, Maxim
Likhachov, Denis
Petrovsky, Alexander
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1070 - 1071
[42] A hybrid time-frequency domain approach to audio time-scale modification
Dorran, David
Lawlor, Robert
Coyle, Eugene
AES: Journal of the Audio Engineering Society, 2006, 54 (1-2): : 21 - 31
[43] A hybrid time-frequency domain approach to audio time-scale modification
Dorran, D
Lawlor, R
Coyle, E
JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2006, 54 (1-2): : 21 - 31
[44] A Time-Scale Modification-Based Voice Changing Method with Seamless Switching and Its Real-Time Implementation on Digital Imaging Devices
Lee, Young Han
Jo, Sung Dong
Park, Ji Hun
Kim, Duk Su
Park, Nam In
Kim, Hong Kook
Kim, Ji Woon
Kim, Myeong Bo
Kim, Sang Ryong
INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2012, 15 (03): : 1303 - 1315
[45] A Real-Time Framework for Video Time and Pitch Scale Modification
Damnjanovic, Ivan
Barry, Dan
Dorran, David
Reiss, Joshua D.
IEEE TRANSACTIONS ON MULTIMEDIA, 2010, 12 (04) : 247 - 256
[46] AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE
Zhou Ying
Zhang Linghua
College of Telecommunications Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China
Journal of Electronics(China), 2011, 28(Z1) (China) : 518 - 523
[47] Time-scale modification of audio signals with combined harmonic and wavelet representations
Hamdy, KN
Tewfik, AH
Chen, T
Takagi, S
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 439 - 442
[48] Localized audio watermarking technique robust against time-scale modification
Li, W
Xue, XY
Lu, PZ
IEEE TRANSACTIONS ON MULTIMEDIA, 2006, 8 (01) : 60 - 69
[49] Shape invariant time-scale modification of speech using a harmonic model
O'Brien, Darragh
Monaghan, Alex
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1999, 1 : 381 - 384
[50] An efficient audio time-scale modification algorithm for use in a subband implementation
Dorran, D
Lawlor, R
DAFX-03: 6TH INTERNATIONAL CONFERENCE ON DIGITAL AUDIO EFFECTS, PROCEEDINGS, 2003, : 339 - 343
