Voice Privacy Using Time-Scale and Pitch Modification

Cited by: 0

Authors:

Singh D.K. [1]

Prajapati G.P. [1]

Patil H.A. [1]

Affiliation:

[1] Speech Research Lab, Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar

Source:

SN Computer Science, Vol. 5, Issue 2

Keywords:

Anonymization; Data augmentation; Speech perturbation; Voice privacy

DOI:

10.1007/s42979-023-02549-8

Abstract:

The growing digitization of day-to-day work has led to a surge in the use of Intelligent Personal Assistants. Because these smart digital assistants rely on personally identifiable characteristics of the user, their extensive use calls for security and privacy preservation techniques. To that effect, various privacy preservation techniques have been explored for different types of voice assistants, and voice-based digital assistants need such a technique as well. In this study, we therefore explore prosody modification methods that alter the speaker-specific characteristics of the user, so that the modified utterances can be made publicly available for training different speech-based systems. The study presents three data augmentation techniques as voice anonymization methods that modify speaker-dependent speech parameters (i.e., F0). Voice anonymization and speech intelligibility are measured objectively using automatic speaker verification (ASV) and automatic speech recognition (ASR) experiments, respectively, on the development and test sets of the LibriSpeech dataset. For speed perturbation-based anonymization, a relative increase in % EER of up to 53.7% is observed for a perturbation factor α = 0.8, for both male and female speakers. For the same case, the % WER was adequate (less than that of the baseline system), which supports the use of speed perturbation as an anonymization algorithm in a voice privacy system. Similar performance is observed for pitch perturbation with a perturbation factor λ = -300. However, tempo perturbation was not found to be useful for speaker anonymization in these experiments, with % EER on the order of 5–10%. © 2024, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
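The abstract does not say how speed perturbation was implemented, so the following is only a minimal sketch of the general idea, assuming the common convention that a factor α < 1 slows the signal down. It resamples a waveform by linear interpolation and keeps the original sample rate, which scales both duration and frequency content together. The function names `speed_perturb` and `dominant_frequency` and the 200 Hz test tone are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def speed_perturb(signal, alpha):
    """Resample `signal` so that it plays `alpha` times as fast.

    Hypothetical helper: with alpha < 1 the output is longer and,
    at an unchanged sample rate, lower in pitch.
    """
    n_out = int(round(len(signal) / alpha))
    # Fractional positions in the input that each output sample reads from.
    src_pos = np.arange(n_out) * alpha
    # Linear interpolation; positions past the end clamp to the last sample.
    return np.interp(src_pos, np.arange(len(signal)), signal)


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest magnitude-spectrum peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]


sample_rate = 16000                        # assumed sampling rate in Hz
t = np.arange(sample_rate) / sample_rate   # one second of audio
tone = np.sin(2 * np.pi * 200.0 * t)       # synthetic stand-in for a voiced sound

slow = speed_perturb(tone, alpha=0.8)

print(len(tone), len(slow))
print(dominant_frequency(tone, sample_rate), dominant_frequency(slow, sample_rate))
```

With α = 0.8 the output is 1/0.8 = 1.25 times longer and its spectral peak moves from 200 Hz to about 160 Hz, which illustrates why speed perturbation alters pitch-related speaker cues. Tempo perturbation, as the term is usually understood, changes duration while leaving pitch largely intact.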

Citations

50 items in total

[41] Approach for time-scale modification of speech based on TCNMF
Wu, Haijia
Zhang, Xiongwei
Huang, Jianjun
Chen, Weiwei
ELECTRONICS LETTERS, 2013, 49 (01) : 71 - 72
[42] An objective measure of quality for time-scale modification of audio
Roberts, Timothy
Paliwal, Kuldip K.
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (03) : 1843 - 1854
[43] Time domain technique for pitch modification and robust voice transformation
Vergin, R
O'Shaughnessy, D
Farhat, A
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V, 1997 : 947 - 950
[44] Glottal closure instant and voice source analysis using time-scale lines of maximum amplitude
D'Alessandro, Christophe
Sturmel, Nicolas
Sadhana, 2011, 36 : 601 - 622
[45] Improving Time-Scale Modification of Music Signals Using Harmonic-Percussive Separation
Driedger, Jonathan
Mueller, Meinard
Ewert, Sebastian
IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (01) : 105 - 109
[46] Glottal closure instant and voice source analysis using time-scale lines of maximum amplitude
D'Alessandro, Christophe
Sturmel, Nicolas
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2011, 36 (05) : 601 - 622
[47] Mach1: Nonuniform time-scale modification of speech
Covell, M
Withgott, M
Slaney, M
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998 : 349 - 352
[48] TSM TOOLBOX: MATLAB IMPLEMENTATIONS OF TIME-SCALE MODIFICATION ALGORITHMS
Driedger, Jonathan
Mueller, Meinard
DAFX-14: 17TH INTERNATIONAL CONFERENCE ON DIGITAL AUDIO EFFECTS, 2014 : 249 - 256
[49] A Spectral Variation Function for Variable Time-Scale Modification of Speech
Kachare, Pramod H.
Pandey, Prem C.
2021 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2021 : 48 - 52
[50] Real-time pitch modification system for speech and singing voice
Azarov, Elias
Vashkevich, Maxim
Likhachov, Denis
Petrovsky, Alexander
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015 : 1070 - 1071
