Improved phase vocoder time-scale modification of audio

Cited by: 138

Authors:

Laroche, J [1]

Dolson, M [1]

Affiliations:

[1] Joint Emu Creat Technol Ctr, Scotts Valley, CA 95067 USA

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1999 / Vol. 7 / No. 03

Keywords:

phase coherence; phase vocoder; pitch shifting; short time Fourier transform; time scaling;

DOI:

10.1109/89.759041

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

The phase vocoder is a well-established tool for time scaling and pitch shifting speech and audio signals via modification of their short-time Fourier transforms (STFT's). In contrast to time-domain time-scaling and pitch-shifting techniques, the phase vocoder is generally considered to yield high quality results, especially for large modification factors and/or polyphonic signals. However, the phase vocoder is also known for introducing a characteristic perceptual artifact, often described as "phasiness," "reverberation," or "loss of presence." This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes. Two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved. Moreover, the modified phase vocoder is shown to provide a factor-of-two decrease in computational cost.

Citation

Pages: 323 - 332

Page count: 10

50 items in total

[1] Audio watermarking by time-scale modification
Mansour, MF
Tewfik, AH
[J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001: 1353 - 1356
[2] Time-Scale and Pitch-Scale Modification by the Phase Vocoder without Occurring the Phase Unwrapping Problem
Yoneguchi, Ryoichi
Murakami, Takahiro
[J]. 2017 22ND INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2017
[3] Objective quality measurement for audio time-scale modification
Liu, F
Lee, JJ
Kuo, CCJ
[J]. INTERNET MULTIMEDIA MANAGEMENT SYSTEMS IV, 2003, 5242: 208 - 216
[4] Data embedding in audio using time-scale modification
Mansour, MF
Tewfik, AH
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (03): 432 - 440
[5] Quality enhancement of packet audio with time-scale modification
Liu, F
Kim, JW
Kuo, CCJ
[J]. MULTIMEDIA SYSTEMS AND APPLICATIONS V, 2002, 4861: 163 - 173
[6] An objective measure of quality for time-scale modification of audio
Roberts, Timothy
Paliwal, Kuldip K.
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (03): 1843 - 1854
[7] A hybrid time-frequency domain approach to audio time-scale modification
Dorran, David
Lawlor, Robert
Coyle, Eugene
[J]. AES: Journal of the Audio Engineering Society, 2006, 54 (1-2): 21 - 31
[8] A hybrid time-frequency domain approach to audio time-scale modification
Dorran, D
Lawlor, R
Coyle, E
[J]. JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2006, 54 (1-2): 21 - 31
[9] Localized audio watermarking technique robust against time-scale modification
Li, W
Xue, XY
Lu, PZ
[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2006, 8 (01): 60 - 69
[10] Time-scale modification of audio signals with combined harmonic and wavelet representations
Hamdy, KN
Tewfik, AH
Chen, T
Takagi, S
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997: 439 - 442
