Improved phase vocoder time-scale modification of audio

Cited by: 138
Authors
Laroche, J [1]
Dolson, M [1]
Affiliation
[1] Joint Emu Creat Technol Ctr, Scotts Valley, CA 95067 USA
Source
Keywords
phase coherence; phase vocoder; pitch shifting; short time Fourier transform; time scaling
DOI
10.1109/89.759041
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification code
070206; 082403
Abstract
The phase vocoder is a well-established tool for time scaling and pitch shifting speech and audio signals via modification of their short-time Fourier transforms (STFT's). In contrast to time-domain time-scaling and pitch-shifting techniques, the phase vocoder is generally considered to yield high quality results, especially for large modification factors and/or polyphonic signals. However, the phase vocoder is also known for introducing a characteristic perceptual artifact, often described as "phasiness," "reverberation," or "loss of presence." This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes. Two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved. Moreover, the modified phase vocoder is shown to provide a factor-of-two decrease in computational cost.
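The abstract does not spell out the algorithm, so the following is only a minimal sketch of the phase-propagation step commonly used in the standard phase vocoder, that is, the baseline the paper sets out to improve, not the two extensions the authors introduce. All names and parameters here (`principal_arg`, `estimate_bin_frequency`, `propagate_phase`, `analysis_hop`, `synthesis_hop`) are illustrative assumptions and do not come from the paper.

```python
import math


def principal_arg(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def estimate_bin_frequency(prev_phase, curr_phase, bin_index, fft_size, analysis_hop):
    """Estimate the instantaneous frequency (radians/sample) in one STFT bin.

    Heterodyned phase-difference estimate: subtract the phase advance expected
    at the bin centre frequency, wrap the remainder, then add the centre
    frequency back. Assumes the true frequency is close enough to the bin
    centre that the wrapped deviation is unambiguous.
    """
    omega_k = 2.0 * math.pi * bin_index / fft_size
    deviation = principal_arg(curr_phase - prev_phase - omega_k * analysis_hop)
    return omega_k + deviation / analysis_hop


def propagate_phase(prev_synth_phase, inst_freq, synthesis_hop):
    """Advance the synthesis phase by the estimated frequency over the synthesis hop."""
    return prev_synth_phase + inst_freq * synthesis_hop
```

In this sketch, time scaling comes from using a synthesis hop that differs from the analysis hop (for example 384 versus 256 samples for a 1.5x stretch) while keeping each bin's estimated frequency unchanged. Each bin is propagated independently here, with no coupling between neighbouring bins.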
Pages: 323-332
Page count: 10
Related papers
50 in total
  • [1] Audio watermarking by time-scale modification
    Mansour, MF
    Tewfik, AH
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001: 1353-1356
  • [2] Time-Scale and Pitch-Scale Modification by the Phase Vocoder without Occurring the Phase Unwrapping Problem
    Yoneguchi, Ryoichi
    Murakami, Takahiro
    [J]. 2017 22ND INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2017
  • [3] Objective quality measurement for audio time-scale modification
    Liu, F
    Lee, JJ
    Kuo, CCJ
    [J]. INTERNET MULTIMEDIA MANAGEMENT SYSTEMS IV, 2003, 5242: 208-216
  • [4] Data embedding in audio using time-scale modification
    Mansour, MF
    Tewfik, AH
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (03): 432-440
  • [5] Quality enhancement of packet audio with time-scale modification
    Liu, F
    Kim, JW
    Kuo, CCJ
    [J]. MULTIMEDIA SYSTEMS AND APPLICATIONS V, 2002, 4861: 163-173
  • [6] An objective measure of quality for time-scale modification of audio
    Roberts, Timothy
    Paliwal, Kuldip K.
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (03): 1843-1854
  • [8] A hybrid time-frequency domain approach to audio time-scale modification
    Dorran, D
    Lawlor, R
    Coyle, E
    [J]. JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2006, 54 (1-2): 21-31
  • [9] Localized audio watermarking technique robust against time-scale modification
    Li, W
    Xue, XY
    Lu, PZ
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2006, 8 (01): 60-69
  • [10] Time-scale modification of audio signals with combined harmonic and wavelet representations
    Hamdy, KN
    Tewfik, AH
    Chen, T
    Takagi, S
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997: 439-442