Improved phase vocoder time-scale modification of audio

Cited by: 138
Authors
Laroche, J [1]
Dolson, M [1]
Affiliation
[1] Joint Emu Creat Technol Ctr, Scotts Valley, CA 95067 USA
Source
Keywords
phase coherence; phase vocoder; pitch shifting; short time Fourier transform; time scaling;
DOI
10.1109/89.759041
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The phase vocoder is a well-established tool for time scaling and pitch shifting speech and audio signals via modification of their short-time Fourier transforms (STFT's). In contrast to time-domain time-scaling and pitch-shifting techniques, the phase vocoder is generally considered to yield high quality results, especially for large modification factors and/or polyphonic signals. However, the phase vocoder is also known for introducing a characteristic perceptual artifact, often described as "phasiness," "reverberation," or "loss of presence." This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes. Two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved. Moreover, the modified phase vocoder is shown to provide a factor-of-two decrease in computational cost.
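For orientation, the kind of STFT-based time scaling the abstract refers to can be sketched as follows. This is a minimal illustration of the standard (textbook) phase vocoder only, not of the two extensions the paper introduces, which the abstract does not describe. The function name `time_stretch`, the Hann window, the frame size, and the hop sizes are all assumptions chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np

def time_stretch(x, alpha, n_fft=1024, hop_s=256):
    """Standard phase-vocoder time stretch (illustrative sketch).

    alpha > 1 lengthens the signal, alpha < 1 shortens it.
    All parameter names and defaults are assumptions for this example.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < n_fft:
        raise ValueError("signal must be at least n_fft samples long")

    # Fixed synthesis hop; the analysis hop sets the stretch factor.
    hop_a = max(1, int(round(hop_s / alpha)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop_a

    # Centre frequency of each bin in radians per sample.
    omega = 2.0 * np.pi * np.arange(n_fft // 2 + 1) / n_fft

    y = np.zeros((n_frames - 1) * hop_s + n_fft)
    prev_phase = None
    synth_phase = None

    for m in range(n_frames):
        frame = x[m * hop_a : m * hop_a + n_fft] * window
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)

        if m == 0:
            # The first frame keeps its analysis phases.
            synth_phase = phase.copy()
        else:
            # Deviation from the expected phase advance, wrapped to [-pi, pi).
            dphi = phase - prev_phase - omega * hop_a
            dphi = (dphi + np.pi) % (2.0 * np.pi) - np.pi
            # Estimated instantaneous frequency of each bin.
            inst_freq = omega + dphi / hop_a
            # Advance the synthesis phase over the synthesis hop.
            synth_phase = synth_phase + inst_freq * hop_s
        prev_phase = phase

        out = np.fft.irfft(mag * np.exp(1j * synth_phase), n=n_fft) * window
        y[m * hop_s : m * hop_s + n_fft] += out

    # Compensate the overlap-add gain of the squared window.
    # This is accurate away from the signal edges, which stay tapered.
    y /= np.sum(window ** 2) / hop_s
    return y
```

Calling `time_stretch(x, 2.0)` on a mono signal `x` should return a signal roughly twice as long with its frequency content left in place. Because each bin's phase is propagated independently in this basic form, it is the kind of algorithm in which the "phasiness" artifact discussed in the abstract tends to appear.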
Pages: 323-332
Page count: 10