A perceptual subspace approach for modeling of speech and audio signals with damped sinusoids

Cited by: 27
Authors
Jensen, J [1]
Heusdens, R
Jensen, SH
Affiliations
[1] Delft Univ Technol, Dept Mediamat, NL-2628 CD Delft, Netherlands
[2] Aalborg Univ, Dept Commun Technol, DK-9220 Aalborg, Denmark
Source
Keywords
complex exponentials; perceptually relevant sinusoids; psycho-acoustical distortion measure; sinusoidal modeling; speech and audio processing; subspace-based signal analysis;
DOI
10.1109/TSA.2003.819948
CLC classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
The problem of modeling a signal segment as a sum of exponentially damped sinusoidal components arises in many different application areas, including speech and audio processing. Often, model parameters are estimated using subspace-based techniques, which arrange the input signal in a structured matrix and exploit the so-called shift-invariance property of certain vector spaces of the input matrix. A problem with this class of estimation algorithms, when used for speech and audio processing, is that the perceptual importance of the sinusoidal components is not taken into account. In this work we propose a solution to this problem. In particular, we show how to combine well-known subspace-based estimation techniques with a recently developed perceptual distortion measure in order to obtain an algorithm for extracting perceptually relevant model components. In analysis-synthesis experiments with wideband audio signals, objective and subjective evaluations show that the proposed algorithm improves perceived signal quality considerably over traditional subspace-based analysis methods.
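To illustrate the subspace idea mentioned in the abstract, the following is a minimal Python sketch, assuming NumPy, of a generic Hankel-matrix shift-invariance estimator in the style of HSVD/ESPRIT. It is not the authors' algorithm: it omits the perceptual distortion measure entirely, and the function name `estimate_damped_sinusoids` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np


def estimate_damped_sinusoids(x, order):
    """Estimate poles and amplitudes of the model x[n] ~ sum_k a_k * z_k**n.

    Generic subspace sketch (hypothetical helper, not the paper's method):
    each pole z_k = exp(-d_k + 1j*w_k) encodes damping d_k and frequency w_k.
    """
    x = np.asarray(x, dtype=complex)
    n = len(x)
    rows = n // 2
    cols = n - rows + 1

    # Structured (Hankel) data matrix: H[i, j] = x[i + j].
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    H = x[idx]

    # Signal subspace: the 'order' dominant left singular vectors.
    U, _, _ = np.linalg.svd(H, full_matrices=False)
    Us = U[:, :order]

    # Shift invariance: Us without its first row ~ (Us without its last row) @ Phi.
    Phi, *_ = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)

    # The eigenvalues of Phi are the estimated poles.
    poles = np.linalg.eigvals(Phi)

    # Complex amplitudes from a least-squares fit on the Vandermonde matrix.
    V = poles[None, :] ** np.arange(n)[:, None]
    amps, *_ = np.linalg.lstsq(V, x, rcond=None)
    return poles, amps
```

For a real damped cosine such as `np.exp(-0.05 * n) * np.cos(0.3 * n)`, calling the function with `order=2` should recover a conjugate pole pair; damping and frequency follow from `-np.log(np.abs(poles))` and `np.angle(poles)`. A perceptual variant, as the abstract describes, would additionally weight or select components by a distortion measure, which this sketch does not attempt.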
Pages: 121-132
Number of pages: 12
Related papers
50 in total
  • [1] Perceptual audio modeling with exponentially damped sinusoids
    Hermus, K
    Verhelst, W
    Lemmerling, P
    Wambacq, P
    Van Huffel, S
    SIGNAL PROCESSING, 2005, 85 (01) : 163 - 176
  • [2] A perceptual subspace method for sinusoidal speech and audio modeling
    Jensen, J
    Heusdens, R
    Jensen, SH
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO AND ELECTROACOUSTICS MULTIMEDIA SIGNAL PROCESSING, 2003, : 401 - 404
  • [3] Audio transients modeling by damped & delayed sinusoids (DDS)
    Boyer, R
    Abed-Meraim, K
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 1729 - 1732
  • [4] Psycho-acoustic modeling of audio with exponentially damped sinusoids
    Hermus, K
    Verhelst, W
    Wambacq, P
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 1821 - 1824
  • [5] A perceptual subspace approach for speech enhancement
    Ghaemi Sardaroudi S.
    Geravanchizadeh M.
    2010 5th International Symposium on Telecommunications, IST 2010, 2010, : 878 - 881
  • [6] Modeling audio with damped sinusoids using total least squares algorithms
    Verhelst, W
    Hermus, K
    Lemmerling, P
    Wambacq, P
    Van Huffel, S
    TOTAL LEAST SQUARES AND ERRORS-IN-VARIABLES MODELING: ANALYSIS, ALGORITHMS AND APPLICATIONS, 2002, : 331 - 340
  • [7] Subspace approach for two-dimensional parameter estimation of multiple damped sinusoids
    Chan, Frankie K. W.
    So, H. C.
    Sun, Weize
    SIGNAL PROCESSING, 2012, 92 (09) : 2172 - 2179
  • [8] Parametric Audio Coding With Exponentially Damped Sinusoids
    Derrien, Olivier
    Badeau, Roland
    Richard, Gael
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (07): : 1489 - 1501
  • [9] Speech synthesis using damped sinusoids
    Hillenbrand, JM
    Houde, RA
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2002, 45 (04): : 639 - 650
  • [10] Perceptual sparse modeling of wideband speech signals
    Bouchhima, Bochra
    Alaoune, Monia Turki Hadj
    Amara, Rim
    2017 IEEE 19TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2017,