A perceptual subspace approach for modeling of speech and audio signals with damped sinusoids

Cited by: 27

Authors:

Jensen, J [1]

Heusdens, R

Jensen, SH

Affiliations:

[1] Delft Univ Technol, Dept Mediamat, NL-2628 CD Delft, Netherlands

[2] Aalborg Univ, Dept Commun Technol, DK-9220 Aalborg, Denmark

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2004 / Vol. 12 / No. 2

Keywords:

complex exponentials; perceptually relevant sinusoids; psycho-acoustical distortion measure; sinusoidal modeling; speech and audio processing; subspace-based signal analysis;

DOI:

10.1109/TSA.2003.819948

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

The problem of modeling a signal segment as a sum of exponentially damped sinusoidal components arises in many different application areas, including speech and audio processing. Often, model parameters are estimated using subspace-based techniques, which arrange the input signal in a structured matrix and exploit the so-called shift-invariance property related to certain vector spaces of the input matrix. A problem with this class of estimation algorithms, when used for speech and audio processing, is that the perceptual importance of the sinusoidal components is not taken into account. In this work we propose a solution to this problem. In particular, we show how to combine well-known subspace-based estimation techniques with a recently developed perceptual distortion measure to obtain an algorithm for extracting perceptually relevant model components. In analysis-synthesis experiments with wideband audio signals, objective and subjective evaluations show that the proposed algorithm improves perceived signal quality considerably over traditional subspace-based analysis methods.
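
For orientation, the generic subspace idea mentioned in the abstract can be sketched as follows. This is a textbook-style, noise-free illustration and not the paper's perceptual algorithm; every symbol here (x[n], c_k, z_k, d_k, ω_k, H, S, T, Z, Q, U_K, and the sizes K, N, L, M) is an assumed notation that does not appear in the record. An up arrow denotes a matrix with its first row removed, a down arrow one with its last row removed, and a superscript plus the pseudo-inverse; the z_k are assumed distinct and the c_k nonzero.

```latex
% Signal model (noise-free): K exponentially damped complex sinusoids
x[n] = \sum_{k=1}^{K} c_k z_k^{n}, \qquad z_k = e^{-d_k + j\omega_k}, \qquad n = 0, \dots, N-1

% Structured (Hankel) data matrix, with L + M - 1 = N and L, M > K
H_{i,m} = x[i+m], \qquad i = 0, \dots, L-1, \quad m = 0, \dots, M-1

% Vandermonde factorization, with S_{i,k} = z_k^{i} and T_{m,k} = z_k^{m}
H = S \, \mathrm{diag}(c_1, \dots, c_K) \, T^{\mathsf T}

% Shift invariance of the signal subspace
S^{\uparrow} = S_{\downarrow} Z, \qquad Z = \mathrm{diag}(z_1, \dots, z_K)

% The K dominant left singular vectors U_K of H span the same space as S,
% so U_K = S Q for some invertible Q, and therefore
U_K^{\uparrow} = U_{K\downarrow} \, (Q^{-1} Z Q)
\;\Longrightarrow\;
\operatorname{eig}\!\left( U_{K\downarrow}^{+} \, U_K^{\uparrow} \right) = \{ z_1, \dots, z_K \}
```

Under these assumptions the damping factors and frequencies follow from the eigenvalues as d_k = -ln|z_k| and ω_k = arg(z_k). With noisy data the last relation holds only approximately and is typically solved in a least-squares sense.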


Pages: 121-132

Page count: 12

50 items in total

[31] ANALYSIS-SYNTHESIS OF CONNECTED SPEECH IN TERMS OF ORTHOGONALIZED EXPONENTIALLY DAMPED SINUSOIDS
MANLEY, HJ
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1963, 35 (04): : 464 - &
[32] Perceptual Models for Speech, Audio, and Music Processing
Jont B Allen
Wai-Yip Geoffrey Chan
Stephen Voran
EURASIP Journal on Audio, Speech, and Music Processing, 2007
[33] ANALYSIS-SYNTHESIS OF CONTINUOUS SPEECH IN TERMS OF ORTHOGONALIZED EXPONENTIALLY DAMPED SINUSOIDS
MANLEY, HJ
KLEIN, DB
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1962, 34 (05): : 724 - &
[34] PERCEVAL - PERCEPTUAL EVALUATION OF THE QUALITY OF AUDIO SIGNALS
PAILLARD, B
MABILLEAU, P
MORISSETTE, S
SOUMAGNE, J
JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 1992, 40 (1-2): : 21 - 31
[35] Coherent decompositions of power systems signals using damped sinusoids with applications to denoising
Lovisolo, L
da Silva, EAB
Rodrigues, MAM
Diniz, PSR
2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL V, PROCEEDINGS, 2002, : 685 - 688
[36] LINEAR PREDICTION APPROACH TO THE ROBUST PARAMETER ESTIMATION FOR THE DAMPED SINUSOIDS
Zhou, Zhenhua
Liu, Yong
Christensen, Mads G.
Chen, Kai
PROCEEDINGS OF 2020 IEEE 15TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2020), 2020, : 478 - 483
[37] EFFICIENT PARAMETER ESTIMATION OF MULTIPLE DAMPED SINUSOIDS BY COMBINING SUBSPACE AND WEIGHTED LEAST SQUARES TECHNIQUES
Sun, Weize
So, H. C.
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 3509 - 3512
[38] Joint speech/audio coding based scalable perceptual audio coding
Gao, Li
Hu, Ruimin
Yang, Yuhong
2014 IEEE/ACIS 13TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2014, : 419 - 424
[39] Novel subspace method for frequencies estimation of two sinusoids with applications to vital signals
Chen, Yi-Sheng
Lin, Yue-Der
IET SIGNAL PROCESSING, 2017, 11 (09) : 1114 - 1121
[40] Perceptual speech modeling for noisy speech recognition
Wu, CH
Chiu, YH
Lim, H
2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 385 - 388
