Computational speech segregation based on an auditory-inspired modulation analysis

Cited: 15
Authors
May, Tobias [1 ]
Dau, Torsten [1 ]
Affiliations
[1] Tech Univ Denmark, Dept Elect Engn, Ctr Appl Hearing Res, DK-2800 Lyngby, Denmark
Source
THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2014, 136
Keywords
INNER HAIR-CELL; AMPLITUDE-MODULATION; FREQUENCY-SELECTIVITY; NOISE; INTELLIGIBILITY; MODEL; HEARING; RECOGNITION; PERCEPTION; MASKING;
DOI
10.1121/1.4901711
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
A monaural speech segregation system is presented that estimates the ideal binary mask from noisy speech based on the supervised learning of amplitude modulation spectrogram (AMS) features. Instead of using linearly scaled modulation filters with constant absolute bandwidth, an auditory-inspired modulation filterbank with logarithmically scaled filters is employed. To reduce the dependency of the AMS features on the overall background noise level, a feature normalization stage is applied. In addition, a spectro-temporal integration stage is incorporated in order to exploit the context information about speech activity present in neighboring time-frequency units. In order to evaluate the generalization performance of the system to unseen acoustic conditions, the speech segregation system is trained with a limited set of low signal-to-noise ratio (SNR) conditions, but tested over a wide range of SNRs up to 20 dB. A systematic evaluation of the system demonstrates that auditory-inspired modulation processing can substantially improve the mask estimation accuracy in the presence of stationary and fluctuating interferers. (C) 2014 Acoustical Society of America.
Pages: 3350 - 3359
Number of pages: 10
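
To make the processing chain in the abstract concrete, the following is a minimal Python sketch of AMS-style feature extraction with a logarithmically scaled modulation filterbank and per-channel normalization. It is an illustrative assumption, not the authors' implementation: an STFT magnitude stands in for an auditory (gammatone) front end, the modulation filters are generic second-order Butterworth band-passes, the normalization is simple mean/variance scaling, and all function names and parameter values (log_spaced_modulation_filters, a nominal 2-256 Hz centre-frequency range clamped to the envelope Nyquist rate, nine filters) are chosen for illustration only.

# Minimal sketch (not the paper's implementation) of AMS feature extraction
# with a logarithmically scaled modulation filterbank.
import numpy as np
from scipy.signal import butter, sosfiltfilt, stft

def log_spaced_modulation_filters(env_rate, fmin=2.0, fmax=256.0, n_filters=9, q=1.0):
    """Band-pass filters with logarithmically spaced centre frequencies (assumed values)."""
    fmax = min(fmax, 0.8 * env_rate / 2)        # keep filters below the envelope Nyquist rate
    centres = np.geomspace(fmin, fmax, n_filters)
    sos_bank = []
    for fc in centres:
        bw = fc / q                             # constant relative bandwidth (constant Q)
        lo = max(fc - bw / 2, 0.5)
        hi = min(fc + bw / 2, 0.95 * env_rate / 2)
        sos_bank.append(butter(2, [lo, hi], btype="bandpass", fs=env_rate, output="sos"))
    return centres, sos_bank

def ams_features(x, fs, n_fft=512, hop=128):
    """AMS features with shape (time frames, frequency channels, modulation channels)."""
    _, _, spec = stft(x, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)
    env = np.abs(spec)                          # sub-band envelopes, shape (freq, frames)
    env_rate = fs / hop                         # sampling rate of the envelope signals
    _, sos_bank = log_spaced_modulation_filters(env_rate)
    feats = np.stack([np.abs(sosfiltfilt(sos, env, axis=1)) for sos in sos_bank], axis=-1)
    feats = np.log(feats + 1e-8)                # log-compressed modulation magnitudes
    # Normalize each (frequency, modulation) channel across time to reduce the
    # dependency of the features on the overall background noise level.
    feats -= feats.mean(axis=1, keepdims=True)
    feats /= feats.std(axis=1, keepdims=True) + 1e-8
    return feats.transpose(1, 0, 2)

# Example: AMS features for one second of white noise at 16 kHz.
features = ams_features(np.random.randn(16000), fs=16000)
print(features.shape)                           # (time frames, 257 frequency bins, 9 modulation channels)

In the actual system, features of this kind would be passed, together with neighboring time-frequency units for spectro-temporal integration, to a supervised classifier that estimates the ideal binary mask; that stage is omitted here.
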
Related papers
50 records in total
  • [1] Computational speech segregation based on an auditory-inspired modulation analysis
    May, Tobias
    Dau, Torsten
    [J]. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2014, 136: 3350 - 3359
  • [2] Whispered Speech Detection in Noise Using Auditory-Inspired Modulation Spectrum Features
    Sarria-Paja, Milton
    Falk, Tiago H.
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2013, 20 (08) : 783 - 786
  • [3] Improved monaural speech segregation based on computational auditory scene analysis
    Wang Yu
    Lin Jiajun
    Chen Ning
    Yuan Wenhao
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2013
  • [4] Auditory-Inspired Morphological Processing of Speech Spectrograms: Applications in Automatic Speech Recognition and Speech Enhancement
    Cadore, Joyner
    Valverde-Albacete, Francisco J.
    Gallardo-Antolin, Ascension
    Pelaez-Moreno, Carmen
    [J]. COGNITIVE COMPUTATION, 2013, 5 (04) : 426 - 441
  • [5] A computational auditory scene analysis system for speech segregation and robust speech recognition
    Shao, Yang
    Srinivasan, Soundararajan
    Jin, Zhaozhang
    Wang, DeLiang
    [J]. COMPUTER SPEECH AND LANGUAGE, 2010, 24 (1): 77 - 93
  • [6] Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario
    Biesmans, Wouter
    Das, Neetha
    Francart, Tom
    Bertrand, Alexander
    [J]. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2017, 25 (05) : 402 - 412
  • [7] Auditory-inspired sparse representation of audio signals
    Pichevar, Ramin
    Najaf-Zadeh, Hossein
    Thibault, Louis
    Lahdili, Hassan
    [J]. SPEECH COMMUNICATION, 2011, 53 (05) : 643 - 657
  • [8] Auditory-Inspired Heart Sound Temporal Analysis for Patent Ductus Arteriosus
    Sung, Po-Hsun
    Wang, Jieh-Neng
    Chen, Bo-Wei
    Jang, Ling-Sheng
    Wang, Jhing-Fa
    [J]. 1ST INTERNATIONAL CONFERENCE ON ORANGE TECHNOLOGIES (ICOT 2013), 2013: 231 - 234