Binaural segregation in multisource reverberant environments

Cited by: 25
Authors
Roman, Nicoleta [1]
Srinivasan, Soundararajan
Wang, DeLiang
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Biomed Engn, Columbus, OH 43210 USA
[3] Ohio State Univ, Ctr Cognit Sci, Columbus, OH 43210 USA
Funding
U.S. National Science Foundation
DOI
10.1121/1.2355480
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In a natural environment, speech signals are degraded by both reverberation and concurrent noise sources. While human listening is robust under these conditions using only two ears, current two-microphone algorithms perform poorly. The psychological process of figure-ground segregation suggests that the target signal is perceived as a foreground while the remaining stimuli are perceived as a background. Accordingly, the goal is to estimate an ideal time-frequency (T-F) binary mask, which selects the target if it is stronger than the interference in a local T-F unit. In this paper, a binaural segregation system that extracts the reverberant target signal from multisource reverberant mixtures by utilizing only the location information of target source is proposed. The proposed system combines target cancellation through adaptive filtering and a binary decision rule to estimate the ideal T-F binary mask. The main observation in this work is that the target attenuation in a T-F unit resulting from adaptive filtering is correlated with the relative strength of target to mixture. A comprehensive evaluation shows that the proposed system results in large SNR gains. In addition, comparisons using SNR as well as automatic speech recognition measures show that this system outperforms standard two-microphone beamforming approaches and a recent binaural processor. © 2006 Acoustical Society of America.
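The ideal T-F binary mask defined in the abstract can be illustrated with a minimal sketch. It assumes the per-unit target and interference energies are already known, which is what makes the mask "ideal"; it does not reproduce the paper's estimation procedure (target cancellation by adaptive filtering followed by a decision rule). The function name `ideal_binary_mask` and the energy values are hypothetical and chosen only for illustration.

```python
def ideal_binary_mask(target_energy, interference_energy):
    """Return 1 for each T-F unit where the target is stronger than the
    interference, and 0 otherwise.

    Both inputs are lists of rows (frequency channels) of per-frame energies.
    """
    return [
        [1 if t > i else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_energy, interference_energy)
    ]


# Hypothetical energies: 2 frequency channels x 3 time frames.
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 1.0]]
interference = [[1.0, 2.0, 2.0], [0.2, 0.5, 4.0]]

mask = ideal_binary_mask(target, interference)
print(mask)
```

In this sketch a unit with equal target and interference energy is assigned to the background, following the "stronger than" wording of the abstract.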
Pages: 4040-4051
Page count: 12
Related papers
50 in total
  • [1] Binaural sound segregation for multisource reverberant environments
    Roman, N
    Wang, DL
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING SIGNAL PROCESSING THEORY AND METHODS, 2004, : 373 - 376
  • [2] Speech recognition in multisource reverberant environments with binaural inputs
    Roman, Nicoleta
    Srinivasan, Soundararajan
    Wang, DeLiang
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 309 - 312
  • [3] Binaural cues for fragment-based speech recognition in reverberant multisource environments
    Ma, Ning
    Barker, Jon
    Christensen, Heidi
    Green, Phil
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1668 - 1671
  • [4] Binaural Detection, Localization, and Segregation in Reverberant Environments Based on Joint Pitch and Azimuth Cues
    Woodruff, John
    Wang, DeLiang
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (04): : 806 - 815
  • [5] ON THE ROLE OF LOCALIZATION CUES IN BINAURAL SEGREGATION OF REVERBERANT SPEECH
    Woodruff, John
    Wang, DeLiang
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 2205 - +
  • [6] A DNN Parameter Mask for the Binaural Reverberant Speech Segregation
    Jiang, Yi
    Li, Wei
    Zu, Yuanyuan
    Liu, Runsheng
    Ma, Chao
    [J]. 2016 9TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2016), 2016, : 959 - 963
  • [7] Combining Monaural and Binaural Evidence for Reverberant Speech Segregation
    Woodruff, John
    Prabhavalkar, Rohit
    Fosler-Lussier, Eric
    Wang, DeLiang
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 406 - 409
  • [8] Binaural hearing computer models in multisource environments
    Ivancevic, Bojan
    Jambrosic, Kristian
    Petosic, Antonio
    [J]. ICECom 2005: 18th International Conference on Applied Electromagnetics and Communications, Conference Proceedings, 2005, : 471 - 474
  • [9] Cepstrum Prefiltering for Binaural Source Localization in Reverberant Environments
    Parisi, Raffaele
    Camoes, Flavia
    Scarpiniti, Michele
    Uncini, Aurelio
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2012, 19 (02) : 99 - 102
  • [10] Binaural Localization of Multiple Sources in Reverberant and Noisy Environments
    Woodruff, John
    Wang, DeLiang
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (05): : 1503 - 1512