Exploiting Deep Neural Networks and Head Movements for Robust Binaural Localization of Multiple Sources in Reverberant Environments

Cited by: 87
Authors
Ma, Ning [1]
May, Tobias [2]
Brown, Guy J. [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Tech Univ Denmark, Hearing Syst Grp, DK-2800 Lyngby, Denmark
Keywords
Binaural sound source localisation; deep neural networks; head movements; machine hearing; multi-conditional training; reverberation; probabilistic model; cues
DOI
10.1109/TASLP.2017.2750760
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for robust binaural localization of multiple sources in reverberant environments. DNNs are used to learn the relationship between the source azimuth and binaural cues, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs). In contrast to many previous binaural hearing systems, the proposed approach is not restricted to localization of sound sources in the frontal hemifield. Due to the similarity of binaural cues in the frontal and rear hemifields, front-back confusions often occur. To address this, a head movement strategy is incorporated in the localization model to help reduce the front-back errors. The proposed DNN system is compared to a Gaussian-mixture-model-based system that employs interaural time differences (ITDs) and ILDs as localization features. Our experiments show that the DNN is able to exploit information in the CCF that is not available in the ITD cue, which together with head movements substantially improves localization accuracies under challenging acoustic scenarios, in which multiple talkers and room reverberation are present.
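The two binaural cues named in the abstract, the cross-correlation function (CCF) and the interaural level difference (ILD), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no feature-extraction details, so the function name `binaural_cues`, the broadband single-frame processing, the 1 ms lag range, and the energy-ratio ILD are all assumptions made for illustration.

```python
import numpy as np

def binaural_cues(left, right, fs, max_lag_ms=1.0, eps=1e-12):
    """Illustrative cues for one frame: (lags, normalized CCF, ILD in dB).

    Assumptions (not from the paper): broadband signals, a single frame,
    lags limited to +/- max_lag_ms, ILD defined as a left/right energy ratio.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(max_lag_ms * 1e-3 * fs))

    # Remove the mean so the correlation is not dominated by a DC offset.
    l0 = left - left.mean()
    r0 = right - right.mean()

    # Full cross-correlation; zero lag sits at index len(r0) - 1.
    full = np.correlate(l0, r0, mode="full")
    mid = len(r0) - 1
    ccf = full[mid - max_lag: mid + max_lag + 1]

    # Normalize by the frame energies so values lie roughly in [-1, 1].
    ccf = ccf / (np.sqrt(np.sum(l0 ** 2) * np.sum(r0 ** 2)) + eps)
    lags = np.arange(-max_lag, max_lag + 1)

    # ILD: energy ratio in dB, positive when the left ear is louder.
    ild_db = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return lags, ccf, ild_db
```

A conventional ITD estimate keeps only the lag of the CCF maximum, `lags[np.argmax(ccf)]`, whereas the system described in the abstract feeds the complete CCF to the DNN. With the sign convention of this sketch, a negative peak lag means the right-ear signal is delayed relative to the left.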
Pages: 2444-2453
Number of pages: 10