Binaural sound localization based on deep neural network and affinity propagation clustering in mismatched HRTF condition

Authors
Jing Wang
Jin Wang
Kai Qian
Xiang Xie
Jingming Kuang
Affiliation
[1] Beijing Institute of Technology
Keywords
Deep neural network; Clustering; Affinity propagation; Binaural localization
DOI
Not available
Abstract
Binaural sound source localization is an important and widely used perceptually based method, and many researchers have applied it in machine learning studies based on the head-related transfer function (HRTF). Because the HRTF is closely related to human physiological structure, HRTFs vary between individuals. Related machine learning studies to date tend to focus on binaural localization in reverberant or noisy environments, or in conditions with multiple simultaneously active sound sources. In contrast, the mismatched HRTF condition, in which the HRTFs used to generate the training and test sets are different, is rarely studied. This mismatch degrades localization performance. A basic solution is to improve generalization by introducing more data, but this requires a large amount of data, and simply increasing the data volume is data-inefficient. In this paper, we propose a data-efficient method based on a deep neural network (DNN) and clustering to improve binaural localization performance in the mismatched HRTF condition. Firstly, we analyze the relationship between binaural cues and sound source location with a classification DNN. Different HRTFs are used to generate the training and test sets, respectively. On this basis, we study the localization performance of the DNN model trained on each training set when evaluated on the different test sets. The results show that the same model performs differently on different test sets, while different models may perform similarly on the same test set. The results also show a clustering trend. Secondly, the different HRTFs are divided into several clusters. Finally, the HRTFs corresponding to the cluster centers are selected to generate a new training set and to train a more generalized DNN model. The experimental results show that the proposed method achieves better generalization performance than the baseline methods in the mismatched HRTF condition and performs almost as well as a DNN trained with a large number of HRTFs, which means the proposed method is data-efficient.
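The clustering and selection steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which features or similarity measure are used to cluster the HRTFs, so the sketch assumes that each HRTF is described by the accuracy profile of the model trained on it (one accuracy per test set) and that similarity is the negative squared Euclidean distance between profiles, with the median similarity as the preference. The accuracy values in `acc` are invented for illustration, and the names `neg_sq_dist` and `affinity_propagation` are hypothetical helpers written in plain Python.

```python
from statistics import median


def neg_sq_dist(u, v):
    """Similarity of two accuracy profiles: negative squared Euclidean distance."""
    return -sum((a - b) ** 2 for a, b in zip(u, v))


def affinity_propagation(S, damping=0.5, n_iter=200):
    """Plain affinity propagation on a square similarity matrix S.

    S[k][k] holds the preference of point k. Returns the exemplar indices.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(n_iter):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            top = vals[best]
            runner_up = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                target = S[i][k] - (runner_up if k == best else top)
                R[i][k] = damping * R[i][k] + (1 - damping) * target
        # a(k, k) = sum of positive r(i', k) over i' != k
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k) over i' not in {i, k})
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    target = support
                else:
                    target = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


# Invented cross-test accuracies: acc[i][j] is the accuracy of the model
# trained on HRTF i and tested on HRTF j (illustrative numbers only).
acc = [
    [0.92, 0.90, 0.91, 0.45, 0.50, 0.44],
    [0.88, 0.87, 0.87, 0.49, 0.53, 0.48],
    [0.84, 0.84, 0.83, 0.53, 0.56, 0.52],
    [0.52, 0.60, 0.51, 0.85, 0.80, 0.84],
    [0.48, 0.57, 0.47, 0.89, 0.83, 0.88],
    [0.44, 0.54, 0.43, 0.93, 0.86, 0.92],
]

n = len(acc)
S = [[neg_sq_dist(acc[i], acc[j]) for j in range(n)] for i in range(n)]

# Assumed preference: the median off-diagonal similarity, a common default.
pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for k in range(n):
    S[k][k] = pref

exemplars = affinity_propagation(S)

# Each HRTF joins the exemplar it is most similar to; exemplars label themselves.
labels = [k if k in exemplars else max(exemplars, key=lambda e: S[k][e])
          for k in range(n)]

print("cluster-center HRTFs:", exemplars)
print("cluster label per HRTF:", labels)
```

On this toy matrix the six HRTFs fall into two groups, and the HRTF with the most central accuracy profile in each group is returned as its exemplar. Under the assumptions above, only the exemplar HRTFs would then be used to generate the new training set, which is what makes the approach data-efficient compared with training on all available HRTFs.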