Binaural sound localization based on deep neural network and affinity propagation clustering in mismatched HRTF condition

Authors
Jing Wang
Jin Wang
Kai Qian
Xiang Xie
Jingming Kuang
Affiliation
[1] Beijing Institute of Technology,
Keywords
Deep neural network; Clustering; Affinity propagation; Binaural localization
Abstract
Binaural sound source localization is an important and widely used perceptually based method, and many researchers have applied it in machine learning studies based on the head-related transfer function (HRTF). Because the HRTF is closely tied to human physiological structure, HRTFs vary between individuals. Related machine learning studies to date tend to focus on binaural localization in reverberant or noisy environments, or in conditions with multiple simultaneously active sound sources. In contrast, the mismatched HRTF condition, in which the HRTFs used to generate the training and test sets differ, is rarely studied. This mismatch degrades localization performance. A basic solution is to introduce more data to improve generalization, but this requires a large amount of data, and simply increasing the data volume is data-inefficient. In this paper, we propose a data-efficient method based on a deep neural network (DNN) and clustering to improve binaural localization performance in the mismatched HRTF condition. First, we analyze the relationship between binaural cues and sound source location with a classification DNN. Different HRTFs are used to generate the training and test sets, and we study the localization performance of the DNN model trained on each training set across the different test sets. The results show that the same model performs differently on different test sets, while different models may perform similarly on the same test set; the results also show a clustering trend. Second, the HRTFs are divided into several clusters. Finally, the HRTFs corresponding to the cluster centers are selected to generate a new training set, which is used to train a more generalized DNN model. The experimental results show that the proposed method achieves better generalization performance than the baseline methods in the mismatched HRTF condition and performs almost as well as a DNN trained with a large number of HRTFs, which means the proposed method is data-efficient.
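The clustering step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how similarities between HRTFs are computed, so the sketch assumes each HRTF is represented by the row of accuracies its model reaches on every test set, with similarity taken as the negative squared Euclidean distance between rows. The accuracy matrix `acc`, the function names `profile_similarity` and `affinity_propagation`, the damping value, and the median preference are all illustrative assumptions. The message passing itself follows the standard affinity propagation updates (responsibilities and availabilities).

```python
from statistics import median

def profile_similarity(acc):
    """Similarity = negative squared distance between accuracy rows (assumption)."""
    n = len(acc)
    S = [[-sum((a - b) ** 2 for a, b in zip(acc[i], acc[j])) for j in range(n)]
         for i in range(n)]
    # Preference (diagonal) set to the median off-diagonal similarity,
    # a common default; the paper's actual choice is not stated in the abstract.
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = pref
    return S

def affinity_propagation(S, damping=0.8, iterations=300):
    """Standard affinity propagation; returns (exemplar indices, labels). Needs n >= 2."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) <- s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) <- min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) <- sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    # Exemplars label themselves; other items join the most similar exemplar.
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels

# Hypothetical cross-test accuracies: acc[i][j] is the accuracy of the model
# trained with HRTF i when tested on data generated with HRTF j.
acc = [
    [0.95, 0.90, 0.91, 0.50, 0.52, 0.48],
    [0.90, 0.96, 0.89, 0.51, 0.49, 0.50],
    [0.91, 0.88, 0.94, 0.47, 0.50, 0.52],
    [0.50, 0.52, 0.49, 0.95, 0.90, 0.92],
    [0.51, 0.48, 0.50, 0.91, 0.96, 0.89],
    [0.49, 0.50, 0.51, 0.90, 0.88, 0.95],
]

exemplars, labels = affinity_propagation(profile_similarity(acc))
print("cluster-center HRTFs:", exemplars)
print("cluster label per HRTF:", labels)
```

In this sketch the exemplars play the role of the cluster centers: only the HRTFs at those indices would be used to generate the new, smaller training set, which is where the data efficiency claimed in the abstract would come from.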
Related papers
  • [1] Binaural sound localization based on deep neural network and affinity propagation clustering in mismatched HRTF condition
    Wang, Jing
    Wang, Jin
    Qian, Kai
    Xie, Xiang
    Kuang, Jingming
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
  • [2] Binaural Sound Source Localization Based on Convolutional Neural Network
    Zhou, Lin
    Ma, Kangyu
    Wang, Lijie
    Chen, Ying
    Tang, Yibin
    CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 60 (02): : 545 - 557
  • [3] Binaural sound localization in an artificial neural network
    Schauer, C
    Zahn, T
    Paschke, P
    Gross, HM
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 865 - 868
  • [4] Fingerprinting Localization Based on Affinity Propagation Clustering and Artificial Neural Networks
    Ding, Genming
    Tan, Zhenhui
    Zhang, Jinbao
    Zhang, Lingwen
    2013 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2013, : 2317 - 2322
  • [5] An artificial neural network for sound localization using binaural cues
    Datum, MS
    Palmieri, F
    Moiseff, A
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (01): : 372 - 383
  • [7] A Binaural Sound Localization System using Deep Convolutional Neural Networks
    Xu, Ying
    Afshar, Saeed
    Singh, Ram Kuber
    Wang, Runchun
    van Schaik, Andre
    Hamilton, Tara Julia
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [8] A novel neural network classification model based on covering and Affinity Propagation clustering algorithm
    Li, Hui
    Ding, Shifei
    Journal of Computational Information Systems, 2013, 9 (07): : 2565 - 2573
  • [9] Sound Source Localization Based on von-Mises-Bernoulli Deep Neural Network
    Nakadai, Kazuhiro
    Masaki, Shungo
    Kojima, Ryosuke
    Sugiyama, Osamu
    Itoyama, Katsutoshi
    Nishida, Kenji
    2020 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION (SII), 2020, : 658 - 663
  • [10] A matrix modular neural network based on task decomposition with subspace division by adaptive affinity propagation clustering
    Zhao, Zhong-Qiu
    Gao, Jun
    Glotin, Herve
    Wu, Xindong
    APPLIED MATHEMATICAL MODELLING, 2010, 34 (12) : 3884 - 3895