Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation

Cited by: 22
Authors
He, Weipeng [1, 2]
Motlicek, Petr [1]
Odobez, Jean-Marc [1, 2]
Affiliations
[1] Idiap Res Inst, CH-1920 Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Funding
EU Horizon 2020
Keywords
Data models; Adaptation models; Direction-of-arrival estimation; Neural networks; Location awareness; Data collection; Robots; DOA estimation; data augmentation; sound source localization; weakly-supervised learning; LOCALIZATION; CLASSIFICATION;
DOI
10.1109/TASLP.2021.3060257
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Deep neural networks have been successfully applied to sound direction-of-arrival estimation under challenging conditions. However, such a learning-based approach requires a large amount of labeled training data, which is difficult to acquire. To address this problem, we propose a novel approach for multi-speaker direction-of-arrival estimation with data augmentation and weakly-supervised domain adaptation. We generate source domain data with simulation, and collect real data annotated with the number of sound sources as the weak labels. The real data are further augmented by mixing single-source segments. Then, weakly-supervised domain adaptation is applied to models pre-trained on the simulated data. We define a loss function for the adaptation process which exploits the weak labels and the mixture component information in the augmented data. Experiments with real robot audio data show that our proposed approach achieves performance similar to that obtained when fully labeled real data are used. This paper suggests an effective development procedure for DOA estimation models applied to new types of microphone arrays with minimal data collection effort.
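The augmentation step described in the abstract, mixing single-source segments of real data, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_segments`, the array layout (channels, samples), and the random test data are all assumptions made for illustration. The sketch only shows how a mixture can carry both a weak label (the number of sources) and a record of which segments it was built from.

```python
import numpy as np

def mix_segments(segments, num_sources, rng):
    """Sum `num_sources` randomly chosen single-source segments.

    segments: list of arrays shaped (channels, samples), all the same shape
              (an assumed layout, not taken from the paper).
    Returns the mixture, the sorted indices of its components, and the
    weak label (the source count).
    """
    if not 1 <= num_sources <= len(segments):
        raise ValueError("num_sources must be between 1 and len(segments)")
    # Pick distinct segments so each component is a different recording.
    chosen = rng.choice(len(segments), size=num_sources, replace=False)
    # Mixing is a sample-wise sum of the multi-channel waveforms.
    mixture = np.sum([segments[i] for i in chosen], axis=0)
    return mixture, sorted(int(i) for i in chosen), num_sources

rng = np.random.default_rng(0)
# Hypothetical data: three 4-channel single-source segments of 16000 samples.
segments = [rng.standard_normal((4, 16000)) for _ in range(3)]
mixture, components, weak_label = mix_segments(segments, 2, rng)
```

Keeping the component indices alongside the mixture is what would allow a loss function to use the mixture component information in addition to the source count.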
Pages: 1303-1317
Number of pages: 15
Related papers
50 in total
  • [1] Multi-speaker Direction of Arrival Estimation Using Audio and Visual Modalities with Convolutional Neural Network
    Wu, Yulin
    Hu, Ruimin
    Wang, Xiaochen
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 636 - 641
  • [2] A feedforward neural network for direction-of-arrival estimation
    Ozanich, Emma
    Gerstoft, Peter
    Niu, Haiqiang
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2020, 147 (03): : 2035 - 2048
  • [3] MAXIMUM LIKELIHOOD MULTI-SPEAKER DIRECTION OF ARRIVAL ESTIMATION UTILIZING A WEIGHTED HISTOGRAM
    Hadad, Elior
    Gannot, Sharon
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 586 - 590
  • [4] A modular neural network for direction-of-arrival estimation of two sources
    Ofek, Gal
    Tabrikian, Joseph
    Aladjem, Mayer
    [J]. NEUROCOMPUTING, 2011, 74 (17) : 3092 - 3102
  • [5] Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles
    Feng Guo
    Yuhang Cao
    Zhaoqiong Huang
    Xing You
    Haixing Guan
    Jiaen Liang
    Baoqing Li
    [J]. Circuits, Systems, and Signal Processing, 2019, 38 : 2320 - 2334
  • [6] Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles
    Guo, Feng
    Cao, Yuhang
    Huang, Zhaoqiong
    You, Xing
    Guan, Haixing
    Liang, Jiaen
    Li, Baoqing
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (05) : 2320 - 2334
  • [7] Joint estimation of pitch and direction of arrival: improving robustness and accuracy for multi-speaker scenarios
    Gerlach, Stephan
    Bitzer, Joerg
    Goetze, Stefan
    Doclo, Simon
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014,
  • [8] Joint estimation of pitch and direction of arrival: improving robustness and accuracy for multi-speaker scenarios
    Stephan Gerlach
    Jörg Bitzer
    Stefan Goetze
    Simon Doclo
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2014 (1)
  • [9] COMPLEX-VALUED NEURAL-NETWORK FOR DIRECTION-OF-ARRIVAL ESTIMATION
    YANG, WH
    CHAN, KK
    CHANG, PR
    [J]. ELECTRONICS LETTERS, 1994, 30 (07) : 574 - 575
  • [10] Multi-Speaker Direction of Arrival Estimation using SRP-PHAT Algorithm with a Weighted Histogram
    Hadad, Elior
    Gannot, Sharon
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING IN ISRAEL (ICSEE), 2018,