AN END-TO-END DEEP LEARNING FRAMEWORK FOR MULTIPLE AUDIO SOURCE SEPARATION AND LOCALIZATION

Cited by: 6
Authors
Chen, Yu [1]
Liu, Bowen [1]
Zhang, Zijian [1]
Kim, Hun-Seok [1]
Affiliation
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
Multiple audio source localization; audio source separation; deep learning; discriminator; acoustic source localization
DOI
10.1109/ICASSP43922.2022.9746950
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Sound source separation and localization for situational awareness enable a wide range of applications, such as hearing enhancement and audio beamforming. We present an end-to-end deep learning framework that separates and localizes multiple audio sources from a multi-channel mixture. The proposed framework jointly estimates the separated sources and their time differences of arrival (TDOA) at different microphones, and then derives the direction of arrival (DOA) of each source. A new structure that reconstructs the mixed signal is introduced for joint optimization of source separation and TDOA estimation. In addition, a discriminator network is added during training to further improve separation quality. Experimental results demonstrate that the proposed method achieves state-of-the-art accuracy in both source separation and DOA estimation.
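The abstract states that a DOA is obtained from each source's estimated TDOA but does not give the geometry used. As a rough illustration only, the sketch below applies the textbook far-field relation for a single two-microphone pair, theta = arcsin(c * tau / d). The function name `tdoa_to_doa`, the speed-of-sound constant, and the two-microphone setup are assumptions made for this example and are not taken from the paper.

```python
import math

# Assumed speed of sound in air at roughly 20 degrees Celsius (m/s).
SPEED_OF_SOUND = 343.0


def tdoa_to_doa(tdoa_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Return the far-field DOA in degrees, measured from broadside,
    for one microphone pair (hypothetical helper, not the paper's method).

    tdoa_s:        time difference of arrival between the two mics, in seconds
    mic_spacing_m: distance between the two mics, in meters
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    # Path-length difference divided by spacing equals sin(theta).
    ratio = c * tdoa_s / mic_spacing_m
    # Clamp so that a noisy TDOA estimate cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # A zero TDOA corresponds to a source directly broadside to the pair.
    print(tdoa_to_doa(0.0, 0.1))
```

A single pair resolves the angle only within a half-plane; arrays with more microphones combine several pairwise TDOAs to remove that ambiguity.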
Pages: 736-740
Number of pages: 5