SALADNET: SELF-ATTENTIVE MULTISOURCE LOCALIZATION IN THE AMBISONICS DOMAIN

Cited by: 8
Authors
Grumiaux, Pierre-Amaury [1 ]
Kitic, Srdan [1 ]
Srivastava, Prerak [2 ]
Girin, Laurent [3 ]
Guerin, Alexandre [1 ]
Affiliations
[1] Orange Labs, Cesson Sevigne, France
[2] Univ Lorraine, INRIA, Nancy, France
[3] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Keywords
Sound source localization; neural networks; self-attention; Ambisonics; parallel computing
DOI
10.1109/WASPAA52581.2021.9632737
CLC classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this work, we propose a novel self-attention-based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network (CRNN), we investigate the benefit of replacing the recurrent layers with self-attention encoders inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data with up to 3 simultaneous speakers. The results indicate that most of the proposed architectures perform on par with or outperform the CRNN baseline, especially in the multisource scenario. Moreover, by avoiding recurrent layers, the proposed models lend themselves to parallel computing, which is shown to produce considerable savings in execution time.
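
To make the architectural idea concrete, the following is a minimal PyTorch sketch of a localization network in which the recurrent layers of a CRNN are replaced by a Transformer self-attention encoder, as described in the abstract. The channel counts, feature dimensions, number of encoder layers and the size of the output direction grid (n_dirs) are illustrative assumptions, not the configuration used in the paper.

import torch
import torch.nn as nn

class SelfAttentiveLocalizer(nn.Module):
    """Sketch: CNN front-end over Ambisonics features, with the recurrent
    layers of a CRNN replaced by a Transformer self-attention encoder.
    All sizes are placeholders, not the paper's actual configuration."""

    def __init__(self, in_channels=4, n_dirs=429, d_model=128, n_heads=8, n_layers=2):
        super().__init__()
        # Convolutional feature extractor over (time x frequency) per Ambisonics channel.
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 4)),   # pool along frequency only, keep time resolution
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 4)),
        )
        self.proj = nn.LazyLinear(d_model)      # flatten (channels x freq) -> d_model per frame
        # Self-attention encoder in place of the recurrent (GRU/LSTM) layers.
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        # Per-frame multi-source output, e.g. activity over a spherical grid of directions.
        self.head = nn.Linear(d_model, n_dirs)

    def forward(self, x):
        # x: (batch, ambisonics_channels, time, freq)
        h = self.conv(x)                          # (batch, C, T, F')
        b, c, t, f = h.shape
        h = h.permute(0, 2, 1, 3).reshape(b, t, c * f)
        h = self.proj(h)                          # (batch, T, d_model)
        h = self.encoder(h)                       # frames attend to each other, no recurrence
        return torch.sigmoid(self.head(h))        # (batch, T, n_dirs) per-frame source activity

if __name__ == "__main__":
    model = SelfAttentiveLocalizer()
    feats = torch.randn(2, 4, 100, 128)           # dummy FOA features: 4 channels, 100 frames, 128 bins
    print(model(feats).shape)                     # torch.Size([2, 100, 429])

Because the encoder processes all time frames jointly through attention rather than sequentially through a recurrence, the forward pass can be parallelized across frames, which is the source of the execution-time savings reported in the abstract.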
Pages: 336-340 (5 pages)
Related Papers (50 in total)
  • [21] Zhu, Shixiang; Zhang, Minghe; Ding, Ruyi; Xie, Yao. Deep Fourier Kernel for Self-Attentive Point Processes. 24th International Conference on Artificial Intelligence and Statistics (AISTATS), 2021, 130.
  • [22] Li, Sihang; Luo, Yanchen; Zhang, An; Wang, Xiang; Li, Longfei; Zhou, Jun; Chua, Tat-seng. Self-attentive Rationalization for Interpretable Graph Contrastive Learning. ACM Transactions on Knowledge Discovery from Data, 2025, 19 (02).
  • [23] Song, XiaoBing; Bao, JiaYu; Di, Yicheng; Li, Yuan. MSAM: Cross-Domain Recommendation Based on Multi-Layer Self-Attentive Mechanism. Advanced Intelligent Computing Technology and Applications, ICIC 2023, Pt IV, 2023, 14089: 319-332.
  • [24] Lin, Qingjian; Hou, Yu; Li, Ming. Self-Attentive Similarity Measurement Strategies in Speaker Diarization. Interspeech 2020, 2020: 284-288.
  • [25] Luo, Yu; Peng, Wanwan; Fan, Youping; Pang, Hong; Xu, Xiang; Wu, Xiaohua. Explicit Sparse Self-Attentive Network for CTR Prediction. Proceedings of the 10th International Conference of Information and Communication Technology, 2021, 183: 690-695.
  • [26] Lou, Paria Jamshid; Johnson, Mark. Improving Disfluency Detection by Self-Training a Self-Attentive Model. 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 2020: 3754-3763.
  • [27] Jiang, Hua; Xiao, Bing; Luo, Yintao; Ma, Junliang. A self-attentive model for tracing knowledge and engagement in parallel. Pattern Recognition Letters, 2023, 165: 25-32.
  • [28] Wei, Xiangpeng; Hu, Yue; Xing, Luxi. Gated Self-attentive Encoder for Neural Machine Translation. Knowledge Science, Engineering and Management, KSEM 2019, Pt I, 2019, 11775: 655-666.
  • [29] He, Zhankui; Zhao, Handong; Wang, Zhaowen; Lin, Zhe; Kale, Ajinkya; McAuley, Julian. Locker: Locally Constrained Self-Attentive Sequential Recommendation. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, CIKM 2021, 2021: 3088-3092.
  • [30] Xun, Guangxu; Jha, Kishlay; Yuan, Ye; Wang, Yaqing; Zhang, Aidong. MeSHProbeNet: a self-attentive probe net for MeSH indexing. Bioinformatics, 2019, 35 (19): 3794-3802.