Two-Microphone End-to-End Speaker Joint Identification and Localization Via Convolutional Neural Networks

Cited by: 1
Authors
Salvati, Daniele [1]
Drioli, Carlo [1]
Foresti, Gian Luca [1]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy
Keywords
Convolutional neural network; end-to-end system; raw waveform; speaker identification; speaker localization; two-microphone array; acoustic source localization; noisy
DOI
10.1109/ijcnn48605.2020.9206674
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present an end-to-end scheme based on convolutional neural networks (CNNs) for speaker joint identification and localization. We investigate the possibility of estimating both the direction of arrival (DOA) and the identity of the speaker in far-field noisy and reverberant conditions using a two-channel microphone array. The proposed CNN is designed to map the raw waveforms of the two channels to the speaker identity and to the DOA of the speech signal. We analyze the identification and localization performance in simulated experiments under noisy and reverberant conditions.
Pages: 6
Related papers
50 in total
  • [41] Jasper: An End-to-End Convolutional Neural Acoustic Model
    Li, Jason
    Lavrukhin, Vitaly
    Ginsburg, Boris
    Leary, Ryan
    Kuchaiev, Oleksii
    Cohen, Jonathan M.
    Nguyen, Huyen
    Gadde, Ravi Teja
    INTERSPEECH 2019, 2019, : 71 - 75
  • [42] Robust End-to-end Speaker Diarization with Generic Neural Clustering
    Yang, Chenyu
    Wang, Yu
    INTERSPEECH 2022, 2022, : 1471 - 1475
  • [43] End-To-End Phonetic Neural Network Approach for Speaker Verification
    Demirbag, Sedat
    Erden, Mustafa
    Arslan, Levent
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
  • [44] END-TO-END NEURAL SPEAKER DIARIZATION WITH SELF-ATTENTION
    Fujita, Yusuke
    Kanda, Naoyuki
    Horiguchi, Shota
    Xue, Yawen
    Nagamatsu, Kenji
    Watanabe, Shinji
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 296 - 303
  • [45] End-to-End Audio-Visual Neural Speaker Diarization
    He, Mao-kui
    Du, Jun
    Lee, Chin-Hui
    INTERSPEECH 2022, 2022, : 1461 - 1465
  • [46] End-to-end recognition of slab identification numbers using a deep convolutional neural network
    Lee, Sang Jun
    Yun, Jong Pil
    Koo, Gyogwon
    Kim, Sang Woo
    KNOWLEDGE-BASED SYSTEMS, 2017, 132 : 1 - 10
  • [47] End-to-end video background subtraction with 3d convolutional neural networks
    Sakkos, Dimitrios
    Liu, Heng
    Han, Jungong
    Shao, Ling
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (17) : 23023 - 23041
  • [48] Streaming Convolutional Neural Networks for End-to-End Learning With Multi-Megapixel Images
    Pinckaers, Hans
    van Ginneken, Bram
    Litjens, Geert
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (03) : 1581 - 1590
  • [49] END-TO-END PHOTOPLETHYSMOGRAPHY (PPG) BASED BIOMETRIC AUTHENTICATION BY USING CONVOLUTIONAL NEURAL NETWORKS
    Luque, Jordi
    Cortes, Guillem
    Segura, Carlos
    Maravilla, Alexandre
    Esteban, Javier
    Fabregat, Joan
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 538 - 542
  • [50] End-to-End Single Image Super-Resolution Based on Convolutional Neural Networks
    Ferariu, Lavinia
    Beti, Iosif-Alin
    2022 26TH INTERNATIONAL CONFERENCE ON SYSTEM THEORY, CONTROL AND COMPUTING (ICSTCC), 2022, : 277 - 282