Two-Microphone End-to-End Speaker Joint Identification and Localization Via Convolutional Neural Networks

Cited by: 1
Authors
Salvati, Daniele [1]
Drioli, Carlo [1]
Foresti, Gian Luca [1]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy
Keywords
Convolutional neural network; end-to-end system; raw waveform; speaker identification; speaker localization; two-microphone array; ACOUSTIC SOURCE LOCALIZATION; NOISY
DOI
10.1109/ijcnn48605.2020.9206674
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an end-to-end scheme based on convolutional neural networks (CNNs) for joint speaker identification and localization. We investigate the possibility of estimating both the direction of arrival (DOA) and the identity of the speaker in far-field noisy and reverberant conditions using a two-channel microphone array. The proposed CNN is designed to map the raw waveform of the two channels to the speaker identity and to the DOA of the speech signal. We analyze the identification and localization performance with simulated experiments in noisy and reverberant conditions.
Pages: 6