Two-Microphone End-to-End Speaker Joint Identification and Localization Via Convolutional Neural Networks

Cited by: 1
Authors
Salvati, Daniele [1]
Drioli, Carlo [1]
Foresti, Gian Luca [1]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy
Source
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020
Keywords
Convolutional neural network; end-to-end system; raw waveform; speaker identification; speaker localization; two-microphone array; ACOUSTIC SOURCE LOCALIZATION; NOISY;
DOI
10.1109/ijcnn48605.2020.9206674
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an end-to-end scheme based on convolutional neural networks (CNNs) for speaker joint identification and localization. We investigate the possibility of estimating both the direction of arrival (DOA) and the identity of the speaker in far-field noisy and reverberant conditions using a two-channel microphone array. The proposed CNN is designed to map the raw waveform of the two channels into the speaker identity and into the DOA of the speaker's speech signal. We analyze the identification and localization performance with simulated experiments in noisy and reverberant conditions.
Pages: 6
Related papers
50 in total
  • [21] Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
    Parcollet, Titouan
    Zhang, Ying
    Morchid, Mohamed
    Trabelsi, Chiheb
    Linares, Georges
    De Mori, Renato
    Bengio, Yoshua
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 22 - 26
  • [22] End-to-End Blood Pressure Prediction via Fully Convolutional Networks
    Baek, Sanghyun
    Jang, Jiyong
    Yoon, Sungroh
    IEEE ACCESS, 2019, 7 : 185458 - 185468
  • [23] An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks
    Rai, Aashish
    Karnani, Rashmi
    Chudasama, Vishal
    Upla, Kishor
    2019 IEEE 16TH INDIA COUNCIL INTERNATIONAL CONFERENCE (IEEE INDICON 2019), 2019,
  • [24] TOWARDS END-TO-END SPEAKER DIARIZATION WITH GENERALIZED NEURAL SPEAKER CLUSTERING
    Zhang, Chunlei
    Shi, Jiatong
    Weng, Chao
    Yu, Meng
    Yu, Dong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8372 - 8376
  • [25] Two-Stage Transfer Learning of End-to-End Convolutional Neural Networks for Webpage Saliency Prediction
    Shan, Wei
    Sun, Guangling
    Zhou, Xiaofei
    Liu, Zhi
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING, ISCIDE 2017, 2017, 10559 : 316 - 324
  • [26] Neural PLDA Modeling for End-to-End Speaker Verification
    Ramoji, Shreyas
    Krishnan, Prashant
    Ganapathy, Sriram
    INTERSPEECH 2020, 2020, : 4333 - 4337
  • [27] Image Shadow Removal Using End-To-End Deep Convolutional Neural Networks
    Fan, Hui
    Han, Meng
    Li, Jinjiang
    APPLIED SCIENCES-BASEL, 2019, 9 (05):
  • [28] End-to-end Convolutional Neural Networks for Sound Event Detection in Urban Environments
    Zinemanas, Pablo
    Cancela, Pablo
    Rocamora, Martin
    PROCEEDINGS OF THE 24TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2019, : 533 - 539
  • [29] CONVOLUTIONAL ANALYSIS OPERATOR LEARNING BY END-TO-END TRAINING OF ITERATIVE NEURAL NETWORKS
    Kofler, Andreas
    Wald, Christian
    Schaeffter, Tobias
    Haltmeier, Markus
    Kolbitsch, Christoph
    2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022,
  • [30] Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks
    Zhang, Wei
    Zhai, Minghao
    Huang, Zilong
    Liu, Chen
    Li, Wei
    Cao, Yi
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PART VI, 2019, 11745 : 332 - 341